MongoDB aggregation – why stage order really matters

The MongoDB aggregation framework is a very powerful tool. It can do much more than many relational database features, but if we do not know or understand a few simple rules, the effects can be… a bit strange. This is especially true if we use it from PHP and build the aggregation step by step, as an array of commands. Sorting and limiting are a great example: one small mistake and the query returns the wrong documents. Why? Because it is easy to forget about the order of commands.

With standard Eloquent, it’s not a problem to set some conditions first, then add sorting, and then add more conditions. Everything will be fine, because it runs a simple find on Mongo:

$query = MyModel::query()->where('foo', 'bar');
$query->limit(10);
$query->orderBy('name', 'desc');
$query->where('param1', 'val1');
$results = $query->get();

 
In that example everything will be OK, because the query builder takes care of applying the clauses in the right order. But what if we try to do the same thing with the aggregation framework? We could do that in raw Mongo, but let’s use another PHP and Laravel example; it is the same apart from small syntax changes:

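$aggregate = [];
// Append each stage with []; a plain assignment would overwrite the previous one
$aggregate[] = [
	'$match' => [
		'foo' => 'bar'
	]
];
$aggregate[] = [
	'$limit' => 10
];
$aggregate[] = [
	'$sort' => ['name' => -1]
];
$aggregate[] = [
	'$match' => [
		'param1' => 'val1'
	]
];

// raw() passes the underlying MongoDB collection to the closure
$results = MyModel::raw(function ($collection) use ($aggregate) {
	return $collection->aggregate($aggregate);
});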

In that case the results will be completely different. Why? Because in an aggregation the order of stages matters. It will first filter the documents and keep only those whose “foo” field has the value “bar”, then limit them to just 10, then sort, and at the end apply the additional filter on the “param1” field. As you can see, the limit in the middle means that a lot of documents are skipped and never reach the later stages. And because the limit runs before the sort, those 10 documents are not even the first 10 by name. That is why stage order is important: it lets us build very complex queries, modify data and join collections in different ways (using $lookup), but we need to know what we want to achieve.
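To see the effect without a database, here is a minimal sketch in plain PHP. The documents are made up, and matchStage, sortDescStage and limitStage are hypothetical helpers that only imitate $match, $sort and $limit on an array; they are not part of any MongoDB library. The limit is 2 instead of 10 to keep the data small.

```php
<?php
// Made-up documents standing in for a collection.
$docs = [
    ['name' => 'a', 'foo' => 'bar', 'param1' => 'val1'],
    ['name' => 'b', 'foo' => 'bar', 'param1' => 'other'],
    ['name' => 'c', 'foo' => 'bar', 'param1' => 'val1'],
    ['name' => 'd', 'foo' => 'bar', 'param1' => 'val1'],
];

// Hypothetical helper imitating $match on a single field.
function matchStage(array $docs, string $field, string $value): array
{
    return array_values(array_filter($docs, fn (array $d) => $d[$field] === $value));
}

// Hypothetical helper imitating $sort with -1 (descending).
function sortDescStage(array $docs, string $field): array
{
    usort($docs, fn (array $a, array $b) => $b[$field] <=> $a[$field]);
    return $docs;
}

// Hypothetical helper imitating $limit.
function limitStage(array $docs, int $n): array
{
    return array_slice($docs, 0, $n);
}

// Same order as the pipeline above: match, limit, sort, match.
$wrong = matchStage($docs, 'foo', 'bar');
$wrong = limitStage($wrong, 2);
$wrong = sortDescStage($wrong, 'name');
$wrong = matchStage($wrong, 'param1', 'val1');

// Filter on both fields first, then sort, then limit.
$right = matchStage($docs, 'foo', 'bar');
$right = matchStage($right, 'param1', 'val1');
$right = sortDescStage($right, 'name');
$right = limitStage($right, 2);

$wrongNames = array_column($wrong, 'name'); // ['a']
$rightNames = array_column($right, 'name'); // ['d', 'c']
```

Three documents match both conditions, but the first order returns only one of them, and not even the one that sorts first.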

OK, so how should the aggregation be prepared to work like the simple query from the first example? We need to change the order. We can also simplify it by merging the two $match stages into one:

$aggregate = [];
$aggregate[] = [
	'$match' => [
		'foo' => 'bar',
		'param1' => 'val1'
	]
];
$aggregate[] = [
	'$sort' => ['name' => -1]
];
$aggregate[] = [
	'$limit' => 10
];

// raw() passes the underlying MongoDB collection to the closure
$results = MyModel::raw(function ($collection) use ($aggregate) {
	return $collection->aggregate($aggregate);
});

In that case the aggregation will first filter the documents, then sort them and finally apply the limit, so there is no risk that documents which meet our requirements will be skipped.
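If the pipeline is built in several places in the code, one way to keep the order safe is to collect the filters, sorting and limit separately and assemble the stages only at the end. This is just a sketch: buildPipeline is a hypothetical helper of mine, not something provided by Laravel or the MongoDB driver.

```php
<?php
// Hypothetical helper: always emits the stages in the order $match, $sort, $limit,
// no matter in which order the caller collected the pieces.
function buildPipeline(array $filters, array $sort = [], ?int $limit = null): array
{
    $pipeline = [];
    if ($filters !== []) {
        $pipeline[] = ['$match' => $filters];
    }
    if ($sort !== []) {
        $pipeline[] = ['$sort' => $sort];
    }
    if ($limit !== null) {
        $pipeline[] = ['$limit' => $limit];
    }
    return $pipeline;
}

$filters = ['foo' => 'bar'];
$limit = 10;
$sort = ['name' => -1];
$filters['param1'] = 'val1'; // a condition added late still lands in $match

$aggregate = buildPipeline($filters, $sort, $limit);
```

The resulting $aggregate has the same three stages as the corrected example above and can be passed to aggregate() in the same way.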

Conditional query in Laravel Eloquent

Laravel Eloquent is very powerful and we can use it in many cases to get or filter data. One very common situation is building a query according to some conditions. The most common way to achieve that is to build it step by step:

$onlyUnread = $request->only_unread ?? null;

$query = Article::query()->where('active', true);
if ($onlyUnread) {
    $query->where('unread', true);
}

$articles = $query->get();

Recently I checked some old documentation and found a different method: when. It lets us simplify that process a lot and put all the conditionals and additional query steps into a single chain:

$articles = Article::query()->where('active', true)
    ->when($onlyUnread, function (Builder $query): Builder {
        return $query->where('unread', true);
    })
    ->get();

It’s much cleaner than building the query step by step, and because we still work with the Builder object, we can do exactly the same things as on the outer level. It may be necessary to use some additional parameters. That is not a problem, we only have to pass them into the inner function with use:

$articles = Article::query()->where('active', true)
    ->when($myCondition, function (Builder $query) use ($param1, $param2): Builder {
        return $query->where('unread', true)
            ->orWhere('param1', $param1)
            ->orWhere('param2', $param2);
    })
    ->get();

It is also possible to use when more than once: if you need to add several steps with separate conditions, no problem. I do not know why this is not present in the current documentation, but the when method is available in the current Laravel version and can be used without additional steps.
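To show the idea of chaining when several times without pulling in the whole framework, here is a small stand-in. TinyQuery is a hypothetical class that only records conditions; it is not Laravel’s Builder. Its when method imitates what I understand Laravel’s to do (run the callback only for a truthy value, pass that value as the second argument, accept an optional default callback), so treat those details as assumptions and check them against your Laravel version.

```php
<?php
// Hypothetical stand-in for a query builder: it only records where() calls.
final class TinyQuery
{
    public array $wheres = [];

    public function where(string $field, $value): self
    {
        $this->wheres[] = [$field, $value];
        return $this;
    }

    // Runs $callback when $value is truthy, otherwise the optional $default.
    public function when($value, callable $callback, ?callable $default = null): self
    {
        if ($value) {
            return $callback($this, $value) ?? $this;
        }
        if ($default !== null) {
            return $default($this, $value) ?? $this;
        }
        return $this;
    }
}

$onlyUnread = true;
$author = null; // falsy, so the second callback is skipped

$query = (new TinyQuery())
    ->where('active', true)
    ->when($onlyUnread, fn (TinyQuery $q) => $q->where('unread', true))
    ->when($author, fn (TinyQuery $q, $value) => $q->where('author', $value));

$fields = array_column($query->wheres, 0); // ['active', 'unread']
```

Each when stays independent of the others, so adding or removing one condition does not touch the rest of the chain.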

Goals vs systems

This is one of the unusual posts on my blog and you will find some strong, vulgar words inside. It was necessary, because I’ve come a long way and want to name certain things appropriately, without any mercy. Do you have any goals? Are you goal-oriented? I was such a man for a few years, and I can’t recommend that approach. It may look like good motivation, but it’s destructive and stupid; in most cases it’s just coaching bullshit. Right now I am changing my life from goal-oriented to systems-oriented. I will explain the difference and why goals are a bad option.

It’s difficult not to hear about goals nowadays: a lot of people talk about success, about scaling, about self-improvement. One of the core elements of that approach is goals: you must have goals and you must do anything to achieve them. Take getting fit and in attractive shape, for example. According to many people, you should set a weight target and then train and eat to achieve that goal. Or your own business: you SHOULD grow, earn more, hire more employees, produce more and more, expand your market, scale up all the time. On the surface it looks fine, but there is a big problem: what comes after that?

Rat race

In recent years I have achieved a lot of the goals I set. And guess what: almost every time, I felt happy for only a moment, and then I felt a void. No more goal, no more sense… The solution is simple and the coaching gurus can tell you: find another goal to achieve and continue your adventure! But it’s stupid. It’s like a rat race, a never-ending story with only a few moments of happiness and the rest filled with enormous, debilitating effort. It’s not hard to work hard, to work a lot, but to do it without good sense: for example, wasting time on meetings and pep talks (I really hate them), without a real vision, without knowing what we want to do and what our values are. The tricky thing is to work smart, to say STOP at some point and decide we are not interested in scaling up, because… it would break our rules and we are happy now.

With systems, values are more important. Back to the fitness example from the beginning: a weight target is bad, because after months of austerity you will be happy for only a moment, and you will quickly return to your previous weight. It takes a lot of willpower and will “drain” you, like a bad app can drain your phone battery. All the changes you made were only temporary, only to achieve that goal. Then you will try again… and again… and again. With the same result. And that is a definition of insanity: “Insanity is doing the exact… same fucking thing… over and over again expecting… shit to change… That. Is. Crazy.” With a system, you do not need austerity. You can decide “I want to be healthier” and then, step by step, change your habits and nutrition: still tasty, but healthier than before. You do not have to plan to finish an Ironman, but you want to be reasonably active every day, because you know that good sleep + healthy food + some activity = you will feel great most of the time. Most of the time, not only for a few minutes after each stupid goal! What an amazing difference!

With systems, it’s not important to push extremely hard. It doesn’t make any sense. An example from my life: I can run fast, but when I increase my pace, I also increase the chance of injury. Pace doesn’t matter to me, because the benefits of a higher pace are small and, from my perspective, irrelevant: I do not have a goal to win something, I want to be healthier to have more energy for my family, to enjoy life, so… As you can see, it’s simple. Amazingly simple, and I love simple things a lot. The hardest thing is to change the way of thinking. Over the years, a lot of us have been infected by the toxic idea that only achieved goals can give you happiness. In effect, a lot of us use social media “to be famous”, to show other people what we have achieved, to measure our progress and compete with others (like on Strava). But all of that is bullshit. Just answer this question: what will you do after your current goal? Will you set a new one? Just think about that.

Goals are for losers

With systems, you do not have to abandon all your goals. They can remain, they can help build your systems, but you must remember not to treat them as process determinants, milestones, etc. If you fail at some goal, no problem: you can learn a lot from that lesson, so it’s always a solution where everyone benefits. I think it’s hard to separate these two things. I easily blame myself after a goal I did not achieve, so in such cases it’s better to avoid goals. And that is the reason I still have not fully decided whether or not to use services like Strava. They have their advantages, but their “hidden defects” matter a lot in most cases. Life is too short and too beautiful to waste time on such defects.

To be honest, I do not mean you should stop, sit down, do nothing, and say: “I’m happy now without that rat race!”. No, that’s not an option, not for people like me, who like challenges. I urge you to change the fundamentals of your life, to build a few systems based on your values and then use them to be happier. If you think you can be happy because of money, you are wrong: a new jacket/car/home/TV/plane will be great for a few days, but after that you will lose the sense of novelty. If you think achieved goals and “success” (whatever it is) will give you a better life, you are wrong. Simple things can give you a lot of happiness, and you can use systems to make your life longer and simpler, to have more time for happiness instead of mindlessly, crazily pushing forward. It’s your life, don’t fuck this up.

Laravel & MongoDB – relationships with ObjectId

I’ve used Laravel and the jenssegers/laravel-mongodb package for a long time. It’s a wonderful thing, because it makes MongoDB easy to use and integrates most things with Eloquent and Laravel models, and it also offers hybrid relationships with different database drivers. In theory it supports all relationships in a MongoDB database, but in practice there are a lot of potential issues because of the current implementation.

In MongoDB you can use many kinds of fields to create relations and then use them in $lookup stages (aggregations), but the most common practice is to use ObjectId. It’s the default field type for keys (_id); of course you can use a different one, but you will probably use that one. The problem is that the jenssegers package uses strings for all relationships, and that works (but not always) until you want to use custom aggregations and lookups. Here is an example of our parent model with a child relationship:

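use Jenssegers\Mongodb\Eloquent\Model;
use Jenssegers\Mongodb\Relations\HasMany;

class ParentModel extends Model
{
    // One parent has many children, linked by parent_model_id on the child
    public function children(): HasMany
    {
        return $this->hasMany(ChildModel::class, 'parent_model_id');
    }
}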

And then you want to add a child to a parent model instance:

$child = new ChildModel();
$parentModel->children()->save($child);

What will be the effect? The child record will of course have a parent_model_id field, but it will be a string, not an ObjectId. That is fine until you need to use $lookup: in that case a simple join will NOT work, because you have an ObjectId key (_id) in the Parent collection and a string relation field in the Child collection. We would need to prepare an additional field using $addFields in the Parent collection before the $lookup stage, or convert the value in the Child collection inside the $lookup pipeline. It’s not an efficient way to solve the problem.
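For illustration, the $addFields workaround could look roughly like the pipeline below, run against the parent collection. This is a sketch under assumptions: the collection name child_models and the helper field id_string are made up, and the $toString operator needs MongoDB 4.0 or newer.

```php
<?php
// Assumed names: 'child_models' collection, 'id_string' helper field.
$aggregate = [
    // Copy _id as a string so it can be compared with the string foreign key
    ['$addFields' => ['id_string' => ['$toString' => '$_id']]],
    // Join children whose string parent_model_id equals the converted _id
    ['$lookup' => [
        'from' => 'child_models',
        'localField' => 'id_string',
        'foreignField' => 'parent_model_id',
        'as' => 'children',
    ]],
];
```

It works, but every parent document gets an extra computed field on each run, which is exactly the overhead I would like to avoid.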

So, how do we handle that? The solution is easy and already available in the new development version of the package: skip the automatic casting of the key to a string. Unfortunately, it looks like the package is not actively maintained, so we will have to wait a bit longer for an updated release. But we can add the required change right now. Just override getIdAttribute in ParentModel so that it always returns the clean value, without any modification:

class ParentModel extends Model
{
    // Return the raw key (ObjectId) instead of casting it to a string
    public function getIdAttribute($value = null)
    {
        return $value;
    }
}

After that change, the code that adds a child:

$parentModel->children()->save($child);

will no longer store a string in the parent_model_id field. It will be an ObjectId and everything will work correctly: the built-in Laravel relationships and also $lookup stages, without any additional fields.
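With both sides stored as ObjectId, the join no longer needs a helper field. A minimal sketch, again assuming the child collection is called child_models:

```php
<?php
// Assumed collection name: 'child_models'.
$aggregate = [
    // _id and parent_model_id are both ObjectId now, so they compare directly
    ['$lookup' => [
        'from' => 'child_models',
        'localField' => '_id',
        'foreignField' => 'parent_model_id',
        'as' => 'children',
    ]],
];
```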

Any drawbacks? Yes: after such a change you need to remember to cast the key if you want to send it to the client as a string or use it in comparisons:

$parentModel->getKey() === (string) $parentModel->getKey();
// before change: true
// after change: false, because left is ObjectId instance

// JSON resource:

return [
    'id' => (string) $parentModel->getKey(),

    // or call the magic method directly:
    // 'id' => $parentModel->getKey()->__toString(),
];

But I think it’s not a big deal: it’s better to always have one type in a field instead of many different ones.