PHPStan: improve your code quality

Manual code review is hard, and anyone who has tried it will probably agree: it takes time, often involves several people, and still does not guarantee success, because reviewers can simply disagree and argue. Fortunately, we can use additional tools such as static code analyzers: they check our codebase and report issues, so we can avoid bugs before we push anything to production. In this post I would like to briefly describe PHPStan, one of the great PHP static analyzers. It is an open source, completely free tool we can use to improve the quality of our codebase.


How to securely store JWT tokens

Recently I worked with JWT tokens and had to resolve some security-related issues. Nowadays a lot of web applications, and mobile apps as well, use these tokens to prove that we are valid, correctly logged-in users. The idea behind JWT is clear, sound and useful, but there are some dangerous traps when implementing it in web applications. The main question is: where should we store JWT tokens to keep them secure and still usable?

localStorage and sessionStorage are not good

On mobile clients this is not a problem, because we can save tokens in a secure storage area. Things are a bit more complicated with web applications. If you have already thought about this question and looked for solutions, you probably found a lot of articles, and some of them recommend using localStorage. In recent years web developers have gained a lot of new capabilities and localStorage is one of them: it is simple and very quick to use, something like:

// Write
localStorage.setItem('foo', 'bar')

// Read
const myValue = localStorage.getItem('foo')

The big additional advantage: it is persistent storage, so even if the user closes the browser and opens it again, they will still be logged in. Great! So why not use it to store our JWT token? The problem with this approach is that the token can be stolen using an XSS attack. XSS is the injection and execution of malicious code in a user’s web browser. If there is any possibility that someone can do that in your web application, then they can simply execute code that steals the JWT token, because it is accessible from JavaScript.

Of course, if we allow anyone to perform an XSS attack, we have bigger problems… but think about it: in a big, complex web application, can you guarantee that there is no such possibility? We cannot just assume that everything will be OK. The better approach is to assume that things will go wrong. This is the reason why we should not use localStorage for JWT tokens or any other sensitive data.

What about sessionStorage? It is very similar: it provides a quick, easy way to store some data, but it is not persistent like localStorage. If the user opens a new tab or just closes and reopens the browser, the data we saved will not be there. That may be fine for, let’s say, banking web applications, but for most others it will be a very bad user experience. Also, sessionStorage is still vulnerable to XSS attacks. Maybe store tokens just in memory? It may be a solution, but you will have additional issues with persistence, and they will be very difficult to resolve. Conclusion: these storage mechanisms are not a good option.

Solution: use old school cookies

Yep, cookies come to the rescue. Many people think they are old school and not used anymore. In reality, they are still a very good option in such cases. Why? First: they offer persistence. When we set cookies, we can also control their expiration time. They are available in other tabs with the same web application, and also after a reload or a browser restart. Second: if configured correctly, the token in them cannot be read by an XSS payload. We can set the httpOnly flag on the cookie, and after that it will no longer be visible from JavaScript. Wait a moment… you may ask: if it is not available, how can we use a token from such a cookie in the Authorization header?

The answer is: we cannot, but that is not a problem, because the browser attaches these cookies to each request and your API should simply check the cookies. In reality, cookies are also just a header, only a different one, and the process is automatic: the browser does it for you without any additional code. So, if the user has been authenticated and the API has set a proper cookie with the JWT token, it will be sent back automatically, even in background (XHR, fetch) requests to the same origin (cross-origin background requests additionally have to enable credentials). So your frontend application does not have to handle that anymore.

Third: if configured correctly, the cookie will be limited to secure connections only. Nowadays, all publicly accessible apps should use secure, encrypted connections. With the secure flag you declare that the cookie should be used only on such connections, so if anyone tries to use your application over plain HTTP, the cookie will not be available and will not be sent. Of course, you should still use other methods to enforce HTTPS-only traffic: upgrade or rewrite requests, and send HSTS headers to inform browsers about the rules for your websites.
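The three properties above can be sketched from the API side in plain JavaScript (Node.js). This is only an illustration under a few assumptions: the cookie is named jwt as elsewhere in this post, the helper names buildJwtCookie, parseCookieHeader and readJwtFromRequest are hypothetical, and in a real project a framework or a cookie library would usually do this work.

```javascript
// Build the Set-Cookie value the API sends after a successful login.
function buildJwtCookie(token, maxAgeSeconds) {
  return [
    `jwt=${encodeURIComponent(token)}`,
    `Max-Age=${maxAgeSeconds}`, // persistence: survives reloads and browser restarts
    'Path=/',
    'HttpOnly', // hidden from JavaScript, so injected scripts cannot read it
    'Secure', // sent over HTTPS connections only
  ].join('; ');
}

// Parse a raw Cookie request header ("a=1; b=2") into a Map.
function parseCookieHeader(header) {
  const cookies = new Map();
  for (const part of (header ?? '').split(';')) {
    const index = part.indexOf('=');
    if (index === -1) continue; // skip empty or malformed pairs
    cookies.set(part.slice(0, index).trim(), part.slice(index + 1).trim());
  }
  return cookies;
}

// Return the JWT the browser attached to the request, or null if there is none.
// `headers` is a plain object with lower-cased names, as in Node's http module.
function readJwtFromRequest(headers) {
  const raw = parseCookieHeader(headers.cookie).get('jwt');
  if (raw === undefined) return null;
  try {
    return decodeURIComponent(raw);
  } catch {
    return null; // malformed percent-encoding is treated as "no token"
  }
}
```

The API would send the result of buildJwtCookie in a Set-Cookie response header and call readJwtFromRequest on every incoming request, in place of reading an Authorization header.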

What about CSRF/XSRF issues?

We decided to move everything to cookies, our JWT token can no longer be stolen through XSS, and we are happy… But just for a moment, because we remember another attack surface related to cookies: CSRF. Our API sets cookies and it looks fine, but an attacker can now prepare a special website that makes our users’ browsers send unwanted requests in the background. Browsers will automatically add cookies to such requests, as mentioned above, so it looks like the biggest advantage is also the biggest drawback. What can you do in such a situation? The first option is to provide strict CSRF protection: send a special token from the API to the browser, attach this token to every request and validate it on the API. This helps ensure that requests come from the real client (i.e. our web application). The problem with this option is that we may need to change a lot of places in our code, and handling background requests will not be easy.
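To give an idea of what the token-based protection involves, here is a minimal sketch for Node.js. The function names createCsrfToken and isValidCsrfToken are hypothetical, and where the expected token is kept (session, signed cookie) is left open, because the post does not prescribe it.

```javascript
const nodeCrypto = require('node:crypto');

// Create an unpredictable token that the API hands to the web application.
function createCsrfToken() {
  return nodeCrypto.randomBytes(32).toString('hex');
}

// Compare the token the API remembered with the one the client sent back.
function isValidCsrfToken(expected, received) {
  if (typeof expected !== 'string' || typeof received !== 'string') return false;
  const a = Buffer.from(expected);
  const b = Buffer.from(received);
  // timingSafeEqual throws on different lengths, so check the length first
  return a.length === b.length && nodeCrypto.timingSafeEqual(a, b);
}
```

Every state-changing request then has to carry the token (for example in a custom header), which is exactly the part that forces changes in many places of the frontend code.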

Fortunately, there is a second, much simpler solution. We can add another flag to our cookies: sameSite. This option allows us to block sending cookies cross-site. The first value is “None”: it does not block anything, so cross-site requests carry the cookie (and CSRF remains possible). It was historically the default, but browsers such as Chrome now treat cookies without a sameSite attribute as “Lax” and accept “None” only together with the secure flag. The second value is “Lax”: with this configuration browsers do not send cookies with cross-site subrequests such as images, frames or background POST requests, which prevents most CSRF attacks. If the user clicks a link to our website on a completely different website (or in an email), cookies will be added, so everything will work correctly. The last value is “Strict”: in that case cross-site cookies are completely blocked. It is the most secure option, but in some cases a bit limiting.
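The three values can be summarised as a small decision function. This is a deliberately simplified model written for this post, not the exact algorithm browsers implement: isCookieSent and the shape of the request object are made up for illustration, and only GET is treated as a safe method.

```javascript
// Simplified model of the browser deciding whether to attach a cookie.
// request.crossSite: the request was initiated from a different site
// request.topLevelNavigation: the user navigates to our site (e.g. clicks a link),
//   as opposed to an image, a frame or a background request
// request.method: HTTP method of the request
function isCookieSent(sameSite, request) {
  if (!request.crossSite) return true; // same-site requests always carry the cookie
  switch (sameSite) {
    case 'None':
      return true; // no restriction at all
    case 'Lax':
      return request.topLevelNavigation && request.method === 'GET';
    case 'Strict':
      return false; // never sent cross-site
    default:
      throw new Error(`Unknown sameSite value: ${sameSite}`);
  }
}

// A link clicked on another website that leads to our application.
const linkClick = { crossSite: true, topLevelNavigation: true, method: 'GET' };
// A hidden form or background request fired by an attacker's page.
const forgedPost = { crossSite: true, topLevelNavigation: false, method: 'POST' };

console.log(isCookieSent('Lax', linkClick)); // true
console.log(isCookieSent('Lax', forgedPost)); // false
console.log(isCookieSent('Strict', linkClick)); // false
```

The last line shows the price of “Strict”: a user arriving from an external link appears logged out on the first request.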

Conclusion

Use cookies to store JWT tokens: always secure, always httpOnly, and with the proper sameSite flag. This configuration protects your clients’ data, guards against token theft via XSS and against CSRF attacks, and should also simplify the web application, because you no longer have to handle tokens manually in frontend code.

Set-Cookie: jwt=OUR_TOKEN_CONTENT; secure; httpOnly; sameSite=Lax;

What about refresh tokens?

Exactly the same thing applies: they are also very important, because they allow users to obtain a new JWT. If you read the token on the frontend to determine whether a user is logged in, you can stop doing that. Just save that information in localStorage (a simple boolean) if you need it. This is not a problem: if the token expires, your API will respond that the request is unauthorized, and then you can call the proper service to refresh the token. If the refresh token is expired or invalid, this service will tell you so, and you will know that the user should be redirected to the login page. Yep, all of this works without handling tokens manually; it is completely transparent to the web application.
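The flow described above could look roughly like this on the frontend. It is a sketch under assumptions: callApi and refreshSession are hypothetical async functions provided by the application, both resolve to an object with a numeric status field, and the API answers 401 for an expired token and 200 for a successful refresh.

```javascript
// Perform a request and transparently refresh the session once if needed.
// The tokens themselves never appear here: they travel in httpOnly cookies.
async function requestWithRefresh(callApi, refreshSession) {
  const first = await callApi();
  if (first.status !== 401) return { loggedIn: true, response: first };

  // The access token has expired: ask the API for a new one.
  const refresh = await refreshSession();
  if (refresh.status !== 200) {
    // The refresh token is expired or invalid: the user has to log in again.
    return { loggedIn: false, response: null };
  }

  // New cookies have been set, so repeat the original request once.
  const second = await callApi();
  return { loggedIn: second.status !== 401, response: second };
}
```

When loggedIn comes back as false, the application can clear its boolean in localStorage and redirect to the login page.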

Authorization service on different domain

Bonus case: what if our authorization service is on a completely different domain or subdomain? Let’s say we use oauth.mydomain to display and handle login pages, but our users use a different website, such as app.mydomain. In this scenario cookies will not work correctly, so what can we do? There are two options. The first is to move login into the app subdomain or to the main domain (then cookies can be available on all subdomains, and you can still control that). In many cases, though, this may not be possible, or may be very complicated, because we want to support many systems. The second option: provide a proper URL in your app’s API and use it as a “middleware” that calls the authorization service, so the flow will be Web App ⇔ API ⇔ Authorization Service. If oauth responds with correct data, you can set the proper cookies for your current domain.
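The second option can be sketched as a single API endpoint. Everything here is an assumption made for illustration: handleLogin is a hypothetical handler, callAuthService is a hypothetical async function that forwards the credentials to the authorization service and resolves to an object with ok and accessToken fields, and the returned object stands in for a real HTTP response.

```javascript
// Web App ⇔ API ⇔ Authorization Service: the API sits in the middle
// and sets the cookie for its own domain.
async function handleLogin(credentials, callAuthService) {
  const result = await callAuthService(credentials);
  if (!result.ok) {
    return { status: 401, headers: {} };
  }
  return {
    status: 204,
    headers: {
      'Set-Cookie': `jwt=${encodeURIComponent(result.accessToken)}; Path=/; HttpOnly; Secure; SameSite=Lax`,
    },
  };
}
```

The browser only ever talks to the application’s own domain, so the cookie is stored and sent back there without any cross-domain tricks.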

MongoDB aggregation – why stages order really matters

The MongoDB aggregation framework is a very powerful tool. It can do much more than many relational database features, but if we do not know or do not understand some simple rules, the effects can be… a bit strange. This is especially true if we use it in PHP and build the aggregation step by step, as an array of commands. Sorting and limiting are a great example: one small mistake and the query will return the wrong documents. Why? Because we can forget about the order of commands.

When we use standard Eloquent, it is not a problem to first set some conditions, then add sorting and then add more conditions. Everything will be fine, because it will use a simple find on Mongo:

$query = MyModel::query()->where('foo', 'bar');
$query->limit(10);
$query->orderBy('name', 'desc');
$query->where('param1', 'val1');
$results = $query->get();

 
In that example everything will be OK, because the query builder will do all the required things. But what if we try to do the same thing using the aggregation framework? We could do that in raw Mongo, but let’s use another PHP and Laravel example; it will be the same except for small syntax changes:

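$aggregate = [];
// Each stage is appended to the pipeline; stages run in exactly this order
$aggregate[] = [
	'$match' => [
		'foo' => 'bar'
	]
];
$aggregate[] = [
	'$limit' => 10
];
$aggregate[] = [
	'$sort' => ['name' => -1]
];
$aggregate[] = [
	'$match' => [
		'param1' => 'val1'
	]
];

// raw() passes the underlying collection to the closure
$results = MyModel::raw(function ($collection) use ($aggregate) {
	return $collection->aggregate($aggregate);
});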

In that case the results will be completely different. Why? Because in an aggregation the order of commands matters. It will first filter documents and keep only those whose “foo” field has the value “bar”, then limit them to just 10, then sort them and, at the end, apply the additional filter on the “param1” field. As you can see, the limit in the middle means that a lot of documents will be skipped and never reach the final filter. That is why command order is important: it allows us to build very complex queries, modify data and join collections in different ways (using $lookup), but we should know what we want to achieve.

OK, so how should the aggregation be prepared to work like the simple query from the first example? We need to change the order. We can also simplify it by merging the two $match stages into one:

$aggregate = [];
// Filter first, then sort, and limit only at the very end
$aggregate[] = [
	'$match' => [
		'foo' => 'bar',
		'param1' => 'val1'
	]
];
$aggregate[] = [
	'$sort' => ['name' => -1]
];
$aggregate[] = [
	'$limit' => 10
];

// raw() passes the underlying collection to the closure
$results = MyModel::raw(function ($collection) use ($aggregate) {
	return $collection->aggregate($aggregate);
});

In that case the aggregation will first filter documents, then sort them and finally apply the limit, so there is no risk that documents meeting our requirements will be skipped.
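The effect of the stage order is easy to reproduce without a database. The following JavaScript sketch uses tiny in-memory stand-ins for $match, $limit and $sort (they are not MongoDB code, and the sample documents are invented) and runs both orders on the same data, with a limit of 2 instead of 10 to keep it short.

```javascript
// In-memory stand-ins for the three aggregation stages used in this post.
const stages = {
  match: (conditions) => (docs) =>
    docs.filter((doc) =>
      Object.entries(conditions).every(([key, value]) => doc[key] === value)),
  limit: (count) => (docs) => docs.slice(0, count),
  sortDesc: (field) => (docs) =>
    [...docs].sort((a, b) => (a[field] < b[field] ? 1 : a[field] > b[field] ? -1 : 0)),
};

// Run the stages one after another, each on the output of the previous one.
const runPipeline = (docs, pipeline) =>
  pipeline.reduce((current, stage) => stage(current), docs);

const documents = [
  { name: 'a', foo: 'bar', param1: 'other' },
  { name: 'b', foo: 'bar', param1: 'other' },
  { name: 'c', foo: 'bar', param1: 'val1' },
  { name: 'd', foo: 'bar', param1: 'val1' },
];

// Order of the first aggregation: match, limit, sort, match.
const limitedTooEarly = runPipeline(documents, [
  stages.match({ foo: 'bar' }),
  stages.limit(2),
  stages.sortDesc('name'),
  stages.match({ param1: 'val1' }),
]);

// Corrected order: match everything, sort, then limit.
const limitedLast = runPipeline(documents, [
  stages.match({ foo: 'bar', param1: 'val1' }),
  stages.sortDesc('name'),
  stages.limit(2),
]);

console.log(limitedTooEarly.map((doc) => doc.name)); // []
console.log(limitedLast.map((doc) => doc.name)); // [ 'd', 'c' ]
```

In the first pipeline the limit keeps only documents a and b, so the final filter has nothing left to match, even though two matching documents exist.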

Conditional query in Laravel Eloquent

Laravel Eloquent is very powerful and we can use it in many cases to get or filter data. One very common situation is building a query according to some conditions. The most common way to achieve that is to build it step by step:

$onlyUnread = $request->only_unread ?? null;

$query = Article::query()->where('active', true);
if ($onlyUnread) {
    $query->where('unread', true);
}

$articles = $query->get();

Recently I checked some old documentation and found a different method: when. It allows us to simplify that process a lot and include all conditionals and additional query steps in just one command:

$articles = Article::query()->where('active', true)
    ->when($onlyUnread, function (Builder $query): Builder {
        return $query->where('unread', true);
    })
    ->get();

It is much cleaner than building the query step by step, and because we still use the Builder object, we can do exactly the same things as on the higher level. Of course, it may be necessary to use some additional params. That is not a problem, we only have to pass them to the inner function:

$articles = Article::query()->where('active', true)
    ->when($myCondition, function (Builder $query) use ($param1, $param2): Builder {
        return $query->where('unread', true)
            ->orWhere('param1', $param1)
            ->orWhere('param2', $param2);
    })
    ->get();

It is also possible to use when more than once: if you need to set some steps with separate conditions, no problem. I do not know why I could not find it in the current documentation, but the when method is available in the current Laravel version and can be used without additional steps.
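For readers who wonder what when does under the hood, the idea fits in a few lines. The sketch below is written in JavaScript and is not Laravel’s implementation: QuerySketch is a made-up class that only collects conditions, just enough to show that the callback runs when the condition is truthy and that the chain continues either way.

```javascript
// Made-up minimal builder: it only records conditions, it does not query anything.
class QuerySketch {
  constructor() {
    this.conditions = [];
  }

  where(field, value) {
    this.conditions.push([field, value]);
    return this;
  }

  // Run the callback only when `condition` is truthy.
  // Always return a builder, so the method chain is never broken.
  when(condition, callback) {
    if (condition) {
      return callback(this, condition) ?? this;
    }
    return this;
  }
}

const buildArticleQuery = (onlyUnread) =>
  new QuerySketch()
    .where('active', true)
    .when(onlyUnread, (query) => query.where('unread', true));

console.log(buildArticleQuery(true).conditions); // [ [ 'active', true ], [ 'unread', true ] ]
console.log(buildArticleQuery(false).conditions); // [ [ 'active', true ] ]
```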