5 ways to write Laravel code that scales
Well hello there, Laravel News reader, it’s Jack Ellis & Paul Jarvis, the founders of Fathom Analytics. Before we dive into the goods, allow us to introduce ourselves. We run Fathom Analytics, a simple, privacy-focused analytics platform used by forward-thinking website owners who care about their visitors’ privacy. Our application is built with Laravel and deployed on Laravel Vapor. Jack is also the creator of Serverless Laravel, a course for mastering Laravel Vapor, and Paul is also the author of Company of One, a book that questions traditional business growth in search of a better definition of success. Together, we make up the Fathom Analytics team.
Fathom Analytics is used extensively throughout the Laravel community. Some of our fantastic Laravel customers include:
- Matt Stauffer (Partner at Tighten)
- James Brooks (Developer at Laravel LLC & Happy Dev FM Host)
- Dries Vints (Developer at Laravel LLC & Founder of Laravel.io)
- Jack McDade (Creator of Statamic)
- Justin Jackson (Cofounder of Transistor)
- Stefan Bauer (Founder of PingPing)
And many others.
The following post is not us selling Fathom. Instead, it aims to help you be a better Laravel developer. Our only plug: If you ever need simple analytics or know someone who does, give Fathom Analytics a try.
So now that the introduction is done, I’m (Jack) going to go over some code tips for scaling. I’m going to be focusing on the code, not the infrastructure.
Be prepared for database downtime
When databases go offline, application front ends typically follow, because apps often can’t live without the database. But what happens behind the scenes? Whilst you’re replying to angry tweets, your queue workers are still working away, getting nowhere and potentially losing all of your job data.
When we write jobs, we need to understand that they're sometimes going to fail. We're not mad about this; we understand it's the nature of a job. Imagine we're using Redis for our queue because we want something highly scalable, and we set our worker up:
php artisan queue:work redis --tries=3 --delay=3
Everything is running beautifully. Our jobs are queuing up fast, thanks to super-low latency from Redis, and our users love us (no angry tweets in sight!).
But we would be silly to assume that our database is always going to be available.
Imagine that it goes offline for 20 minutes… what happens to our jobs? They continue to run since Redis is still online. And if we’ve not touched the default configuration, they’ll retry after 90 seconds and, based on the code above, there’ll be 3 attempts. After those attempts, the failed jobs go into the failed_jobs table in our database. Wait, hold on, our database is offline… so the jobs can’t be inserted into the failed_jobs table.
Here’s what we can do to prevent this:
try {
    // Check to see if the database is online
    DB::connection()->getPdo();
} catch (\Exception $e) {
    // Push it back onto the Redis queue for 20 mins
    $this->release(1200);
}
We can run this piece of code inside some job middleware or add it to the start of a job. At the risk of being called Captain Obvious, let me explain what it does. Before the job does anything, it checks to make sure the database connection is online. If it's not, it releases the job for an explicit amount of time (20 minutes). If you're set up to try your jobs 3 times, that'll buy you around 40 minutes across the first 2 attempts. If your database isn't back online within that timeframe then, crikey, you have bigger problems.
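If you prefer the job middleware route, here's a minimal sketch of how that check might be wrapped up. The class name and namespace are mine for the example, not something from Laravel or the Fathom codebase:

namespace App\Jobs\Middleware;

use Illuminate\Support\Facades\DB;

class EnsureDatabaseIsOnline
{
    // Runs before the job's handle() method when returned from the job's middleware() method
    public function handle($job, $next)
    {
        try {
            // Check to see if the database is online
            DB::connection()->getPdo();
        } catch (\Exception $e) {
            // Push the job back onto the queue for 20 mins and stop here
            $job->release(1200);

            return;
        }

        // Database looks healthy, carry on with the job
        $next($job);
    }
}

Attach it by returning [new EnsureDatabaseIsOnline] from a job's middleware() method, and every attempt gets the same check for free.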
Now, you might decide that having a 20-minute delay is stupid. Calm down, I have another approach. Set your tries up to something higher:
php artisan queue:work redis --tries=15 --delay=3
And then roll with this code:
try {
    // Check to see if the database is online
    DB::connection()->getPdo();
} catch (\Exception $e) {
    if ($this->attempts() <= 13) {
        $this->release(60);
    } else {
        $this->release(1200);
    }
}
With this, you get the best of both worlds. The first 13 attempts lead to a 60-second delay, which is great if your database had a tiny blip and was offline for 20ms, since your job will be completed much sooner, and you still have the 20-minute delay for when your database has been offline for 15 minutes or longer. This isn't production code; it's just a concept for this lovely Laravel News article, but it can be modified, tested & implemented beautifully. So give it a go.
Assume all external services will go offline at some point
We developers can be complacent sometimes, can't we? Throw a job onto the queue; it'll be fine. Check the cache; it'll be online. But what happens when these pieces are offline? Sure, if you're running these things inside of jobs, the jobs will fail / retry and you'll live to code another day. But if you're queuing up jobs or checking the cache while the user makes an HTTP request to your application, it'll be the end of the world as we know it and everybody will hurt. But we can be shiny happy people if we use the following technique:
// Adding fault tolerance
retry(20, function () use ($request) {
    dispatch(new JobThatUsesTheRequest($request));
}, 200);
The beauty here is that we retry the queueing of the job 20 times, with a 200ms delay between each attempt. This is a great way to absorb any temporary downtime from your queue. Yes, it increases the response time for the user but, guess what, the request gets fulfilled, so who’s the victim?
Whilst the above works great with high-availability, fully managed queues such as SQS, what do you do when you have a low-availability queue? Ideally, you shouldn't have one. But if your boss or client won't let you spend more money on a high-availability queue solution, here's some code that'll help:
try {
    retry(20, function () use ($request) {
        dispatch(new JobThatUsesTheRequest($request));
    }, 200);
} catch (\Exception $e) {
    Mail::raw('Error with low-availability queue, increase budget please', function ($message) {
        $message->to('yourboss@yourcompany.com');
        $message->subject('Look what you did');
    });
}
Well, that's what I'd do.
Use a faster session driver
One of the things I see in a lot of applications is people using the default session driver or the database driver. That's fine at a small scale, but it's not going to deliver the best results as you grow. A better option is to use an in-memory store like Redis.
Before you do anything, get Redis set up, grab the connection details and set the appropriate environment variables (that's all you get here, this isn't an "adding Redis to Laravel" guide :P).
Once that's all set up and ready to go, open up config/database.php and scroll down to the Redis section. Copy the default entry and change its key to 'session'. Then change the database value to env('REDIS_SESSION_DB', 3) and add an environment variable for it. The Redis area should look something like this:
'redis' => [

    'client' => env('REDIS_CLIENT', 'predis'),

    'options' => [
        'cluster' => env('REDIS_CLUSTER', 'redis'),
        'prefix' => env('REDIS_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_database_'),
    ],

    'default' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', 6379),
        'database' => env('REDIS_DB', 0),
    ],

    'cache' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', 6379),
        'database' => env('REDIS_CACHE_DB', 1),
    ],

    'session' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', 6379),
        'database' => env('REDIS_SESSION_DB', 3),
    ],

],
Now you want to make sure you have the following variables in your .env file:
- SESSION_DRIVER=redis
- SESSION_CONNECTION=session
And you’ll be ready to rock & roll. Response times go down tremendously. You’re welcome, friend.
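For reference, those two variables are picked up by config/session.php in a stock Laravel install; the relevant lines look something like this (defaults can vary slightly between versions):

'driver' => env('SESSION_DRIVER', 'file'),

// ...

'connection' => env('SESSION_CONNECTION', null),

With SESSION_CONNECTION=session, the redis session driver uses the 'session' connection you just added to config/database.php.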
Don’t waste queries on pointless stuff
Let's look at something that doesn't matter much at low scale but starts to matter more as you grow: caching your queries. Many people already do this, which is fantastic, but plenty of people don't. Some cache the wrong stuff, some get it just right, and others run into all sorts of stale cache issues and spend hours debugging problems caused by the cache. Heck, we've all been there.
So what can we do if we want to live a happy life where we use our resources efficiently?
Cache static data
If you have a table in your database, something like Countries, which is seldom going to be updated, you can cache that without any stale cache drama.
$countries = Cache::remember('countries:all', 86400, function () {
    return Country::orderBy('name', 'asc')->get();
});
And I'd typically go for 24 hours. Whilst there aren't many new countries popping up each day, there's still a chance a country may rename itself, etc. Realistically, you could also cache it for a week. But why don't we use rememberForever? We could. I just prefer to set Redis' eviction policy to one of the LRU options (this isn't a Redis lesson, so we'll stop here!).
Cache dynamic data
Back in the early, early days, a lot of us stayed away from caching user objects and other pieces. "What if the user changes their email and it's wrong in the cache?" God forbid. But it doesn't have to be like this. If we take responsibility for keeping our cache fresh, there's no issue. In Fathom Analytics, we use caching extensively for dynamic data, and we use observers to make sure the cache is kept up to date.
We use functions such as Site::loadFromCache($id) and then, whenever the site changes, we make sure we call Site::updateCache($id, $site). And, of course, we also use Site::deleteFromCache($id). You can only imagine the database calls we save ourselves, allowing us to never worry about database load.
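To be clear, this isn't our production code, but here's a rough sketch of what helpers like those, plus an observer, might look like. The cache keys, TTL and class layout are assumptions for the example (and each class would live in its own file):

use Illuminate\Database\Eloquent\Model;
use Illuminate\Support\Facades\Cache;

class Site extends Model
{
    public static function loadFromCache($id)
    {
        // Fall back to the database on a cache miss, then keep the model for a day
        return Cache::remember('sites:'.$id, 86400, function () use ($id) {
            return static::findOrFail($id);
        });
    }

    public static function updateCache($id, $site)
    {
        Cache::put('sites:'.$id, $site, 86400);
    }

    public static function deleteFromCache($id)
    {
        Cache::forget('sites:'.$id);
    }
}

class SiteObserver
{
    public function saved(Site $site)
    {
        // Fires on create and update, so the cached copy stays fresh
        Site::updateCache($site->id, $site);
    }

    public function deleted(Site $site)
    {
        Site::deleteFromCache($site->id);
    }
}

Register the observer with Site::observe(SiteObserver::class) in a service provider's boot() method and the cache largely looks after itself.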
This can also be really beneficial for updates to the database. Instead of doing a findOrFail on a model, you can just check the cache and then run the update. When you’re handling 100 updates, the effects of this change are negligible, but once you get into the hundreds of thousands to millions, it can make a big difference.
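As a hedged sketch of that idea, using the hypothetical helpers above (with $id and $newName standing in for whatever your request provides):

// Read from the cache instead of paying for a findOrFail() query
$site = Site::loadFromCache($id);

// One UPDATE query, no preceding SELECT
$site->update(['name' => $newName]);

// Keep the cached copy in sync
Site::updateCache($id, $site);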
Do less in your commands
Final one, I promise. Also, hey, you’ve read 1,500 words of my ramblings, I appreciate it. I’d love to hear from you, and so would Paul, we’re @jackellis and @pjrvs on Twitter. Even if you hated this article, tell us how much you hated it. You know what they say: any press is good press.
One of the things I've seen a lot of people do is try to do too much in their commands. For example, they might make their commands process & send emails whenever they're executed, or they'll perform some logic on various data. This is fine at small scale, but what happens when your data increases or you start sending out many more emails? Your commands will time out. You need to break up your background tasks. Please. If you won't do it for yourself, do it for us.
When using commands, you should use them to dispatch jobs. Jobs can be dispatched in a matter of milliseconds, and the processing can be done in isolation. This means your command won't time out or hit some silly memory limit. And yes, this isn't always the case, but it's relevant when you're working with data loads that will scale.
$userChunks->each(function ($users) {
    SendUsersAnEmail::dispatch($users);
});
By doing this, we break up our workload. We can break our users up into chunks of 100 and have our jobs handle emailing them. Imagine we're doing this with 500,000 users: we've moved from processing all 500,000 in a single command to spreading the work across 5,000 individual jobs. Much nicer. We could do more than 100 users per job, obviously; this is just an example.
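To round that out, here's one way the command side might produce those chunks. It assumes a standard User model and the SendUsersAnEmail job from the snippet above; chunkById() streams users from the database rather than loading all 500,000 into memory at once:

use App\Models\User; // or App\User on older Laravel versions

// Inside the command's handle() method
User::query()->chunkById(100, function ($users) {
    // Each chunk of 100 users becomes its own queued job
    SendUsersAnEmail::dispatch($users);
});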
And as my favourite bunny once said… that’s all folks. If you have any questions, you can always tweet us.
And now that we're all done, I'll re-plug a few things: if you ever need simple, privacy-focused analytics, give Fathom Analytics a try; if you want to master Laravel Vapor, check out Jack's Serverless Laravel course; and if you're rethinking what business growth should look like, grab Paul's book, Company of One.
***
Many thanks to Fathom Analytics for sponsoring Laravel News this week.