Mastering Laravel Horizon's Unique Jobs
Unique jobs are great for ensuring that a job is never dispatched twice, but when misconfigured they can cause strange issues, such as a job silently failing to dispatch. Most jobs in Vigilant are unique. For example, the uptime monitor dispatches a job at a specified interval, but if for any reason the job hasn't executed within that interval, I do not want another copy on the queue.
But they can be tricky: a few things can cause them to break.
Unique jobs work by storing a unique lock key in your cache; as long as that key exists, a new job will not be dispatched. Normally the key is created when the job is dispatched and removed when the job finishes. It does not matter whether your job completes successfully or fails: the lock is always removed by Laravel.
By default this unique lock key has no TTL, which means that your job must run. If it doesn't, the job cannot be dispatched again. This creates confusing situations: you expect your job to run when it is dispatched, but a hidden unique lock blocks it.
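Laravel does offer a safety net here: a unique job can set the $uniqueFor property, which puts an expiry on the lock so a stuck lock eventually releases itself. A minimal sketch of what such a job could look like (the class name and monitorId are made up for illustration; ShouldBeUnique, uniqueId() and $uniqueFor are Laravel's real unique-job API):

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;

// Hypothetical uptime check job, unique per monitor.
class CheckUptime implements ShouldQueue, ShouldBeUnique
{
    use Dispatchable, InteractsWithQueue, Queueable;

    // Safety net: release the unique lock after one hour even if the
    // job never ran, so future dispatches are not blocked forever.
    public $uniqueFor = 3600;

    public function __construct(public int $monitorId)
    {
    }

    // Jobs sharing the same uniqueId share one lock.
    public function uniqueId(): string
    {
        return (string) $this->monitorId;
    }

    public function handle(): void
    {
        // ... run the uptime check ...
    }
}
```

With $uniqueFor set, a leftover lock is merely an hour-long delay instead of a permanent block. The sections below cover the ways a lock ends up leftover in the first place.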
So why does this happen? What makes a job not execute? There are four things:
Job manually removed from Redis, but lock remains
Horizon trims the job, leaving the lock behind
Horizon process is killed mid-job
Redis eviction removes job but not the lock
Let's go through these four and see why they can break your unique jobs.
Job manually removed from Redis, but lock remains
The first potential issue is removing the job manually from Redis. If you flush the Redis database, the job will be gone, but if your cache driver does not use that same Redis database, the unique lock will still exist. It's fine to clear the Redis database that holds your jobs, but do not forget to also clear your application's cache.
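As a sketch, assuming your queue lives in Redis database 0 and your cache in database 1 (adjust the -n values to match your own REDIS_DB settings), clearing both would look like:

```shell
# Flush the queue database (removes the jobs themselves).
redis-cli -n 0 flushdb

# Also clear the application cache, where the unique locks live.
php artisan cache:clear
```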
Horizon trims the job, leaving the lock behind
Another issue is Horizon's trimming. Ever had your queue be full, with Horizon showing a wait time of a few hours, and suddenly it's done? Then when you check the results, it seems that not all jobs have run?
Horizon's trim settings may have removed your jobs from Redis.
You can find the trim settings in your Horizon configuration file:
/*
|--------------------------------------------------------------------------
| Job Trimming Times
|--------------------------------------------------------------------------
|
| Here you can configure for how long (in minutes) you desire Horizon to
| persist the recent and failed jobs. Typically, recent jobs are kept
| for one hour while all failed jobs are stored for an entire week.
|
*/
'trim' => [
    'recent' => 60,
    'pending' => 60,
    'completed' => 60,
    'recent_failed' => 10080, // 10080 minutes = 1 week
    'failed' => 10080,
    'monitored' => 10080,
],
These trim values are applied as TTLs in Redis. That means that if you set the pending trim to 60, a job is removed from the pending jobs if it hasn't been picked up within 60 minutes. And because the key simply expires in Redis, the unique lock is not removed. You should therefore configure the pending trim to be at least as long as you expect a job to wait in the queue.
Horizon process is killed mid-job
The third way this happens is when the Horizon process gets killed. When you call php artisan horizon:terminate, a SIGTERM signal is sent to the Horizon process. Horizon's workers check for signals before picking up a job, so they first finish the jobs they are currently processing and only then stop. But if the Horizon process is killed while processing a job, that job never finishes and the unique lock is never removed. You can easily try this: dispatch a unique job that sleeps, then kill the Horizon process. The unique lock will still exist in your cache.
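A throwaway job like the following (the name is made up) is enough to reproduce this: dispatch it, kill -9 the Horizon master process while it sleeps, and the unique lock (a cache key prefixed with laravel_unique_job: in current Laravel versions) will still be in your cache, so further dispatches silently do nothing.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

// Throwaway job to demonstrate a leftover unique lock.
class SlowUniqueJob implements ShouldQueue, ShouldBeUnique
{
    use Dispatchable, Queueable;

    public function handle(): void
    {
        sleep(60); // kill the Horizon process during this window
    }
}
```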
While this may seem unlikely, it can occur depending on how Horizon is deployed. With Supervisor, for example, the stopwaitsecs setting controls how long Supervisor waits before forcefully terminating the process. If this timeout is shorter than your longest job, Supervisor will kill Horizon mid-job. This is the configuration I usually add for Supervisor; stopwaitsecs is the most important setting. It is set to 15 minutes here, but you should base it on your slowest job:
stopwaitsecs=900
stopsignal=SIGTERM
stopasgroup=true
killasgroup=true
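For context, those lines live inside an ordinary Supervisor program block. A full example stanza could look like this (the command path, user, and log file are placeholders for your own setup):

```ini
[program:horizon]
command=php /var/www/app/artisan horizon
user=www-data
autostart=true
autorestart=true
; Give Horizon up to 15 minutes to finish in-flight jobs on shutdown.
stopwaitsecs=900
stopsignal=SIGTERM
stopasgroup=true
killasgroup=true
stdout_logfile=/var/log/horizon.log
```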
Redis eviction removes job but not the lock
The last cause of jobs not running is a misconfigured Redis. Depending on your eviction policy, Redis will remove existing keys when its memory is full.
As Horizon is built on Redis, you cannot skip this configuration. Two settings are important:
maxmemory
maxmemory-policy
The maxmemory option should be obvious: Redis should have enough memory to store your jobs. But the more important option of the two is the policy: what should Redis do when memory is full? These are the options:
# volatile-lru -> Evict using approximated LRU, only keys with an expire set.
# allkeys-lru -> Evict any key using approximated LRU.
# volatile-lfu -> Evict using approximated LFU, only keys with an expire set.
# allkeys-lfu -> Evict any key using approximated LFU.
# volatile-random -> Remove a random key having an expire set.
# allkeys-random -> Remove a random key, any key.
# volatile-ttl -> Remove the key with the nearest expire time (minor TTL)
# noeviction -> Don't evict anything, just return an error on write operations.
When working with jobs, we never want Redis to remove any of them: every job pushed onto the queue should be picked up by Horizon. With this knowledge, the only way to guarantee that all jobs will be picked up is to set the policy to noeviction. This means the part of your application that dispatches the job will get an exception when Redis is full. That is preferable: instead of a random job silently disappearing, you are alerted that Redis is full.
While the noeviction policy is the best choice for queues, it's not the best for a cache. This is why I always run two Redis instances: one for the queue and one for the cache.
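Concretely, the queue instance's redis.conf would contain something like the following (the memory limit is an arbitrary example; size it for your own workload):

```
# Queue instance: fail loudly instead of silently dropping jobs.
maxmemory 256mb
maxmemory-policy noeviction
```

The cache instance can keep an eviction policy such as allkeys-lru, since losing cache entries is harmless.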
Final Thoughts
Working with unique jobs is easy once you've configured your application and Redis correctly. If you haven't, you might find jobs that are not dispatching, which can be a pain to trace as it isn't instantly obvious why.
The key is understanding that a unique job must run once dispatched, or the lock remains and blocks future attempts. Whether it's Horizon trimming jobs too aggressively, Redis evicting them, or a process being killed mid-run, each of these can leave the lock behind and halt your job flow.
To avoid surprises:
Make sure your Redis eviction policy is noeviction.
Run separate Redis instances for queue and cache.
Set your Horizon trim settings conservatively.
Gracefully manage process shutdowns, especially with long-running jobs.