Scheduling content

An interesting coding problem: how do you schedule content on a reasonably substantial scale?

This might be a script you have to run at regular intervals (normally solved with a cron job), or something that needs to be triggered after an arbitrary length of time, such as delayed blog posts or campaigns coming to an end.

The problem gets considerably more complex once you add a caching layer: you might have to expire the cache for several pieces of data simultaneously, at a time when you can’t reasonably expect someone to hit a “clear cache” button.

Laravel 5 implements one solution to this: a cron job that runs every minute and triggers a script, which in turn checks a database for tasks that are due. Entries in this database can be created in the code of your app.

For example, when a post is created in the database to be published in the future, another entry is created in the database to trigger a cache clear at a certain time. Each minute the cron job checks the database for entries matching the current time, and when it encounters the relevant entry, it clears the cache. Without this, a request for the list of articles might return stale cached data instead of the up-to-date list.
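To make the pattern concrete, here is a minimal sketch in Python using SQLite. It is not Laravel's code or API, just the idea described above. The `scheduled_tasks` table, the `cache` dict and all the function names are hypothetical stand-ins.

```python
import sqlite3

# Hypothetical in-memory cache, standing in for Redis, Memcached, etc.
cache = {"articles": ["Old post"]}


def clear_cache(key):
    """Drop a cached entry so the next read rebuilds it from the database."""
    cache.pop(key, None)


def create_schema(conn):
    """Create the (hypothetical) table that holds pending cache clears."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scheduled_tasks ("
        "id INTEGER PRIMARY KEY, cache_key TEXT NOT NULL, "
        "run_at INTEGER NOT NULL, done INTEGER NOT NULL DEFAULT 0)"
    )


def schedule_cache_clear(conn, cache_key, run_at):
    """Called when a future-dated post is saved; run_at is a Unix timestamp."""
    conn.execute(
        "INSERT INTO scheduled_tasks (cache_key, run_at) VALUES (?, ?)",
        (cache_key, run_at),
    )
    conn.commit()


def run_due_tasks(conn, now):
    """The script the cron job triggers each minute; returns how many tasks ran."""
    rows = conn.execute(
        "SELECT id, cache_key FROM scheduled_tasks "
        "WHERE done = 0 AND run_at <= ?",
        (now,),
    ).fetchall()
    for task_id, cache_key in rows:
        clear_cache(cache_key)
        conn.execute(
            "UPDATE scheduled_tasks SET done = 1 WHERE id = ?", (task_id,)
        )
    conn.commit()
    return len(rows)
```

The sketch selects everything with `run_at <= now` rather than an exact match on the current minute, so a missed cron run delays a task instead of skipping it. That is a design choice of this example, not something taken from Laravel.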

There are plenty of alternatives to this (changing the length of time the data is cached for, for example), but I haven’t encountered any quite as simple.
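As an illustration of the cache-lifetime alternative, one option is to cap the cache's time-to-live so that it never outlives the next scheduled publication. This is only a sketch: the function name `ttl_until_next_publish` and the 3600-second default are assumptions made for the example.

```python
def ttl_until_next_publish(now, publish_times, default_ttl=3600):
    """Return a cache TTL in seconds that expires no later than the next post.

    now and publish_times are Unix timestamps. default_ttl is an arbitrary
    example value, not a recommendation.
    """
    upcoming = [t for t in publish_times if t > now]
    if not upcoming:
        # Nothing scheduled, so the normal lifetime applies.
        return default_ttl
    return min(default_ttl, min(upcoming) - now)
```

The trade-off is that every cache write now needs to know about upcoming publish times, which is part of why the scheduled cache clear feels simpler.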

There are of course drawbacks to this approach. In particular, how do you handle a system like this when you have a large number of items to schedule? What if you have an application where every user can add data that requires scheduling? If your cron script runs every minute and each run finds multiple tasks to execute, it could put a lot of pressure on the server.

I’d be interested to hear any alternatives to Laravel’s approach. I’m aware of node-schedule, which appears to take advantage of Node.js’s long-running process model to run scheduled tasks without resorting to cron. That very much relies on the way Node is structured, so I’m curious as to whether there are any other solutions that work with Apache or nginx. If you know of any, please do post them in the comments.
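For comparison, the in-process idea can be sketched in Python with the standard library's `sched` module. This is not node-schedule's API, only the same general approach of keeping the schedule inside a long-running process. The helper names `build_scheduler` and `schedule_at` are made up for the example.

```python
import sched
import time


def build_scheduler(timefunc=time.time, delayfunc=time.sleep):
    """Create an in-process scheduler; no cron involved."""
    return sched.scheduler(timefunc, delayfunc)


def schedule_at(scheduler, run_at, action, *args):
    """Queue action(*args) to run at the Unix timestamp run_at."""
    return scheduler.enterabs(run_at, 1, action, argument=args)
```

Calling `scheduler.run()` then blocks, sleeping until each task is due and running them in time order. The obvious catch is that the queue lives in memory, so anything scheduled would need to be persisted and reloaded if the process restarts.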