Because FanGamb is a very data-intensive site (lots of constantly updating odds, games, results, etc.), much of the software powering the site doesn’t tie directly to the web interface and instead interacts with our database. Last winter, we had an issue where, as usage of the site grew, resource usage grew faster than it should have and systems started locking up.
Digging through the processes that were running, we finally noticed a confluence of issues. First, one of our data scripts was running away, loading a bunch of duplicated games. This wasn’t good, but the vendor was able to fix it easily. However, this first issue led to a second problem. As the number of games in the database increased, the script that processed these games ran longer and longer to deal with the growing number of duplicates. Because game results update quite frequently, our cron jobs trigger in fairly close succession. As each run took longer to complete, the next cron job would kick off before the previous one had finished. So we had cron job after cron job stacking up on the server, which quickly led to issues, as you would expect.
The fix for this was quite simple as well, and has since become a standard practice for us. EngineYard (our host) pointed us to a utility called Lockrun that implements “locking”, so that one run of a task can’t kick off while a previous run is still operating. It uses a temporary file and the system’s flock call to implement this, so it’s incredibly simple to install. One little utility and a big issue solved – the best kind of solution.
If this is your cron job:
/usr/bin/php script.php > log.log
Using this utility, just change to:
/usr/bin/lockrun --lockfile=/data/path/JOBNAME.lockrun -- sh -c "/usr/bin/php script.php > log.log"
Download the utility here: Lockrun
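If you’re curious what this kind of locking looks like under the hood, here is a minimal Python sketch of the same idea. This is not Lockrun’s actual source. The function name run_locked is made up for illustration, and the sketch assumes a Unix-like system and that an overlapping run should simply be skipped rather than wait its turn.

```python
import fcntl
import subprocess
import sys


def run_locked(lockfile_path, command):
    """Run `command` only if nobody else holds the lock on `lockfile_path`.

    Returns the command's exit code, or None if the run was skipped
    because another run still holds the lock. (Hypothetical helper,
    not part of Lockrun.)
    """
    # Open in append mode so the file is created if missing and never truncated.
    lock_file = open(lockfile_path, "a")
    try:
        try:
            # Exclusive, non-blocking flock: fails immediately if the lock is taken.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None  # a previous run is still going, so skip this one
        # The lock stays held for as long as lock_file remains open.
        return subprocess.call(command)
    finally:
        # Closing the file releases the lock (if we ever got it).
        lock_file.close()
```

Calling run_locked("/data/path/JOBNAME.lockrun", ["/usr/bin/php", "script.php"]) from a small wrapper script would give roughly the behavior described above: if the previous run is still holding the lock, the new one backs off instead of piling on.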