Troubleshooting Common Problems with Cron Jobs at Engine Yard

Cron jobs are a basic unix tool used to run specific commands at specific times. This can be anything from deleting files to starting a script that processes payments in your application at specific times without having to remember to start the script manually. The most common questions we receive about cron jobs are: verifying the time at which a cron job is supposed to run, environment and path issues while running rake tasks, and unexpected cron output. Here are examples and solutions to some of these common cron problems.

Timing

The most common question we receive is how to verify the time that a cron job is supposed to run. An important first step in that process is to verify the time zone the server is currently set to. Cron runs based on the system time. Our servers default to UTC but some of our older servers are running on Pacific Time so you need to ensure you have the correct time zone. You can verify this by either checking /etc/localtime or typing date.

The five scheduling positions are: minute ( 0 - 59 ), hour ( 0 - 23 ), day of the month ( 1 - 31 ), month ( 1 - 12 ), day of the week ( 0 - 6 with Sunday being 0 ). A short hand for this that can be added to the top of a crontab is # min hr DoM m DoW.

*      *    *      *     *  command to be executed
┬    ┬    ┬    ┬    ┬
│     │     │     │     │
│     │     │     │     │
│     │     │     │     └───── day of week (0 - 7) (0 or 7 are Sunday, or use names)
│     │     │     └────────── month (1 - 12)
│     │     └─────────────── day of month (1 - 31)
│     └──────────────────── hour (0 - 23)
└───────────────────────── min (0 - 59)

(Wikipedia)

You can also use the * which is the wildcard for every possible value of the five scheduling fields. You can also use */ to have it run at varying times. There are also websites that can check the timing such as http://www.generateit.net/cron-job/ and http://cronwtf.github.com/.

Environment and Path Issues

A problem we see is not calling the proper path when running a rake command. If you are running a rake task you'll want to make sure you set the environment and the path if needed correctly.

Example: A rake task that may work with system gems but not with bundler because of a path and environment issue.

Deploy User Crontab:
30 1 * * * rake ts:index

This code will only index sphinx if you are using system gems and not Bundler. If you are using Bundler you will want to make sure you are either using bundle exec or calling the binstub executables directly within the application. Example: A rake task that works.

Deploy User Crontab:
30 1 * * * cd /data/appname/current & RAILS_ENV=production bundle exec rake ts:index

This command calls both the correct path and also sets the RAILS_ENV environment variable so you get expected results based on the Rails environment running. In some instances you may have to specify the full path to rake in the bundled gems which is /usr/local/bin/bundle exec /data/appname/current/ey_bundler_binstubs/rake.

Cron Output

We commonly see cron jobs that don't have output handled at all or output in an expected manner. The choices for cron job output are to have no output, create a log file of what happened during the rake task, or to only list errors. The first step in deciding proper output handling is whether you want cron to notify you of anything or if your command will handle it internally. If you choose to do nothing when creating your cron and there was output it would attempt to send an email or if ssmtp mail was not configured on your instance the output would be sent to the dead.letter file. If you do not want any output saved from the cron job appending >/dev/null 2>&1 to your command output and send it to /dev/null (/dev/null is a device that discards any data sent to it).

Another option is to capture the output of a rake task running --trace it is possible to add a verbose log with the addition of >/data/deploy/appname/current/log/ts_rake.log >/dev/null 2>&1. The cron job for that would look like this:

30 1 * * * cd /data/appname/current & RAILS_ENV=production bundle exec rake ts:index --trace > /data/deploy/appname/current/log/ts_rake.log >/dev/null 2>&1

The log file will also need to either exist and be writable by the deploy user running the cron job or the user will need write permission to the directory that contains the log file.

It is possible to send the output to email by not capturing the standard out with >/dev/null 2>&1. As stated previously, on our system our systems are not set up to send e-mail. That will need to be set up before having the mail delivered.

Cron running at specific times is recorded by default into /var/log/syslog. You can sudo grep cron /var/log/syslog to look at the cron jobs that have run during the current day. You can check older days by going through the older log files which are rotated daily.

Cron on Engine Yard

Cron jobs are great for scheduled tasks. There are two important things to remember about running applications on Engine Yard Cloud. The first is that the application master or solo instance is the only instance in an environment that the dashboard will install cron jobs. This is something to keep in mind if all of your application instances need to run the script or if the job should be run on a utility instance. The second is that when an application master takeover is initiated the newly transitioned application master doesn't have the full contents of the previous application master. When a takeover occurs the cron jobs from the dashboard have not yet been put in place. Pressing the apply button inside the dashboard will properly install the cron jobs from your dashboard to your new application master.

About Ralph Bankston

Ralph Bankston is the US West Team Lead for the amazing Engine Yard Application Support Team. With an eye toward fine tuning the pieces that make the best possible environment to run applications, Ralph keeps watch over the DevOps kingdom. Outside of work, he is a guitar playing foodie who enjoys long walks on the beach.