Cron is a very common job scheduler for linux. It allows users to run the same command or shell script periodically. Typically it is used to automate tasks, for example, every Monday run my script to plot last week's data. There are many guides to using cron and crontab (the command for loading the cron job table).
Cron on JASMIN
Generally cron is disabled on the JASMIN general access machines such as
sci1.jasmin.ac.uk. This is to avoid people killing the machine by setting up lots of processing jobs when better alternatives, e.g. Lotus, are available. However, there are times when it is appropriate to use cron and so a generic cron service machine is provided.
cron.jasmin.ac.uk is configured like
sci*.jasmin.ac.uk, except cron is enabled. Anyone who can log into
sci* should also be able to login to
cron.jasmin.ac.uk. (the actual hostname of the machine is
cron1.ceda.ac.uk but please refer to it as
cron.jasmin.ac.uk wherever possible)
There are a few rules of the road to using this service:
- Avoid process pile up: If a job has not finished before the cron starts the next instance of the same job then competition for resources probably means that job will also not finish. Eventually a mass of unfinished jobs will overwhelm the whole machine and it will crash. To avoid this jobs should test to see if the previous job is still running by using a lock file, or making sure the jobs timeout. The crontamer wrapper script, documented below, is available to help you implement this.
- Expect it to break occasionally: Regardless of any measure introduced by users to stop process pile up, you can expect it to go rouge at some point. We will reboot the machine when this happens, probably without warning. We may remove offending jobs from the cron table but persistent offenders may be barred from using the service.
- Don't do heavy processing or data transfers on the cron machine itself. You can submit jobs to lotus (sbatch is on cron1.ceda.ac.uk) to offload the processing resource.
Common Cron Gotchas
The bash shell environment when cron launches scripts is not identical to the one when working interactively. Annoyingly the path to common tools, like sbatch, may not have been setup so that you get an error message from cron when it works perfectly well interactively. A way to get round this is to source the .bash_profile in the crontab file so that the interactive environment is used by cron.
24 * * * * . $HOME/.bash_profile; sbatch -W 12:0 mycmd.sh
Crontamer is a wrapper script to implement lock file and time out checking for cron jobs to avoid issues of ‘’’runaway’’’ jobs causing problems for everyone (e.g. where new jobs start before old ones finish, which can cause a snowball effect).
You can download the crontamer script from https://github.com/cedadev/crontamer. It is already installed on
bash> crontab -l #e.g. cron file entry to run job every day at 4am: 0 4 * * * crontamer -t 2h '/home/users/jblogs/bin/my_repeat_script.sh -opt1 arg1 arg2'
- Check for existing lock file to indicate if the script is already running. If the lockfile is there and it can see the matching process still running then it exits silently.
- If the lock file is not there or it can't see the matching process on the system then it starts the wrapped script.
- Periodically check the wrapped script is running. If the script fails, with a non-zero exit return code, then it can email you.
- If the script has been running longer than specified timeout (default 12hr) then it will be killed.