Submitting Batch Jobs to the HPC cluster
What is a job? In computing terminology, a job is a completely defined computational task. Because most HPC users simply need to run an application, the terms "job" and "application" are used synonymously in these guides.
On a shared compute cluster, fair sharing and good utilization of the cluster's resources are achieved by job scheduling software. On our HPC facilities, Sun Grid Engine (SGE) is the software that controls how jobs are scheduled and executed.
Batch processing involves the following steps:
- The user submits a batch job to the job scheduler (SGE) using the qsub command,
- The submitted job is placed in a queue of jobs waiting to be executed,
- When its turn comes, the job is executed with the requested amount and type of resources allocated to it,
- Results are written to files that can be inspected at the user's convenience.
The qstat command can be used to monitor the progress of a job from submission to completion.
The qsub command accepts many parameters that pass useful information to the scheduler, such as:
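As a rough illustration of the steps above, the following sketch writes a trivial job script, submits it with qsub and then lists the user's jobs with qstat. The file name hello_job.sh is a placeholder chosen for this example, and the check for qsub is only there so the snippet does nothing harmful on a machine without the SGE client tools.

```shell
#!/bin/bash
# Write a trivial job script. The name hello_job.sh is a placeholder.
cat > hello_job.sh <<'EOF'
#!/bin/bash
echo "Job running on ${HOSTNAME}"
EOF

# Submit and monitor only where the SGE client tools are available.
if command -v qsub >/dev/null 2>&1; then
    # Hand the job script to the scheduler; it joins the queue of waiting jobs.
    qsub hello_job.sh
    # List the current user's jobs and their states.
    qstat -u "$USER"
else
    echo "qsub not found: run this on a cluster login node"
fi
```

Once the job has finished, its output would normally appear in log files written by the scheduler, whose names and locations depend on the options given to qsub and on site defaults.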
- The resources needed to run the job (time, memory, special hardware, etc.)
- The type of execution environment to use (MPI parallel, OpenMP parallel, etc.)
- Other features, such as the names and locations of job log files, email notifications from the job, etc.
If these are not specified, every job is allocated a set of default resources.
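A minimal sketch of how such parameters might be placed inside a job script is shown here. Lines beginning with "#$" are read by qsub as options, while bash treats them as ordinary comments. The values are assumptions made for illustration only: the job name, log file name and email address are placeholders, the parallel environment name (smp) is site-specific, and the resource names and limits that apply on a given cluster should be taken from that cluster's own documentation.

```shell
#!/bin/bash
# Job name (placeholder).
#$ -N example_job
# Resources: run time limit as hh:mm:ss (value is illustrative).
#$ -l h_rt=00:10:00
# Execution environment: 4 slots in a parallel environment.
# The environment name "smp" is an assumption and varies between sites.
#$ -pe smp 4
# Log file name (placeholder), with standard error merged into standard output.
#$ -o example_job.log
#$ -j y
# Email at job begin, end and abort; the address is a placeholder.
#$ -m bea
#$ -M user@example.com

# The actual work of the job. JOB_NAME and NSLOTS are set by SGE when the
# job runs; the fallbacks let the script also run outside the scheduler.
run_job() {
    echo "Running ${JOB_NAME:-example_job} with ${NSLOTS:-1} slot(s)"
}

run_job
```

The same options can also be given directly on the qsub command line instead of in the script, which can be convenient when the same script is submitted with different resource requests.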
Currently there are different resource definitions and defaults for running jobs on iceberg and ShARC. Please follow the relevant links below for further information: