Submitting Jobs and the Sun Grid Engine (SGE)

What is a job? In computing terminology, a job is a completely defined computational task. Since most iceberg users' requirement is simply to run an application, 'job' and 'application' are used synonymously within the context of these guides.

SGE controls the way computing jobs submitted to run on iceberg (via the qsub, qsh or qrsh commands) are scheduled.

Every job is allocated time and memory resources. Currently:

  • If not specified during job submission, the maximum time allowed is 8 wall-clock (i.e. real-time) hours.
  • If not specified, the default memory allocation is 6 Gigabytes on all nodes.
  • These allocations apply to both interactive (via qsh or qrsh) and batch (via qsub) jobs.

Any job exceeding its allocated time or memory is terminated without any warning!

 SGE commands

Running Jobs (applications) Interactively:  qsh and qrsh

Running programs with graphical user interfaces, as well as code-development activities such as editing and compiling, is best done interactively.

  • qsh opens a new X-terminal window on a worker node, allowing you to run interactive software (such as Matlab) on that worker node.
  • qrsh opens a command shell on one of the worker nodes in the current window.
  • Do not start multiple cpu-intensive background jobs from a single qsh session.
  • To perform multiple compute-intensive interactive tasks, issue as many qsh commands as needed.
  • If you start terminal windows (e.g. via xterm &), make sure to close them before exiting the qsh session.
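Both commands accept the same resource options as qsub, so an interactive session can ask for resources other than the defaults. A minimal sketch (the resource values are illustrative; qsh and qrsh exist only on iceberg itself, so the command is guarded here):

```shell
# Request an interactive shell with 4 hours and 8 GB instead of the
# default 8 hours / 6 GB. Guarded so the sketch is harmless off-cluster.
if command -v qrsh >/dev/null 2>&1; then
    QRSH_AVAILABLE=yes
    qrsh -l h_rt=4:00:00 -l mem=8G   # shell on a worker node, current window
else
    QRSH_AVAILABLE=no
    echo "qrsh not found: run these commands on iceberg itself"
fi
```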

 Running Jobs (applications) in Batch Mode: qsub, runfluent, runmatlab, runabaqus, runansys, runcfx5par

Time-consuming, non-graphical jobs are better suited to batch working via the qsub command.

Batch processing involves preparing a computing job and submitting it to the cluster to be run later without any user intervention. The SGE job scheduling system will then:

  • queue these jobs according to some criteria,
  • run each job on a suitable worker node when its turn comes,
  • return the results via job output files,
  • provide tools for job monitoring, job aborting etc.

 qsub Command

    USUAL FORMAT:    qsub scriptfile

    FULL SYNTAX:   qsub  [sge-options]  scriptfile  [-- optional_parameters_to_script]

qsub submits a batch job to the iceberg cluster. To submit a job you must:

  • first create a text file 'scriptfile' containing a set of commands,
  • then submit this file to SGE using the qsub command.

Your script file will contain a set of Linux commands that will be executed in the order they appear in this file.
We strongly recommend that you specify the shell you are using in the very first line (sometimes referred to as the shebang or 'bang' line) of your scriptfile.
Therefore this first line should read:

  #!/bin/bash   for normal job scripts that use the bash shell (the default shell on iceberg)
  #!/bin/csh    for those few jobs using the C shell.

Note: If you are using the module commands to access your software, it is essential to specify this shebang line correctly.
The scriptfile can optionally contain lines starting with #$ that provide information about the job's requirements, as shown in the table below.
For example to request 8 Gigabytes of memory for the job you will have a line that reads;
   #$ -l mem=8G
Although we highly recommend it for clarity, it is not essential to place the #$ lines at the beginning of the file, as the job scheduler makes two passes over the scriptfile.
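Putting this together, a minimal scriptfile might look like the sketch below (the resource values are illustrative, and sleep stands in for a real application so the script also runs outside SGE):

```shell
#!/bin/bash
# --- SGE directives: read by qsub, treated as ordinary comments otherwise ---
#$ -l h_rt=01:00:00      # request 1 hour of wall-clock time
#$ -l mem=8G             # request 8 Gigabytes of virtual memory
#$ -j y                  # join normal and error output into one file

# --- commands, executed in the order they appear, on the worker node ---
START=$(date +%s)
echo "Job started on $(hostname)"
sleep 1                  # placeholder: replace with your own application
FINISH=$(date +%s)
echo "Job finished after $((FINISH - START)) seconds"
```

Saved as, say, myjob.sh, this would be submitted with qsub myjob.sh.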

You can either use a text editor on iceberg to create your scriptfile, or prepare it elsewhere and transfer it onto iceberg using secure ftp.
nedit, gedit, vi, emacs, pico and nano are a few good text editors available on the worker nodes. Note, however, that the iceberg head node does not currently have gedit or pico installed.
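The optional parameters after "--" in the full qsub syntax reach the scriptfile as the usual positional parameters $1, $2 and so on. A small sketch, simulated here by running the scriptfile directly with bash so the parameter passing is visible (the file name and parameter values are illustrative):

```shell
# Create a scriptfile that simply echoes its first two parameters.
cat > demo_script.sh <<'EOF'
#!/bin/bash
echo "first parameter:  $1"
echo "second parameter: $2"
EOF
chmod +x demo_script.sh

# On iceberg this would be:  qsub demo_script.sh -- alpha beta
# Simulated locally with bash:
OUTPUT=$(bash demo_script.sh alpha beta)
echo "$OUTPUT"
```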

List of Useful Options for the qsub Command

-l arch=intel*
-l arch=amd*
    Restricts your job to run only on a node with the specified architecture.
    You will not normally need this flag unless your software requires a
    particular architecture (i.e. Intel or AMD). Without it, your job can run
    on the first currently available node.

-l h_rt=hh:mm:ss
    Specifies the maximum run time (wall-clock) in hours (hh), minutes (mm)
    and seconds (ss).

-l mem=nnG
    For serial jobs, specifies the total "virtual" memory requirement in
    Gigabytes. For parallel jobs, specifies the memory allocation per
    processor (and NOT total memory). Note that this is the case for OpenMP
    jobs as well: for example, an OpenMP job with 2 threads needing 24 GBytes
    in total will need to specify -l mem=12G.
    Currently the default is 6G and the maximum total allowed is 128G.

-l rmem=nnG
    For serial jobs, specifies the total "real" (or resident) memory
    requirement in Gigabytes. Real memory (rmem) should always be less than
    or equal to virtual memory (mem), and will of course also be less than
    the physical RAM available on a worker node. The relative size of the
    real versus virtual memory allocation affects the amount of paging
    ('page faults') that takes place during execution, and hence the
    efficiency of the job. For parallel jobs, specifies the real-memory
    allocation per processor (and NOT total memory); this is the case for
    OpenMP jobs as well.
    Currently the default is 2G and the maximum total allowed is 48G.

-pe openmpi-ib nn
-pe ompigige nn
-pe openmp nn
    Specifies the parallel environment to use and the number of processors
    needed (nn). Currently nn must not exceed 32.

-m bea
-M email_address
    Notification options: -m can be followed by any combination of b (begin),
    e (end) and a (abort). -M must specify your email address. Either none or
    both of -m and -M must be specified.

-o filename
-e filename
    The output from a job is usually sent to two files (e for errors, o for
    normal output) whose names are generated from the script name and the
    JOB_ID number. These parameters override that behaviour.
    The -j y option joins normal and error outputs into a single file and is
    HIGHLY recommended.

-v variable=value
    Passes the defined environment variable to the job's execution
    environment.

-V
    Passes all the environment variables of the current shell to the job.

-help
    Gives a full listing of qsub parameters/options.
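The per-processor accounting for -l mem is worth spelling out. A small sketch of the arithmetic for the OpenMP example above (24 GB total across 2 threads; the scriptfile name is illustrative):

```shell
# mem is requested PER PROCESSOR, so divide the total memory
# requirement by the number of threads/processes:
TOTAL_GB=24
THREADS=2
PER_PROC_GB=$((TOTAL_GB / THREADS))
echo "request with: qsub -pe openmp $THREADS -l mem=${PER_PROC_GB}G myjob.sh"
# -> request with: qsub -pe openmp 2 -l mem=12G myjob.sh
```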

 Easy Ways of Running Some Applications in Batch Mode 

We have made a few home-grown commands available on iceberg to make submitting batch jobs easy for a few commonly used applications. These are:

  runfluent, runansys, runcfx5par, runmatlab and runabaqus

For further information on how to use any of these commands, type its name at the bash prompt on iceberg.

Checking the Progress of your Jobs (i.e. applications): qstat, Qstat, qmon

     FORMAT:   Qstat    

  • To get information about your own jobs, use the Qstat command.
  • To get information on all the jobs on the iceberg cluster, use the qstat command.
  • Both the Qstat and qstat commands can be used with the -ext option to get more detailed information about jobs, such as memory used.

Detailed information about all the SGE commands is available in the man pages. Type man qstat or man qsub, for example.

The qmon command is a sophisticated (and at times over-complicated) graphical user interface to SGE. It can monitor the progress of jobs, submit jobs, and also perform a large number of administrative tasks for which only the system administrators have the necessary rights. The first icon (Job Control) of the qmon GUI can be used to check the progress of all jobs on iceberg.

Cancelling already submitted jobs

   FORMAT:   qdel job_ID

  • The qdel command can be used to cancel/abort jobs that are waiting to run or already running.
  • Interactive as well as batch jobs can be aborted. The only reason to abort an interactive job would be a terminal that is locked up and no longer responding.
  • Every job, when submitted, is assigned a unique numeric job_ID.
  • At any time you can find the job_ID of all your jobs by typing Qstat.
  • You can cancel a job with job_ID nnnn by typing qdel nnnn.
  • For obvious reasons you are only allowed to cancel your own jobs.

Fair Share Mechanism on iceberg

  • The fair-share policy is applied according to computational time used (i.e. time is allocated fairly, with all users having an equal share).
  • When you submit your first job it will go to the end of the queue, but the scheduler will quickly move it higher up the queue. As you submit more jobs your usage figure will increase, so the scheduler will become less generous. This means that users who submit fewer jobs will get a more rapid turnaround, but the rapid turnaround of the first job may not be repeated for future jobs.
  • These rules apply to both interactive and batch access.


 SGE Job Queues

Queue Name   Max. Real Time Allowed   Queue-Specific Features
short        8 hours                  For quick jobs
long         168 hours                For long-running serial jobs
parallel     168 hours                For MPI parallel jobs
openmp       168 hours                For multi-threaded and OpenMP jobs

 Important: Any job that exceeds the "Maximum Allowed Time" limit is killed as soon as that time is exceeded. It is therefore important that you specify ample time for your jobs. Normally there is no need to specify a particular queue for your job: simply specify the maximum time and maximum memory (and, in the case of parallel jobs, the number of processors) needed, and SGE will put your job into the correct queue.