File Storage on Iceberg

  • All users have a home area in the directory /home/username.
    The initial file storage quota for this area is 5 GB and it is backed up nightly.
  • A much larger permanent data-storage area is also provided under the directory /data/username.
    The initial quota for this area is 50 GB. It is important to note that this /data area is NOT backed up to tape. However, a snapshot copy of the drive is stored on the mirror storage node.
  • All users also have access to a large fast-access data storage area under /fastdata.
    The /fastdata area provides 80 TB of storage in total. This storage area takes advantage of the new InfiniBand network for faster access to data. Although /fastdata is available on all the worker nodes, only access from the new Intel-based nodes benefits from these speed improvements. There are no quota controls on the /fastdata area, but files older than 2 months will be deleted regularly without any warning. This residence period will be increased after the Iceberg upgrade, expected to take place by summer 2014.
    Also, to avoid interference from other users' files, it is VITALLY IMPORTANT that you store your files in a directory that you create and name after your username.
    E.g.        mkdir      /fastdata/ac1xyz

    Use the lfs command to find out which files under /fastdata are older than a certain number of days and hence approaching the time of deletion. For example, to find files more than 50 days old:

                 lfs  find  -ctime   +50   /fastdata/ac1xyz

  • To find out your current filestore quota allocation and usage, run the quota command.
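
The advice above about keeping your /fastdata files in a directory named after your username can be sketched as a small shell function. This is only an illustration: the function name make_user_area is hypothetical, and restricting the directory permissions to the owner is an extra precaution assumed here, not something the text above requires.

```shell
#!/bin/sh
# Sketch: create a per-user directory under a shared storage root.
# make_user_area is a hypothetical helper name introduced for illustration.
# Usage: make_user_area ROOT [USERNAME]
make_user_area() {
    root="$1"
    # Fall back to the current login name if no username is given.
    user="${2:-${USER:-$(id -un)}}"
    dir="${root}/${user}"

    # -p makes the call safe to repeat if the directory already exists.
    mkdir -p "$dir" || return 1

    # Assumption: owner-only access, so that other users cannot add
    # or change files in this directory.
    chmod 700 "$dir" || return 1

    # Print the directory path so that callers can capture it.
    printf '%s\n' "$dir"
}
```

On Iceberg this would be called as:

                 make_user_area  /fastdata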

If you exceed your file storage allocation

As soon as the quota is exceeded, your account becomes frozen. To avoid this situation, it is strongly recommended that you:

  • Use the quota command to check your usage regularly.
  • Copy files that do not need to be backed up to the /data/username area.
    Example:  cp mylargefile  /data/${USER}
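
To decide which files are worth copying out of your home area, it can help to list the largest ones first. The following is a minimal sketch using the standard find, du and sort commands. The function name big_files and the size threshold are illustrative assumptions, and the M size suffix relies on the GNU (or BSD) version of find.

```shell
#!/bin/sh
# Sketch: list regular files under a directory that are larger than a
# given size in MiB, biggest first. big_files is a hypothetical helper name.
# Usage: big_files DIRECTORY MIN_SIZE_MIB
# Output: one line per file, "disk usage in KiB<TAB>path".
big_files() {
    dir="$1"
    min_mb="$2"
    # find selects the large files, du -k reports their disk usage in KiB,
    # and sort -rn puts the biggest at the top.
    find "$dir" -type f -size +"${min_mb}M" -exec du -k {} + | sort -rn
}
```

For example, to list files larger than 100 MiB in your home area:

                 big_files  ${HOME}  100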

 If your account becomes frozen, you can remove unwanted files by using the RM command (note the upper case), after which your account will be unfrozen automatically.

Alternatively, if you have larger file storage requirements, please request an increased allocation by emailing hpchub@sheffield.ac.uk.

Efficiency considerations

For jobs requiring a lot of I/O, it may sometimes be necessary to store copies of the data on the actual compute node on which your job is running. For this, you can create temporary areas of storage under the directory /scratch. The next best I/O performance, and the option that requires the least work, is achieved by keeping your data in the /fastdata area and running your jobs on the new Intel nodes (by specifying -l arch=intel*).

These methods provide much faster access to data than the network-attached storage of the /home and /data areas, but you must remember to copy important data back onto your /home area.

If you decide to use the /scratch area, we recommend that you create a directory under /scratch with the same name as your username and work under that directory, to avoid any possibility of clashing with other users.

The /scratch area is local to each worker node and is not visible to the other worker nodes or to the head nodes. Therefore any data created by jobs should be transferred to either your /data or /home area before the job finishes. Here is an example script that will copy filex from your /home area into scratch and copy filey from the scratch area to your home directory:

 

# create your scratch directory (-p avoids an error if it already exists)
mkdir -p "/scratch/${USER}"
cp "${HOME}/filex" "/scratch/${USER}"
# now run your program, which uses filex as input and produces filey as its output
cp "/scratch/${USER}/filey" "${HOME}"

 

 Anything under /scratch is deleted periodically when the worker node is idle, whereas files in the /fastdata area are deleted only once they are older than 2 months.
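
The copy-in, run, copy-out pattern of the example script above can also be wrapped in a reusable shell function. The version below is a sketch under stated assumptions, not a site-provided tool: the name run_in_scratch, its argument order and the per-job subdirectory job.$$ are all hypothetical. It works in a private directory under the scratch root, copies the result back, and removes the working directory only if the copy succeeded.

```shell
#!/bin/sh
# Sketch: stage input into node-local scratch, run a command there,
# then copy the result back. run_in_scratch is a hypothetical helper name.
# Usage: run_in_scratch SCRATCH_ROOT INPUT_FILE OUTPUT_NAME DEST_DIR COMMAND...
run_in_scratch() {
    scratch_root="$1"; input="$2"; output="$3"; dest="$4"
    shift 4

    # Per-user, per-job working directory to avoid clashes with other jobs.
    workdir="${scratch_root}/${USER:-$(id -un)}/job.$$"
    mkdir -p "$workdir" || return 1
    cp "$input" "$workdir"/ || return 1

    # Run the command inside the working directory, in a subshell so that
    # the caller's current directory is left unchanged.
    ( cd "$workdir" && "$@" )
    status=$?

    # Copy the result back; keep the working directory if the copy fails,
    # so that the output is not lost.
    if [ -f "$workdir/$output" ] && ! cp "$workdir/$output" "$dest"/; then
        echo "copy-back failed; results left in $workdir" >&2
        return 1
    fi

    rm -rf "$workdir"
    return "$status"
}
```

For example, where myprogram stands for your own program that reads filex and writes filey in the current directory:

                 run_in_scratch  /scratch  ${HOME}/filex  filey  ${HOME}  myprogram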