Using the /common File System

Quick overview: 

  • Mounted read/write on all HCC HPC cluster resources – you will see the same files “in common” on any HCC cluster (e.g. Swan).
  • 30 TB per-group quota at no charge – a larger quota may be available upon request.
  • No backups are made!  Precious data should still be stored or backed up elsewhere, such as on Attic.
  • No purge!  While your files could still be lost to disk failure or user error, they won’t be removed by the purge scripts.

Accessing /common

Your /common directory can be accessed via the $COMMON environment variable, e.g. cd $COMMON.

How should I use /common?

  • Store things that are routinely needed on multiple clusters.
  • /common is a network-attached file system, so limit the number of files per directory (1 million files in a directory is a very bad idea).
  • If you are accessing /common for a job, you will need to add a line to your submission script!  
    • We have each user check out a “license” to access /common for a given job.
    • This allows us to know exactly who is accessing it, and for how long, so that if a shutdown is needed we can try to avoid killing jobs whenever possible.
    • It also allows us to limit how many jobs can hammer this single filesystem so it remains healthy and happy. 

To gain access to the path on worker nodes, a job must be submitted with the following SLURM directive:

SLURM Submit File
#SBATCH --licenses=common

If a job lacks the above SLURM directive, /common will not be accessible from the worker nodes.  (Briefly, this construct allows us to quickly do maintenance on a single cluster without having to unmount $COMMON from all HCC resources.)
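
As a minimal sketch of a complete submit file, the directive above could be combined with an early check that $COMMON is reachable from the worker node. The job name, time limit, and the common_available helper are hypothetical values chosen for illustration, not HCC-provided settings.

```shell
#!/bin/bash
# Hypothetical job name and time limit, for illustration only.
#SBATCH --job-name=common-demo
#SBATCH --time=00:10:00
# Check out the /common license so the path is accessible on worker nodes.
#SBATCH --licenses=common

# Hypothetical helper: succeeds only if $COMMON is set and is a readable directory.
common_available() {
    [ -n "${COMMON:-}" ] && [ -d "$COMMON" ] && [ -r "$COMMON" ]
}

if common_available; then
    echo "Reading inputs from $COMMON"
    # List a few entries as a stand-in for the real work of the job.
    ls "$COMMON" | head -n 5
else
    echo "/common is not accessible; was --licenses=common requested?" >&2
fi
```

Checking up front gives a clear message in the job's error output instead of a less obvious "No such file or directory" failure later in the job.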

What should I not do when using /common?

  • Don’t use it for high-I/O workflows; use /work for that – /common should mostly be used to read largely static files or data.
  • Do not expect your compiled program binaries to work everywhere!  /common is available on machines with different CPU architectures, different network connections, and so on.  Caveat emptor!
    • Serial codes will not be optimized for all clusters.
    • MPI codes, in particular, will likely not work unless recompiled for each cluster.
    • If you use module, things should be just fine!

/common and used space reporting

The /common file system can compress files so that they occupy less space on the underlying disk storage. By default, tools like du report the actual amount of disk space consumed by files. If the files were compressed when stored to disk, the reported size will be smaller than expected. Passing the --apparent-size argument to du makes it report the uncompressed size instead.

$ pwd
$ python3 -c 'print("Hello World!\n" * 2**20, end="")' > hello_world.txt
$ ls -lh hello_world.txt 
-rw-r--r-- 1 demo01 demo 13M Mar  7 12:55 hello_world.txt
$ du -sh hello_world.txt 
2.0K    hello_world.txt
$ du -sh --apparent-size hello_world.txt 
13M     hello_world.txt
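
The comparison above can be wrapped in a small helper that prints both figures side by side. This is a sketch that assumes GNU du; the function names disk_bytes, apparent_bytes, and report_usage are hypothetical and not part of any HCC tooling.

```shell
#!/bin/bash
# Space actually consumed on disk, in bytes (after any compression).
disk_bytes() {
    du -s --block-size=1 "$1" | cut -f1
}

# Apparent (uncompressed) size, in bytes.
apparent_bytes() {
    du -s --apparent-size --block-size=1 "$1" | cut -f1
}

# Print both figures for a file or directory.
report_usage() {
    local path=$1
    local disk apparent
    disk=$(disk_bytes "$path")
    apparent=$(apparent_bytes "$path")
    echo "on disk:  $disk bytes"
    echo "apparent: $apparent bytes"
}

# Default to the current directory when no path is given.
report_usage "${1:-.}"
```

Saved under a name of your choosing, the script could be run with a path such as hello_world.txt or "$COMMON" as its only argument.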