Data Storage

Sensitive and Protected Data
HCC currently has no storage that is suitable for HIPAA or other PID data sets. Users are not permitted to store such data on HCC machines.

All HCC machines have three separate areas for every user to store data, each intended for a different purpose. In addition, we have a transfer service that utilizes Globus Connect.


Home Directory

You can access your home directory quickly using the $HOME environment variable (e.g. ‘cd $HOME’).

Your home directory (i.e. /home/[group]/[username]) is meant for items that take up relatively small amounts of space. For example: source code, program binaries, configuration files, etc. This space is quota-limited to 20 GiB and 1 M files per user. The home directories are backed up for the purposes of best-effort disaster recovery. This space is not intended as an area for I/O to active jobs.


Common Directory

You can access your common directory quickly using the $COMMON environment variable (e.g. ‘cd $COMMON’).

The common directory operates similarly to work and is mounted with read and write capability on the worker nodes of all HCC clusters. This means that any files stored in common can be accessed from Swan, making this directory ideal for items that need to be accessed from multiple clusters, such as reference databases and shared data files.

Common is not designed for heavy I/O usage. Please continue to use your work directory for active job output to ensure the best performance of your jobs.
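
As a rough illustration of this split, a job script might read shared inputs from $COMMON and direct its output to $WORK. The sketch below assumes both variables are set as described on this page; the file name refdata/genome.fa, the directory name myjob, and the run_job function are hypothetical placeholders, and the line count merely stands in for real work.

```shell
#!/bin/bash
# Sketch only: read shared reference data from $COMMON, write job output to $WORK.
# "refdata/genome.fa", "myjob", and run_job are hypothetical names for illustration.

run_job() {
    local ref="$COMMON/refdata/genome.fa"   # hypothetical shared input
    local outdir="$WORK/myjob"              # hypothetical output directory

    # Create the output directory under $WORK, not under $COMMON or $HOME.
    mkdir -p "$outdir" || return 1

    # Stand-in for the real computation: count the lines of the input file.
    wc -l < "$ref" > "$outdir/line_count.txt"
}
```

Calling run_job from a job script would leave the result in $WORK/myjob/line_count.txt, keeping the heavy write traffic off the common file system.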

Quotas for common are 30 TiB and 5 M files per group, with larger quotas available for lease if needed. Files stored here are not backed up and are not subject to purge at this time. Please continue to back up your files to prevent irreparable data loss.

Additional information on using the common directories can be found in the documentation on Using the /common File System.


High Performance Work Directory

You can access your work directory quickly using the $WORK environment variable (e.g. ‘cd $WORK’).

File Loss
The /work directories are not backed up. Irreparable data loss is possible with a mis-typed command. See Preventing File Loss for strategies to avoid this.

Every user has a corresponding directory under /work using the same naming convention as /home (i.e. /work/[group]/[username]). We encourage all users to use this space for I/O to running jobs. This directory can also be used when larger amounts of space are temporarily needed. The quota is 50 TiB and 5 M files per group; space in /work is shared among all users. It should be treated as short-term scratch space, and is not backed up. Please use the hcc-du command to check your own and your group’s usage, and back up and clean up your files in $WORK at reasonable intervals.
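
hcc-du remains the recommended way to check usage. As a complementary sketch, the standard find and wc tools can give a quick file count for a single directory tree. The function names count_files and check_file_count are hypothetical, and the default limit of 5,000,000 simply mirrors the per-group file quota quoted above; because that quota applies to the whole group, a count for one user's tree is only a partial picture.

```shell
#!/bin/bash
# Sketch only: count regular files under a directory tree (defaults to $WORK).
# Assumes file names contain no newline characters, since lines are counted.
count_files() {
    find "${1:-$WORK}" -type f | wc -l | tr -d ' '
}

# Compare the file count of a directory against a limit.
# Both function names are hypothetical; the default limit mirrors the
# 5 M files per group figure, which applies to the group, not to one user.
check_file_count() {
    local dir="${1:-$WORK}"
    local limit="${2:-5000000}"
    local n
    n=$(count_files "$dir")
    if [ "$n" -gt "$limit" ]; then
        echo "WARNING: $n files in $dir exceeds $limit"
        return 1
    fi
    echo "OK: $n files in $dir"
}
```

For example, check_file_count "$WORK" reports on your own work directory, and a smaller second argument can serve as an early warning threshold. Walking a very large tree with find takes time and generates metadata load, so this is best run occasionally rather than inside jobs.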


Purge Policy

HCC has a purge policy on /work for files that become dormant. After 6 months of inactivity on a file, an automated purge process will reclaim the used space of these dormant files.

HCC provides the hcc-purge utility to list both the summary and the actual file paths of files that have been dormant for 24 weeks. This list is generated periodically; the timestamp of the last search is included in the default summary output when calling hcc-purge with no arguments. No output from hcc-purge indicates that the last scan did not find any dormant files. hcc-purge -l uses the less pager to list the matching files for the user. The candidate list can also be accessed at the following path: /lustre/purge/current/${USER}.list. This list is updated twice a week, on Mondays and Thursdays.
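
When the candidate list is long, grouping it by directory can make it easier to see where dormant files are concentrated. The sketch below is built on an assumption that is not confirmed by this page: that the list file contains one absolute file path per line. The function name purge_summary is hypothetical.

```shell
#!/bin/bash
# Sketch only: summarize a purge candidate list by parent directory.
# ASSUMPTION: the list holds one absolute file path per line.
# purge_summary is a hypothetical helper, not an HCC-provided tool.
purge_summary() {
    local list="${1:-/lustre/purge/current/${USER}.list}"
    local top="${2:-10}"

    # A missing or empty list is treated as "nothing to report".
    if [ ! -s "$list" ]; then
        echo "No purge candidates listed."
        return 0
    fi

    # Drop the file name from each path, count candidates per directory,
    # and print the directories with the most candidates first.
    sed 's|/[^/]*$||' "$list" | sort | uniq -c | sort -rn | head -n "$top"
}
```

Running purge_summary with no arguments reads your own list at the path given above; the optional second argument limits how many directories are shown.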

/work is intended for recent job output and not long-term storage. Users found to be circumventing the purge policy will face consequences, including account lockout.

If you have space requirements outside what is currently provided, please email hcc-support@unl.edu and we will gladly discuss alternatives.


Attic

Attic is a near-line archive available for lease at HCC. Attic provides large data storage that is designed to be more reliable than /work and larger than /home. Attic is accessed through Globus Connect.

More details on Attic can be found on HCC’s Attic website.


Globus Connect

For moving large amounts of data into or out of HCC resources, users are highly encouraged to consider using Globus Connect.


Using Box

You can use your UNL Box.com account to download and upload files from any of the HCC clusters.