FAQ


I have an account, now what?

Congrats on getting an HCC account! Now you need to connect to a Holland cluster. To do this, we use an SSH connection. SSH stands for Secure Shell, and it allows you to securely connect to a remote computer and operate it just like you would a personal machine.

Depending on your operating system, you may need to install software to make this connection. Check out our documentation on Connecting to HCC Clusters.

How do I change my password?

I forgot my password, how can I retrieve it?

Information on how to change or retrieve your password can be found on the documentation page: How to change your password

All passwords must be at least 8 characters in length and must contain at least one capital letter and one numeric digit. Passwords also cannot contain any dictionary words. If you need help picking a good password, consider using a (secure!) password generator such as this one provided by Random.org

To preserve the security of your account, we recommend changing the default password you were given as soon as possible.

I just deleted some files and didn’t mean to! Can I get them back?

That depends. Where were the files you deleted?

If the files were in your $HOME directory (/home/group/user/): It’s possible.

$HOME directories are backed up daily and we can restore your files as they were at the time of our last backup. Please note that any changes made to the files between when the backup was made and when you deleted them will not be preserved. To have these files restored, please contact HCC Support at hcc-support@unl.edu as soon as possible.

If the files were in your $WORK directory (/work/group/user/): No.

Unfortunately, the $WORK directories are created as a short term place to hold job files. This storage was designed to be quickly and easily accessed by our worker nodes and as such is not conducive to backups. Any irreplaceable files should be backed up in a secondary location, such as Attic, the cloud, or on your personal machine. For more information on how to prevent file loss, check out Preventing File Loss.

How do I (re)activate Duo?

If you have not activated Duo before:

Please stop by our offices join our Remote Open Office hours or schedule another remote session at hcc-support@unl.edu and show your photo ID and we will be happy to activate it for you.

If you have activated Duo previously but now have a different phone number:

Stop by our offices along with a photo ID and we can help you reactivate Duo and update your account with your new phone number.

Join our Remote Open Office hours or schedule another remote session at hcc-support@unl.edu and show your photo ID and we will be happy to activate it for you.

If you have activated Duo previously and have the same phone number:

Email us at hcc-support@unl.edu from the email address your account is registered under and we will send you a new link that you can use to activate Duo.

How many nodes/memory/time should I request?

Short answer: We don’t know.

Long answer: The amount of resources required is highly dependent on the application you are using, the input file sizes and the parameters you select. Sometimes it can help to speak with someone else who has used the software before to see if they can give you an idea of what has worked for them.

Ultimately, it comes down to trial and error; try different combinations and see what works and what doesn’t. Good practice is to check the output and utilization of each job you run. This will help you determine what parameters you will need in the future.

For more information on how to determine how many resources a completed job used, check out the documentation on Monitoring Jobs.

I am trying to run a job but nothing happens?

Where are you trying to run the job from? You can check this by typing the command `pwd` into the terminal.

If you are running from inside your $HOME directory (/home/group/user/):

Move your files to your $WORK directory (/work/group/user) and resubmit your job. The $HOME folder is not meant for job output. You may be attempting to write too much data from the job.

If you are running from inside your $WORK directory:

Contact us at hcc-support@unl.edu with your login, the name of the cluster you are running on, and the full path to your submit script and we will be happy to help solve the issue.

I keep getting the error “slurmstepd: error: Exceeded step memory limit at some point.” What does this mean and how do I fix it?

This error occurs when the job you are running uses more memory than was requested in your submit script.

If you specified --mem or --mem-per-cpu in your submit script, try increasing this value and resubmitting your job.

If you did not specify --mem or --mem-per-cpu in your submit script, chances are the default amount allotted is not sufficient. Add the line

#SBATCH --mem=<memory_amount>

to your script with a reasonable amount of memory and try running it again. If you keep getting this error, continue to increase the requested memory amount and resubmit the job until it finishes successfully.

For additional details on how to monitor usage on jobs, check out the documentation on Monitoring Jobs.

If you continue to run into issues, please contact us at hcc-support@unl.edu for additional assistance.

I keep getting the error “Some of your processes may have been killed by the cgroup out-of-memory handler.” What does this mean and how do I fix it?

This is another error that occurs when the job you are running uses more memory than was requested in your submit script.

If you specified --mem or --mem-per-cpu in your submit script, try increasing this value and resubmitting your job.

If you did not specify --mem or --mem-per-cpu in your submit script, chances are the default amount allotted is not sufficient. Add the line

#SBATCH --mem=<memory_amount>

to your script with a reasonable amount of memory and try running it again. If you keep getting this error, continue to increase the requested memory amount and resubmit the job until it finishes successfully.

For additional details on how to monitor usage on jobs, check out the documentation on Monitoring Jobs.

If you continue to run into issues, please contact us at hcc-support@unl.edu for additional assistance.

I keep getting the error “Job cancelled due to time limit.” What does this mean and how do I fix it?

This error occurs when the job you are running reached the time limit than was requested in your submit script without finishing successfully.

If you specified --time in your submit script, try increasing this value and resubmitting your job.

If you did not specify --time in your submit script, chances are the default runtime of 1 hour is not sufficient. Add the line

#SBATCH --time=<runtime>

to your script with increased runtime value and try running it again. The maximum runtime on Swan is 7 days (168 hours).

For additional details on how to monitor usage on jobs, check out the documentation on Monitoring Jobs.

If you continue to run into issues, please contact us at hcc-support@unl.edu for additional assistance.

I want to talk to a human about my problem. Can I do that?

Of course! We have an open door policy and invite you to stop by either of our offices anytime Monday through Friday between 9 am and 5 pm. One of the HCC staff would be happy to help you with whatever problem or question you have. join our Remote Open Office hours, schedule a remote session at hcc-support@unl.edu, or you can drop one of us a line and we’ll arrange a time to meet: Contact Us.

My submitted job takes long time waiting in the queue or it is not running?

If your submitted jobs are taking long time waiting in the queue, that usually means your account is over-utilizing and your fairshare score is low, this might be due submitting big number of jobs over the past period of time; and/or the amount of resources (memory, time) you requested for your job is big. For additional details on how to monitor usage on jobs, check out the documentation on Monitoring queued Jobs.

What IP’s do I use to allow connections to/from HCC resources?

Under normal circumstances no special network permissions are needed to access HCC resources. Occasionally, it may be necessary to whitelist the public IP addresses HCC utilizes. Most often this is needed to allow incoming connections for an external-to-HCC license server, but may also be required if your local network blocks outgoing connections. To allow HCC IP’s, add the following ranges to the whitelist:

129.93.175.0/26
129.93.227.64/26
129.93.241.16/28

If you are unsure on how to do this, contact your local IT support staff for assistance. For additional questions or issues with this, please Contact Us.

Why is my job is showing (ReqNodeNotAvail, Reserved for maintenance) before a downtime?

Jobs submitted before a downtime may pend and show (ReqNodeNotAvail, Reserved for maintenance) for their status. (Information on upcoming downtimes can be found at status.hcc.unl.edu.) Any job which cannot finish before a downtime is scheduled to begin will pend and show this message. For example, the downtime starts in 6 days but the script is requesting (via the --time option) 7 days of runtime. If you are sure your job can finish in time, you can lower the requested time to be less than the interval before the downtime begins (for example, 4 days if the downtime starts in 6 days). Use this with care however to ensure your job isn’t prematurely terminated. Alternatively, you can simply wait until the downtime is completed. Jobs will automatically resume normally afterwards; no special action is required.