- 2017-06-05: Crane /work filesystem downtime resolved
Category: General Announcement
The /work filesystem for Crane is restored as of 2:55pm.
One of the storage servers crashed and rebooted. A filesystem check was completed with no errors found. Running jobs which were accessing /work stalled until the filesystem was restored. This may have caused jobs to exceed their time limit. There was no data loss from this outage.
We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild is still running and we are monitoring the system.
- 2017-06-05: Crane /work filesystem unplanned downtime
Category: System Failure
The /work filesystem for Crane is partially unavailable. One of the storage servers crashed and rebooted. We are now running a filesystem check before placing the server back online. Pending jobs will be held until the maintenance is complete.
The filesystem check has been completed with no errors found. The /work filesystem is back online. Running jobs may be affected, but there was no data loss from this outage.
We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild is still running and we are monitoring the system.
- 2017-05-19: Crane: Maintenance complete
Category: General Announcement
Maintenance is complete on Crane. The changes made to the cluster are listed below. Please let us know of any trouble using the cluster by sending email to hcc-support@unl.edu.
Changes made during this downtime:
- All partitions on Crane now have a uniform maximum time limit of 7 days.
- The opa partition has been removed. To limit job submissions to nodes that have Omni-Path fabric, use the following Slurm directive when submitting jobs to the batch partition:
  #SBATCH --constraint=opa
- The opaguest partition has been renamed to guest. Priority-access nodes having Infiniband and Omni-Path fabrics have been added to this partition. Use
  #SBATCH --constraint=ib
  or
  #SBATCH --constraint=opa
  to limit which nodes your jobs will be considered on. Any job that runs within the guest partition can be preempted by jobs submitted by the owners of the respective hardware.
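The constraint directives above fit into an ordinary Slurm submit script. A minimal sketch follows; the job name, time limit, and payload command are hypothetical examples, not values required by the cluster:

```shell
#!/bin/bash
# Hypothetical submit script: restrict a guest-partition job to Infiniband nodes.
#SBATCH --partition=guest
#SBATCH --constraint=ib        # only schedule on nodes with Infiniband fabric
#SBATCH --time=01:00:00        # example time limit (assumed; max is 7 days)
#SBATCH --job-name=ib-example  # example job name

# Replace with your actual workload.
srun hostname
```

Submit with `sbatch script.sh`; swap `--constraint=ib` for `--constraint=opa` to target Omni-Path nodes instead.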
- 2017-05-17: Sandhills: Maintenance complete
Category: General Announcement
Maintenance is complete on Sandhills. Please let us know of any trouble using the cluster by sending email to hcc-support@unl.edu.
- 2017-05-11: Tusker: Maintenance complete
Category: General Announcement
Maintenance is complete on Tusker. Please let us know of any trouble using the cluster by sending email to hcc-support@unl.edu.