Messages & Announcements

  • 2017-08-16:  Crane unexpected downtime, reboot of login node
    Category:  System Failure

    On Wednesday evening, the Crane login node, crane.unl.edu, had a software issue causing running processes to hang. The service is now running normally.

    Fixing the issue required a reboot of the login node. Running jobs were not affected. We apologize for the inconvenience.


  • 2017-06-05:  Crane /work filesystem downtime resolved
    Category:  General Announcement

    The /work filesystem for Crane is restored as of 2:55pm.

    One of the storage servers crashed and rebooted. A filesystem check was completed with no errors found. Running jobs which were accessing /work stalled until the filesystem was restored. This may have caused jobs to exceed their time limit. There was no data loss from this outage.

    We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild is still running and we are monitoring the system.


  • 2017-06-05:  Crane /work filesystem unplanned downtime
    Category:  System Failure

    The /work filesystem for Crane is partially unavailable. One of the storage servers crashed and rebooted. We are now running a filesystem check before placing the server back online. Pending jobs will be held until the maintenance is complete.


    The filesystem check has been completed with no errors found. The /work filesystem is back online. Running jobs may be affected, but there was no data loss from this outage.

    We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild is still running and we are monitoring the system.

  • 2017-05-19:  Crane: Maintenance complete
    Category:  General Announcement

    Maintenance on Crane is complete. The changes made to the cluster are listed below. If you have any trouble using the cluster, please email hcc-support@unl.edu.


    Changes made during this downtime:

    All partitions on Crane now have a uniform maximum time limit of 7 days.

    The opa partition has been removed. To restrict a job to nodes that have Omni Path fabric, add the following Slurm directive when submitting to the batch partition:

    #SBATCH --constraint=opa
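
    As a minimal sketch of where this directive sits in a submission script, the example below combines it with the batch partition. The job name, task count, and one-hour time limit are illustrative assumptions, not values from this announcement; replace the final echo with your own workload.

```shell
#!/bin/bash
#SBATCH --partition=batch       # the opa partition no longer exists
#SBATCH --constraint=opa        # only consider nodes with Omni Path fabric
#SBATCH --time=01:00:00         # assumed limit; the cluster maximum is 7 days
#SBATCH --ntasks=1              # assumed task count
#SBATCH --job-name=opa-example  # hypothetical job name

# Slurm sets SLURM_JOB_ID inside a job; fall back to a marker when the
# script is run directly with bash for a quick syntax check.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
node="$(uname -n)"

echo "Job ${job_id} running on ${node}"
```

    Saved under a name of your choice (for example opa-example.sh, a hypothetical filename), the script would be submitted with sbatch opa-example.sh.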

    The opaguest partition has been renamed to guest. Priority-access nodes with InfiniBand and Omni Path fabrics have been added to this partition. Use

    #SBATCH --constraint=ib
    or
    #SBATCH --constraint=opa

    to restrict which nodes your jobs are considered for. Any job running in the guest partition can be preempted by jobs submitted by the owners of the respective hardware.
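
    Because guest jobs can be preempted, it may help to write them so they can be restarted. The sketch below is one possible layout, not an HCC-recommended recipe: it requests the guest partition with the ib constraint and the 7-day maximum, and asks Slurm to allow requeueing. Whether a preempted job is actually requeued depends on the cluster's preemption settings, which this announcement does not state, and the resume logic here is a placeholder.

```shell
#!/bin/bash
#SBATCH --partition=guest       # renamed from opaguest
#SBATCH --constraint=ib         # or opa; restricts which nodes are considered
#SBATCH --time=7-00:00:00       # days-hours:minutes:seconds; the 7-day maximum
#SBATCH --requeue               # permit requeueing (assumes the cluster requeues preempted jobs)

# SLURM_RESTART_COUNT is set by Slurm when a job has been requeued or
# restarted; default to 0 for a first run or when run outside Slurm.
restarts="${SLURM_RESTART_COUNT:-0}"

if [ "$restarts" -gt 0 ]; then
    # Placeholder: a real job would reload its own checkpoint here.
    echo "Restart number ${restarts}: resuming from saved state"
else
    echo "First run: starting from scratch"
fi
```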

  • 2017-05-17:  Sandhills: Maintenance complete
    Category:  General Announcement

    Maintenance on Sandhills is complete. If you have any trouble using the cluster, please email hcc-support@unl.edu.

