Messages & Announcements

  • 2015-08-10:  Tusker unplanned downtime
    Category:  System Failure

    UPDATE: Tusker emergency system maintenance starting Monday morning

    Emergency system maintenance is necessary to correct the issues present on Tusker. The corrective steps will include a Lustre filesystem consistency scan, which requires taking the system offline. We anticipate the scan will take many hours to complete. Follow-up checks may be necessary after this initial scan, so at this time we cannot estimate when the system will be returned to service. Jobs which are running will be re-queued but held in pending state. To minimize issues, the login and tusker-xfer systems will not be available during the filesystem scan. We will send further announcements when circumstances warrant or when the system is made available again.



  • 2015-08-09:  Tusker: Issues with /work filesystem on Tusker (ongoing)
    Category:  System Failure

    The Lustre filesystem for /work on Tusker is currently experiencing issues. We suspect the root cause began Thursday or Friday of last week, but at that time there were no symptoms. Currently the filesystem is read-only for the majority of nodes and users, and attempts to write will almost certainly fail. Many files are still readable, so we will not do an emergency shutdown over the remainder of the weekend.

    The situation will be addressed in detail on Monday when HCC staff are able to focus on it. Until then expect /work to remain read-only. We will send an additional announcement once the problem is understood in more detail or resolved.



  • 2015-08-03:  OpenMPI 1.6 mea culpa
    Category:  General Announcement

    We will not be removing OpenMPI 1.6 today; further, there are no plans to do so in the near future. Upon further investigation, the correlation between catastrophic job failures and use of OpenMPI 1.6 was only that, a correlation. There does not appear to be any urgent need to upgrade from OpenMPI 1.6 to OpenMPI 1.8. It is best practice to use current software, and if you have already upgraded and seen no issues, all is well. If this caused you any inconvenience, I apologize.


    Many applications run fine with either OpenMPI 1.6 or 1.8. There have been some issues with OpenMPI 1.8, however, just as there have been with the other deprecated versions of OpenMPI. Whatever the details, if you see unexpected issues, please contact us at hcc-support@unl.edu.

  • 2015-08-01:  RESCHEDULED: HCC Sandhills Downtime Planned - August 10, 2015
    Category:  General Announcement

    Sandhills will have a system downtime starting 8:00am August 10 for system updates. Because of the number of updates to various components of Sandhills, it is anticipated this downtime will take more than a day to complete.


    To minimize the impact to running jobs we are declaring a downtime for Sandhills to complete this work. We will use this maintenance window to update various software components across the cluster. The Sandhills login node will also be updated and will require users to log off. Users will be denied access to the Sandhills login node until the maintenance is completed.

  • 2015-07-23:  Sandhills only: OpenMPI update delay
    Category:  General Announcement

    Sandhills will have a one-day downtime on August 12 for system updates. If you plan to recompile in order to upgrade the MPI version, please wait until after that date. Errors due to the MPI version are currently very rare on Sandhills.