Messages & Announcements

  • 2014-06-06:  hcc-support upgrade scheduled
    Category:  General Announcement

    The user support ticketing system tied to the email address hcc-support@unl.edu will be upgraded next week. Tickets sent Monday morning between 6:30am and 8:30am may be delayed. If you have not received a confirmation reply by Monday afternoon, please resend the request. A second, brief outage is likely on Wednesday as well; please let us know (directly!) if you experience any difficulties with the ticketing system during this time.

    Best regards,
    David Swanson


  • 2014-05-22:  Crane restored
    Category:  Maintenance

    A core switch failure has been repaired. Crane is back up; queued jobs will begin starting shortly.


  • 2014-05-21:  Unplanned downtime for Crane
    Category:  System Failure

    This outage affects Crane only. A problem has developed with Crane's InfiniBand fabric. The batch partition has been paused to keep new jobs from starting. Running jobs likely experienced failures interacting with the /work file system. HCC staff are looking into the problem.


  • 2014-05-06:  Tusker back online
    Category:  General Announcement

    Tusker is online and ready for use as of this morning. The /work filesystem has been largely restored. A small percentage of files was corrupted and thus lost. Users are asked to check their files, or simply to resume work and monitor the first few submitted jobs carefully. The recovery process may have left some files with permission issues -- let us know if you find any instances of this.


    The Lustre filesystem has been completely upgraded, including a hardware repair, and is now running current versions of the filesystem and the firmware. Other system upgrades were also implemented. A small percentage of files (estimated at 0.006%) was lost altogether -- we realize that if a key file is lost, it is hollow comfort that most files did not suffer that fate. If you find you are missing files, or notice other filesystem issues, contact hcc-support@unl.edu and we will do our best to help you recover as quickly as possible.

    /work is built for performance -- not permanence. There is very little chance we will ever have even a partial backup of /work on any machine in the future. In this filesystem failure, we were fortunate to have a just-in-case copy, made ahead of a major downtime that had been planned for this June. That downtime is no longer planned. (!)

    David Swanson

  • 2014-05-05:  Tusker update
    Category:  Maintenance

    Tusker is close to returning online, but it will likely not be available until tomorrow. The vast majority of the /work filesystem recovery is finished. Final verification steps are being run to ensure the restoration is complete and correct; these steps require Tusker to stay offline for now. A further email will be sent later today with more details.