Messages & Announcements

  • 2018-11-28:  Crane: GPU driver update completed
    Category:  Maintenance

    The GPU driver updates have been successfully completed and the GPU nodes are back in service.

    **If you are using your own conda environments for GPU jobs, please note:**
    You may need to update certain packages in your conda environment(s) to newer versions to avoid errors with the updated drivers. We highly recommend running a small test job to verify functionality before resuming large-scale jobs (a sketch of one such test follows). If you encounter any issues, please contact us at hcc-support@unl.edu and we will be happy to assist you.
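
    As a quick functionality check, the following is a minimal sketch of a GPU smoke test, assuming a conda environment with PyTorch installed (the package choice and the matrix sizes are illustrative only; adapt it to your own stack):

    ```python
    # gpu_smoke_test.py -- a tiny GPU sanity check to run before large-scale jobs.
    # Assumes a conda environment with PyTorch installed; adapt to your own stack.
    import torch

    def main():
        if not torch.cuda.is_available():
            raise SystemExit("CUDA is not available -- check driver/toolkit versions.")
        device = torch.device("cuda")
        print("Driver sees GPU:", torch.cuda.get_device_name(device))
        # A small matrix multiply exercises the driver, runtime, and device memory.
        a = torch.randn(256, 256, device=device)
        b = torch.randn(256, 256, device=device)
        c = a @ b
        torch.cuda.synchronize()
        print("GPU smoke test passed; result norm:", c.norm().item())

    if __name__ == "__main__":
        main()
    ```

    Run it under a GPU allocation (for example, via srun with a GPU request); a clean run suggests your environment works with the new drivers.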

    Please contact hcc-support@unl.edu with any questions or issues regarding this maintenance.


  • 2018-11-28:  Crane: GPU driver update Wed., Nov 28 @ 9am
    Category:  Maintenance

    The Crane GPU nodes will be unavailable on Wednesday, Nov. 28th starting at 9am to apply GPU driver updates. This update is necessary to support newer versions of the CUDA Toolkit. Jobs will be held prior to the maintenance and will resume normally afterwards. During this time, running a Jupyter Notebook with GPU support will not be possible. We expect the maintenance to be concluded no later than 5pm the same day.

    Please contact hcc-support@unl.edu with any questions or issues regarding this maintenance.


  • 2018-10-23:  Upcoming removal of $WORK on Tusker
    Category:  General Announcement

    During the upcoming relocation of the Tusker resource, scheduled for early 2019, ALL DATA ON THE $WORK FILE SYSTEM WILL BE LOST. In preparation for the migration, USERS WILL NEED TO MOVE ANY IRREPLACEABLE DATA OFF $WORK AS SOON AS POSSIBLE AND NO LATER THAN DECEMBER 15TH. After the relocation is complete, data can be reloaded onto Tusker.

    In the interim, possible storage solutions are $COMMON and Attic. $COMMON offers the convenience of keeping the data accessible from Crane. For additional data security, our extended resource Attic provides a backed-up option at minimal cost. (A sketch of one way to stage data off $WORK appears at the end of this announcement.)

    For more information on the relocation, please see the announcement posted in the September release of the Holland newsletter: https://newsroom.unl.edu/announce/holland/8444/48362

    If you have any questions or concerns regarding this, please contact us at hcc-support@unl.edu so we can assist you.
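
    The following is a minimal sketch of one way to copy a directory off $WORK to $COMMON, assuming the WORK and COMMON environment variables are set in your session and rsync is available; the "my_project" directory name is purely illustrative.

    ```python
    # stage_off_work.py -- copy an irreplaceable directory from $WORK to $COMMON.
    # A sketch only: assumes the WORK and COMMON environment variables are set
    # and rsync is on PATH; the "my_project" directory name is illustrative.
    import os
    import subprocess
    import sys

    def stage(subdir: str) -> None:
        src = os.path.join(os.environ["WORK"], subdir)
        dst = os.path.join(os.environ["COMMON"], subdir)
        if not os.path.isdir(src):
            sys.exit(f"Source directory not found: {src}")
        # -a preserves permissions and timestamps; -v lists files as they copy.
        # The trailing slash on src copies the directory's contents into dst.
        subprocess.run(["rsync", "-av", src + "/", dst + "/"], check=True)

    if __name__ == "__main__":
        stage("my_project")
    ```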


  • 2018-10-19:  Anvil, Crane and Tusker services restored
    Category:  General Announcement

    HCC's datacenter at PKI in Omaha suffered an unexpected power outage the morning of Friday, Oct 19th during a preventative maintenance window.

    This type of maintenance has occurred without issue many times in the past. It requires that the datacenter UPS (battery backup) be bypassed, meaning all equipment relies directly on city power. While the bypass was in place, there was an issue with the city power feed that caused many servers to reboot unexpectedly and various pieces of networking equipment to fail.

    HCC staff have worked throughout the day to restore services and believe all services are now restored. All services hosted at PKI were affected, including:

    - ANVIL: Many VM hosts were rebooted, along with the instances running on those hosts. Please check your instances and contact hcc-support@unl.edu with your instance ID if you have any problems.

    - CRANE / TUSKER: Running jobs were killed; users should check any /home and /work files that may have been open or in the process of being written. Files being written during the power outage are likely lost or corrupted.

    - COMMON Filesystem: Users should check that their files exist and are accessible. Files being written during the power outage are likely lost or corrupted. (One way to flag suspect files is sketched below.)
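
    The following is a minimal sketch of one way to flag files that may have been in flight during the outage, assuming you know the approximate outage window; the scanned path and the timestamps below are illustrative and should be replaced with your own:

    ```python
    # find_suspect_files.py -- list files modified during an outage window; such
    # files may be truncated or corrupted. The window and the scanned directory
    # are examples; substitute the actual outage time and your own /home, /work,
    # or /common path.
    import os
    from datetime import datetime

    OUTAGE_START = datetime(2018, 10, 19, 6, 0).timestamp()
    OUTAGE_END = datetime(2018, 10, 19, 18, 0).timestamp()

    def find_suspects(root: str) -> None:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    # A file that cannot even be stat'ed deserves a closer look.
                    print(f"UNREADABLE: {path}")
                    continue
                if OUTAGE_START <= st.st_mtime <= OUTAGE_END:
                    print(f"CHECK: {path} ({st.st_size} bytes)")

    if __name__ == "__main__":
        find_suspects(os.path.expanduser("~"))
    ```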

    This is the first major power issue at this datacenter in a very long time, and we will investigate and take any possible actions to prevent it from happening again. At this time it appears to have been an unfortunate coincidence: the datacenter was off battery power at the very moment the main power feed failed unexpectedly.

    Please contact hcc-support@unl.edu with any questions or issues resulting from this outage.


  • 2018-10-19:  Anvil, Crane and Tusker impacted by power outage at PKI data center
    Category:  General Announcement

    HCC staff are investigating nodes impacted now and are working to bring the systems back online. A follow-up announcement will be sent once the systems are brought to a production state.

