Messages & Announcements

  • 2018-05-31:  Tusker: Login node reboot @ 1pm Monday, June 4th
    Category:  Maintenance

    The Tusker login node will be rebooted at 1pm Monday, June 4th to apply configuration changes. During this time it will be inaccessible via SSH. We expect the maintenance to take less than 30 minutes. Pending or running jobs and transfers via Globus will be unaffected by this reboot. We recommend exiting all SSH sessions prior to the maintenance to avoid issues from dropped connections.


  • 2018-05-24:  Sandhills: Login node maintenance completed
    Category:  Maintenance

    The maintenance on the Sandhills login node has been completed and the system is available for use.


  • 2018-05-22:  Sandhills: Login node reboot @ 1pm Thurs., May 24
    Category:  Maintenance

    The Sandhills login node will be rebooted at 1pm Thursday, May 24th to apply configuration changes. During this time it will be inaccessible via SSH. We expect the maintenance to take less than 30 minutes. Pending or running jobs and transfers via Globus will be unaffected by this reboot. We recommend exiting all SSH sessions prior to the maintenance to avoid issues from dropped connections.


  • 2018-05-19:  Crane: /work filesystem downtime resolved
    Category:  General Announcement

    The /work filesystem for Crane is restored as of 8:00pm.

    One of the storage servers crashed and rebooted. A filesystem check was completed with no errors found. Running jobs which were accessing /work stalled until the filesystem was restored. This may have caused jobs to exceed their time limit. There was no data loss from this outage.

    We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild and filesystem check completed successfully and we are monitoring the system.


  • 2018-05-18:  Crane: /work filesystem unplanned downtime
    Category:  System Failure

    The /work filesystem for Crane is partially unavailable. One of the storage servers is experiencing hardware issues. Pending jobs will be held until the maintenance is complete.


    A hard drive on a Crane storage server failed and was replaced with a spare. During the RAID rebuild, a second drive in the same volume performed poorly, which caused the RAID controller to reset and eventually crashed the server. After removing the drive with performance issues, the RAID rebuild completed successfully.

    Filesystem checks are in progress. Once they complete successfully, the /work filesystem will be brought back online and Crane will return to normal service.
