Messages & Announcements

  • 2016-09-22:  Tusker: /work filesystem performance issues resolved
    Category:  General Announcement

    The Tusker /work filesystem is running normally. A reboot of the Lustre metadata server (MDS) was required.


    The issue on the MDS looked very similar to the performance degradation experienced on 2016-09-07. This time a different process was used to recover the MDS, and the logs indicate that no Lustre clients were evicted. The Slurm logs show that jobs which ended during the MDS recovery window either completed successfully or ran into their time limits.

    These log entries are a promising sign that running jobs blocked while interacting with /work and resumed once the MDS returned to service.

    We still advise you to check the state of any jobs you had running on Tusker, as the degraded performance of /work may have affected jobs that ran into their time limits.
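
    As an illustration of one way to check job state, the sketch below asks Slurm's sacct command for your jobs in a time window and flags any that did not finish normally. It is a minimal example and not an official HCC tool: the function names, the chosen columns, and the list of "suspect" states are assumptions made for this example.

```python
import subprocess

# Assumption: states worth a closer look. COMPLETED jobs are skipped.
SUSPECT_STATES = {"TIMEOUT", "FAILED", "NODE_FAIL", "CANCELLED"}


def parse_sacct(text):
    """Parse pipe-delimited `sacct --parsable2 --noheader` output.

    Expects the columns JobID|JobName|State|End and returns a list of
    (job_id, job_name, state, end) tuples for jobs in a suspect state.
    """
    flagged = []
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        job_id, name, state, end = fields
        # sacct can report e.g. "CANCELLED by 1234"; keep only the first word.
        words = state.split()
        state_word = words[0] if words else ""
        if state_word in SUSPECT_STATES:
            flagged.append((job_id, name, state_word, end))
    return flagged


def query_sacct(start, end):
    """Run sacct for the current user's jobs between start and end.

    Returns the raw text output; this only works on a machine where
    Slurm's sacct command is installed.
    """
    result = subprocess.run(
        [
            "sacct",
            "--allocations",  # one line per job, no individual job steps
            "--parsable2",    # pipe-delimited output
            "--noheader",
            "--starttime", start,
            "--endtime", end,
            "--format", "JobID,JobName,State,End",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

    On a Tusker login node, calling parse_sacct(query_sacct("2016-09-22", "2016-09-23")) would list jobs from that day that timed out, failed, or were cancelled. Adjust the dates to the window you want to inspect.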

    Please contact us at hcc-support@unl.edu with any questions.

  • 2016-09-22:  Tusker: /work filesystem performance issues
    Category:  System Failure

    We are investigating performance issues with the Tusker /work filesystem. There may be disruption to running jobs as we work to correct the issue. We will follow up with details when the maintenance is complete.


    We are investigating performance issues with the Tusker /work Lustre filesystem. Around 5:00am on Thursday, performance of the Tusker /work filesystem dropped significantly.

    To correct the problem, we are restarting Lustre services. This may disrupt running processes which are reading or writing files on /work.

    We will follow up with more details when the maintenance is complete.

  • 2016-07-21:  Draft: file removal from /work on all HCC machines
    Category:  General Announcement

    This notice concerns a policy that affects all HCC machines and potentially all HCC users.
    SUMMARY:
    HCC is implementing a new automated file purge policy on the /work filesystem for all HCC machines. Starting August 1, 2016, we will remove any files on /work that have not been accessed for at least 6 months. This will not affect the /home filesystems or the Attic storage system.


    EXPLANATION:
    The /work filesystem exists on each HCC machine for working files. It is not designed or intended for long-term storage. The /work filesystem periodically fills to near capacity, and files must then be deleted to keep the system as a whole available for ongoing use.

    To date, we have used a somewhat manual process of warning the user community and relying upon voluntary file removal. This is no longer sufficient, given the number of users and the number of accumulated files (e.g. Tusker is currently precariously close to going off-line because /work is nearly full). Going forward, the prior method will be augmented with the automatic removal of all files that have not been accessed for over 6 months. Artificial activity to circumvent this policy will be considered misuse of the system.

    Longer-term file storage is offered by HCC on Attic for an annual fee. This year, that fee has dropped from $100/TB/year to $60/TB/year.
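
    To see which of your files might fall under this policy, the sketch below walks a directory tree and reports files whose last access time is older than a threshold. It is only an illustration: the stale_files name, the use of the file's atime, and the 182-day approximation of "6 months" are assumptions made for this example, and the actual purge may use different criteria.

```python
import os
import time

# Assumption: "6 months" is approximated here as 182 days.
SIX_MONTHS_SECONDS = 182 * 24 * 60 * 60


def stale_files(root, max_age_seconds=SIX_MONTHS_SECONDS, now=None):
    """Yield (path, days_since_access) for files under root whose
    last access time is older than max_age_seconds."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reads metadata only and does not follow symlinks.
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age = now - atime
            if age > max_age_seconds:
                yield path, int(age // 86400)
```

    For example, iterating over stale_files("/work/mygroup/myuser") (a hypothetical path) lists candidate files so they can be reviewed and copied elsewhere if they are still needed.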

  • 2016-09-09:  SANDHILLS available after power outage
    Category:  System Failure

    Partial power outage in SANDHILLS resolved


    A weather-related power outage around 3:30pm caused worker nodes in SANDHILLS to become unavailable. Power has been restored and SANDHILLS is fully operational. Running jobs were killed by the outage, but we believe no files were affected. Please send an email to hcc-support@unl.edu if you find any problems.

  • 2016-09-07:  Tusker: /work filesystem performance issues resolved
    Category:  General Announcement

    The Tusker /work filesystem is running normally. A reboot of the Lustre metadata server (MDS) was required. Unfortunately, this may have caused job failures for processes that were reading from or writing to the /work filesystem at the time. If you had running jobs on Tusker, please check their state, as it may be necessary to resubmit them.

    Please contact us at hcc-support@unl.edu with any questions.