Messages & Announcements

  • 2010-07-02:  PrairieFire Status
    Category:  Maintenance

    PrairieFire filesystem maintenance is nearing completion. The system may remain down until Tuesday (following the July 4th holiday break). Please contact us if you need access to files before then. Firefly and Merritt remain up and available.

    PrairieFire's home filesystem was previously served by a single large NFS server from Sun, which has failed repeatedly over the last month. It appears no data has been lost. Several steps are nearing completion. First, the /home filesystem for Merritt, previously also served from this box, has been moved to other hardware. Moving forward, Merritt and PrairieFire are decoupled. Next, another smaller filesystem that was served by the same box has been moved to other hardware. Finally, a complete replica of /home for PrairieFire nears completion on a similar server. This will allow us to balance the load over both servers and will give us a ready failover system. A third similar box will be optimized for performance and tested after PrairieFire is back online.

  • 2010-06-30:  Service restored to Merritt.
    Category:  General Announcement

    Service has been restored to Merritt. PrairieFire remains inaccessible for maintenance.

    Merritt's /home filesystem has been migrated to a different storage system, and Merritt is now available to run jobs. Users are encouraged to check that their files are intact. PrairieFire remains down while its /home fileservers are reorganized.

  • 2010-07-01:  PrairieFire and Merritt unplanned down time
    Category:  System Failure

    PrairieFire and Merritt are in an unplanned down time due to filesystem issues until further notice.

    The home filesystems for PrairieFire and Merritt are not responsive. These systems will remain down until a suitable solution is found. Firefly remains up, and we will work with you if you need access to resources not readily available there. We will announce further details and time frames here as they become available.

    ***** added 7/1/10 *******
    Maintenance of the home filesystem for PrairieFire requires extended down time. We are moving to a strategy that uses multiple NFS servers, which requires moving and replicating data. This process is progressing without trouble, but is admittedly very time-consuming. PrairieFire will remain off-line at least until the end of this week.

  • 2010-06-23:  PrairieFire Rebooted
    Category:  System Failure

    PRAIRIEFIRE: Rebooted

    Kernel parameters were tweaked on the NFS home fileserver, and it was rebooted to pick up the new settings. We will monitor performance going forward.

  • 2010-06-20:  Power fluctuation affecting PrairieFire
    Category:  System Failure

    Several nodes of PrairieFire went down Sunday afternoon. A power fluctuation affecting the Schorr Center machine room is suspected. Please check any jobs currently running on PrairieFire.

    An apparent power fluctuation affected several machines in Schorr Center, including various PrairieFire worker nodes. These nodes went down early Sunday afternoon. The filesystems and head node were not affected. Further details will be added here from June 21 onward if further investigation warrants them. Most nodes that went down have been restored; a few will require further maintenance. Normal system response is expected at this time.