Global maintenance on Lustre and Infiniband (completed)
From: 26.09.2018 / 12:00
To: 02.10.2018 / 20:00
Dear HPC users,

we will update our Lustre file systems in the first week of October. We expect this update to fix our current Lustre bugs.

On Monday, October 1, we will unmount the /lustre/ssd file system and completely update the Lustre servers. We have no redundancy here, so the SSD file system will not be available during this time.

On Tuesday, October 2, we are going to update /scratch. Here, we can use the high-availability setup to update the first half of the servers, then switch the file systems and update their buddies. During this "switch" we will move the servers to new IP addresses and update the Infiniband subnet manager as well. At that time we will have a complete stop of all HPC systems, including login and export nodes.

We have to make sure that no data is compromised during these actions. For this reason, we have set a maintenance reservation for Tuesday. Optimistically, this downtime will last about 2 hours. We will also use this maintenance to improve the stability of the batch system.

If you are working on /lustre/ssd, please read the following carefully: Use the Slurm option "--license=ssd" to indicate that your job can only run with the SSD file system. On Friday, September 28, we will stop scheduling jobs with this option to prevent them from running on October 1. All jobs that use /lustre/ssd and are still running at the beginning of the maintenance will be forcefully canceled.
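
As an illustration only, a batch script that declares its dependency on the SSD file system might look like the sketch below. The "--license=ssd" line is the option named above; the job name, the time limit, the WORKDIR variable and the ssd_available helper are hypothetical placeholders, and the directory test is only a crude check that the path exists, not proof that the file system is mounted.

```shell
#!/bin/bash
#SBATCH --job-name=ssd-example   # hypothetical job name
#SBATCH --time=01:00:00          # hypothetical wall-time limit
#SBATCH --license=ssd            # option from the announcement: job needs /lustre/ssd

# Hypothetical helper: succeeds if the given directory exists.
# This is a crude check and does not prove that the file system is mounted.
ssd_available() {
    [ -d "$1" ]
}

# Hypothetical working directory; adjust to your own path below /lustre/ssd.
WORKDIR="${WORKDIR:-/lustre/ssd}"

if ssd_available "$WORKDIR"; then
    echo "Working in $WORKDIR"
    # ... start the actual computation here ...
else
    echo "Skipping work: $WORKDIR does not exist on this node" >&2
fi
```

The same option can presumably also be passed on the command line when submitting, for example "sbatch --license=ssd job.sh", where job.sh is a placeholder for your own script.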

Sorry for the inconvenience,

Ulf Markwardt

PS: We don't expect data loss on /scratch due to the redundancy of the servers. Please keep in mind that data on /lustre/ssd might get lost.
