The Delta Slurm resource scheduler software and job database will be upgraded on Monday, December 22nd, from approximately 7 AM to 11 AM. Slurm will be upgraded from version 23.11.9 to 25.11.0. The most noticeable change is that Slurm will treat NUMA memory domains as sockets, enabled by setting the Slurm numa_node_as_socket parameter, which helps with CPU-GPU affinity. If you use the per-socket options of sbatch, srun, or salloc, such as --cores-per-socket or --gpus-per-socket, please check your jobs once the scheduler resumes job scheduling. See the Slurm socket affinity page for more information.
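To find jobs that may be affected by the socket change, one option is to scan your job scripts for per-socket options. The following is a minimal sketch, not an official tool: the function name find_per_socket_options and the example script are hypothetical, and the pattern simply matches any option ending in -per-socket, such as the two named above.

```python
import re

# Matches options such as --cores-per-socket or --gpus-per-socket.
PER_SOCKET = re.compile(r"--[a-z]+-per-socket\b")


def find_per_socket_options(script_text):
    """Return (line_number, option) pairs for each per-socket option found."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        for match in PER_SOCKET.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical job script used only for illustration.
example_script = """#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-socket=2
srun --cores-per-socket=16 ./my_app
"""

for number, option in find_per_socket_options(example_script):
    print(f"line {number}: {option}")
```

Any script that this flags is worth re-checking after the upgrade, since the number of sockets Slurm reports per node may differ from before.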
After the maintenance:
- The RH9 Slurm reservation will be removed and will no longer be needed when submitting jobs from dt-login02, dt-login03, or dt-login04.
- The login node dt-login01 and one quarter of the compute nodes will be available for users who have not migrated to the upgraded OS.
- The RH8 reservation will be set for jobs submitted from dt-login01.
During the maintenance period:
- All compute nodes will be unavailable.
- Login nodes, file systems, Globus endpoints and other non-job related services will remain available.
- The Delta OnDemand service will be available, but interactive applications that require a compute node, such as Jupyter notebook, VS Code server, and X Desktop, will not be able to run.
A reservation will be in place to prevent jobs from running into the maintenance period.
Please check job time requirements as December 22nd approaches so that jobs can still be scheduled as the reservation drains the available nodes. A job can only start if its time limit allows it to finish before the reservation begins, so reducing the time limit to account for the start time of the reservation will allow jobs to run.
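As a rough illustration of the time limit adjustment, the sketch below computes the longest time limit that still ends before the reservation starts and formats it in Slurm's days-hours:minutes:seconds form. The function name max_time_limit and the 5-minute safety margin are hypothetical choices, and the year 2025 is an assumption, since the announcement gives only the day and month.

```python
from datetime import datetime, timedelta


def max_time_limit(now, reservation_start, margin=timedelta(minutes=5)):
    """Return the longest time limit (D-HH:MM:SS) that ends before the
    reservation starts, or None if no time is left."""
    remaining = reservation_start - now - margin
    if remaining <= timedelta(0):
        return None
    total = int(remaining.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# Assumed maintenance start: Monday, December 22nd at 7 AM (year assumed).
maintenance_start = datetime(2025, 12, 22, 7, 0)

limit = max_time_limit(datetime.now(), maintenance_start)
if limit is None:
    print("The reservation has started or is too close to fit a job.")
else:
    print(f"Longest usable time limit: {limit}")
```

The resulting value could then be passed to sbatch with the --time option. The margin is only a cushion against scheduling delay between computing the limit and the job actually starting.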
The resource scheduler will resume once the maintenance is complete.
Please submit comments or questions by using the NCSA Help portal (https://help.ncsa.illinois.edu/) or by email to help@ncsa.illinois.edu.