Operational Status

Current

  • Running normally.

Past Notifications

  • 2018-12-09 to 2018-12-10 An out-of-memory issue on the consortium login node disrupted services such as SLURM and login. This was resolved by a reboot.

  • 2018-09-10 15:00 A fault with DRBD may be causing some issues. Background work to address this will take place from 9am to midday on 11th September. It should not affect the system, but the system should be considered at risk during that period.

  • 2018-09-07 14:00 GPFS is up, but data centre work is continuing, and so the system is considered at risk.

  • 2018-09-07 10:30 GPFS failed, which took out running jobs.

  • 2018-09-07 09:00 Connectivity is up, but some issues with GPFS are apparent.

  • 2018-09-05 13:00 Connectivity is down due to a water leak affecting networking external to the system. Running jobs will continue, and new jobs will be taken from the queue and run; only login is affected. Service is expected to be restored by lunchtime on Friday 7th September.

  • 2018-06-06 13:00 Maintenance on the consortium login node is overrunning.

  • 2018-06-06 09:00 Maintenance started on login nodes.

  • 2017-11-16 Fully operational.

  • 2017-07-18 10:00 The system is up for logins and jobs, although HA is not fully operational while background work continues.

  • 2017-07-17 15:35 The system is currently experiencing temporary issues affecting, among other things, new logins and new job submissions. We do not believe that this will affect running jobs. We expect the system to be up by close of business on 2017-07-17, or early on 2017-07-18.