UCloud Status

Ongoing

Extraordinary maintenance
We have scheduled an extraordinary maintenance window on short notice because we need to roll out some important changes to the system. All compute nodes will be rebooted during the maintenance window and might be unavailable for up to two hours. All running jobs will be terminated. May 19, 20:00 - now

Updates
Issues with CPU nodes
We are experiencing some stability issues with the new cpu-amd-zen5 machines. As a result, they sometimes need to be rebooted, affecting all jobs running on the node. We are working on finding a solution. May 6, 15:32 - now

Updates

May, 2026

Maintenance window from our ISP
Our ISP has announced a maintenance window on Friday (May 8th) that will very likely disrupt our internet connection to Bitten. May 8, 06:00 - May 8, 13:38

Updates
Power loss at SDU
SDU is currently experiencing a power loss, which is affecting services in our data center. It is also affecting the storage system at Bitten, which cannot talk to the second tier at the SDU site. May 6, 16:10 - May 6, 21:00

Updates
Internet connection down
We temporarily lost our internet connection to the new data center. The problem has been identified by the ISP and we are waiting for them to solve it. May 5, 15:23 - May 6, 05:05

Updates
The connection was finally restored again around 5am. May 6, 05:20

April, 2026

Extended UCloud downtime
We are moving the UCloud platform to a new data center and this will require extended downtime. Downtime will start on Monday, April 27th, and continue for up to one week. The system is expected to be back online Monday, May 4th, at the latest. Apr 27, 08:00 - May 4, 07:00

Updates
The migration has been completed and the system is back online. May 4, 07:01
We are now taking UCloud offline for the data center migration. Apr 27, 07:59

March, 2026

Background tasks terminated for maintenance
Unfortunately, we have had to terminate background tasks (file transfers and copy operations). This was needed to deploy an update, which should improve the stability of the same feature. The background tasks have to be resubmitted to continue. We apologize for the inconvenience. Mar 17, 10:40 - Mar 17, 10:40

Updates

February, 2026

UCloud downtime
UCloud was down from 01:00 until 08:55 due to a bug in the UCloud code. We have temporarily disabled the broken code and are working on deploying a permanent fix. UCloud is expected to work normally while we work on a fix. Feb 1, 01:00 - Feb 1, 08:55

Updates

January, 2026

Problems with Nvidia H100 cards
After an update of the Nvidia drivers we are now experiencing problems with the Nvidia H100 (u3-gpu) machines. We will start downgrading the drivers again next week to restore full functionality. Jan 31, 11:32 - Feb 4, 07:47

Updates
Drivers have been downgraded on all machines that experienced problems. Feb 4, 07:48
Unable to attach public IPs
We are aware of an issue that prevents public IPs from being attached to machines correctly. We are working on a fix. Jan 29, 07:04 - Jan 29, 09:30

Updates
The issue has been resolved. Jan 29, 09:30
Downtime for all services
UPDATE: This maintenance window has been expanded and it now covers ALL services offered by SDU eScience.
---
We will be performing maintenance on January the 28th between 12:00 and 20:00. UCloud will be down during this period and jobs on SDU/K8s and AAU/K8s will be terminated at the start of the maintenance window. Jan 28, 12:00 - Jan 28, 17:27

Updates
Update has completed. Please report errors that you may find. Jan 28, 17:28
First part of the maintenance has been completed. We are now working on deploying the new version of UCloud. Jan 28, 15:20

December, 2025

Downtime for all services
On December 1st there will be scheduled maintenance in the SDU data center, which requires a complete shutdown of all servers. For this reason, UCloud and all other services offered by SDU eScience will be unavailable during the entire working day. We expect systems to be back online late in the afternoon. Dec 1, 07:00 - Dec 1, 15:16

Updates
All services should be back up and running. Dec 1, 15:16

November, 2025

Hardware problems for u2-gpu
The u2-gpu machines are currently experiencing hardware problems. User jobs are able to run, but they can be killed at any point due to maintenance. Nov 18, 09:00 - Dec 3, 11:32

Updates
The hardware has been replaced and all GPUs are working again. Dec 3, 11:33
The machine has been powered off and hardware replacements should be performed later today. Dec 3, 09:21
UCloud is experiencing issues
UCloud is currently experiencing issues. We are working on fixing the problem. Nov 17, 19:26 - Nov 17, 20:21

Updates
UCloud partially unavailable
During the night UCloud had an internal issue, which made it impossible to access jobs and files. Nov 14, 00:12 - Nov 14, 06:48

Updates
UCloud has been updated
UCloud has been updated with the latest round of bug fixes and improvements to the UI. As always, this may have caused a few minutes of disruption to the service. Sorry for the inconvenience. Nov 11, 09:37 - Nov 11, 09:36

Updates

October, 2025

UCloud jobs will be terminated
On Sunday, October 26th, several UCloud jobs that have been running for more than 14 days will be terminated due to hardware maintenance. Oct 26, 10:00 - Oct 26, 18:07

Updates
The jobs have been terminated. Oct 26, 18:07
SDU/K8s unavailable
The SDU/K8s provider was unavailable between 12/10/25 15:56:34 and 12/10/25 17:04:25 due to a software bug. A bug fix has been released now (13/10/25 07:00). Oct 12, 15:56 - Oct 12, 17:04

Updates

September, 2025

UCloud is being restarted for an update
UCloud is being restarted for an update. The update will take a few minutes. Sep 30, 08:50 - Sep 30, 08:52

Updates
Two H100 nodes rebooted
Two of the H100 nodes were rebooted this morning due to hardware maintenance, the first one around 9:00 and the second one around 10:30. A couple of jobs were stopped during the reboot. Sep 26, 08:00 - Sep 26, 11:05

Updates
Compute nodes unresponsive
Around 8:50 this morning we started decommissioning an old storage system, which unfortunately affects the SDU/K8s compute nodes, making them partially unresponsive. Sep 22, 08:52 - Sep 22, 10:45

Updates
Things should finally be returning to normal. If jobs are stuck for more than 10 minutes, start a new one. Sep 22, 10:31
Power fluctuation caused machines to reboot
A fluctuation in the power grid caused around 30 machines to reboot in the UCloud server room. Jobs running on the machines were terminated during the reboot. Sep 9, 21:45 - Sep 10, 06:30

Updates
Issues with nodea0-19
The machine 'nodea0-19' has been rebooted due to an error with one of the GPU cards. We are monitoring the node for a potential hardware issue. Sep 2, 07:35 - Sep 19, 13:19

Updates
The faulty GPU has been replaced. Sep 19, 13:19
Hardware maintenance has been initiated. Sep 19, 12:46
A support case has been opened to get the card replaced. Sep 10, 07:49
The error has reappeared, the card will most likely need to be replaced. Sep 4, 07:48

August, 2025

UCloud degraded
Between 00:21 and 08:00 UCloud was running with elevated error rates causing the job page to not load correctly. The issue has been resolved. Aug 29, 00:21 - Aug 29, 08:01

Updates
Jobs unable to start on SDU/K8s
We are aware of a situation causing jobs to not start. We believe this is related to the storage system. We are currently investigating the situation. Aug 20, 12:26 - Aug 20, 15:22

Updates
'Type 1 - KU' storage allocations unavailable
Projects receiving storage allocations from "Type 1 - KU" for SDU/K8s were temporarily unavailable. The allocations should be available once again now. Aug 15, 11:03 - Aug 15, 11:02

Updates
SDU/K8s usage tracked incorrectly for storage
We are aware of an issue causing usage to be tracked incorrectly for storage. We will restart the system to fix this issue. Aug 14, 11:27 - Aug 14, 11:39

Updates
We have completed a reset of usage numbers in storage which we believe will fix the issue. The system is now back online after a few minutes of downtime. Aug 14, 11:39
SDU/K8s some jobs are slow to start
We are experiencing slower storage performance on some nodes, which is leading to jobs being slow to start. Jobs that would ordinarily start within a minute can now take up to a few minutes before they start. We are monitoring the situation. Aug 14, 08:54 - Aug 15, 13:07

Updates
SDU/K8s downtime
An issue leading to a crash has been resolved. The cause is under investigation. Aug 13, 17:30 - Aug 13, 17:55

Updates
Syncthing temporarily disabled
Syncthing has temporarily been disabled while we investigate an issue with the system. We do not believe that the system will be re-enabled today (13/08/25). We hope to have an update with ETA or fix tomorrow (14/08/25). Aug 13, 15:58 - Aug 15, 13:07

Updates
Syncthing has been re-enabled. We will monitor the situation. If instability is reintroduced then we may need to disable it again. Status updates will be posted here if this becomes needed. Aug 15, 13:08
We are currently testing a fix internally and hope to deploy the fix tomorrow (15/08/25). Aug 14, 14:08
Jobs unable to start
We are investigating an issue related to jobs not starting. Aug 13, 14:52 - Aug 13, 15:58

Updates
The issue is no longer present. We will continue to monitor the system. Aug 13, 15:58
SDU/K8s maintenance
UPDATE: The maintenance has been moved from the 12th of August to the 13th of August. The SDU/K8s (DeiC Interactive HPC, SDU) service provider will be down on the 13th of August (13/08/2025). All jobs will be killed prior to the maintenance and will not be automatically restarted. The maintenance is expected to take place between 08:00 and 16:00. Aug 13, 08:00 - Aug 13, 13:45

Updates
Maintenance has been completed. We will be monitoring the system over the coming hours and days. Aug 13, 13:45
Preparation phase of the maintenance took longer than expected. The primary work has started now and the provider is no longer accessible as noted in the original maintenance notice. Aug 13, 09:56
UCloud interactive interfaces not working
We are investigating an issue which causes the "Open interface" button to not appear on the SDU/K8s provider. Aug 8, 08:50 - Aug 8, 09:03

Updates
The issue has been resolved. Aug 8, 09:03
UCloud is experiencing issues
We are looking into issues with UCloud, which cause most functionality to stop working. Aug 7, 21:15 - Aug 8, 08:50

Updates
The problem has been identified and the system is operational again. Aug 8, 08:45
AAU/K8s maintenance
The AAU/K8s (DeiC Interactive HPC, AAU) service provider will be down on the 5th of August (05/08/2025). All jobs will be killed prior to the maintenance and will not be automatically restarted. The maintenance is expected to take place between 08:00 and 16:00. Aug 5, 08:00 - Aug 5, 10:29

Updates
Maintenance has completed and the system is available. We are monitoring the system for possible bugs. Aug 5, 10:29

July, 2025

Short network outage
This morning we had an issue with the internal network, which caused disconnects across the entire infrastructure for around 15 minutes. We are monitoring the system for related issues. Jul 9, 08:43 - Jul 9, 16:00

Updates
We believe most things should be working again, but we are still monitoring the system for potential issues. Jul 9, 11:53
It seems there are still some services that have not properly recovered. Jul 9, 10:20

June, 2025

Slow response times
We are currently investigating an issue causing slow response times. Jun 24, 12:52 - Jun 24, 13:17

Updates
The issue has been resolved. Jun 24, 13:17
UCloud jobs might be slower than normal
On Monday evening (June 23rd) a machine in our Ceph storage system failed. This is affecting u1-standard and u1-fat nodes, where jobs might run slower than usual, because the storage system is under heavy load while rebuilding data redundancy. Jun 23, 22:15 - Jul 16, 11:42

Updates
Issues with H100 GPU node
We are experiencing hardware issues with one of the u3-gpu nodes on UCloud, resulting in the machine sometimes powering off. We are monitoring the situation. Jun 16, 07:48 - Aug 14, 15:00

Updates
The machine has been repaired and returned to production. Aug 14, 15:00
The issue persists and the machine has been taken out of production until the problem has been solved. Jul 3, 09:30
UCloud apps interface not working
This morning we had an issue where accessing the interface of UCloud apps simply resulted in "Not Found". The problem should now be solved. Jun 11, 06:00 - Jun 11, 07:57

Updates
Internal error when submitting jobs
UCloud has experienced issues with job creation beginning around Friday 23:28. This has likely caused some jobs to be terminated with an incorrect "insufficient funds" message. The issue has now been resolved and we are monitoring the situation. Jun 6, 23:28 - Jun 8, 10:30

Updates
Unexpected machine reboots
This morning 15 machines rebooted in the UCloud server room due to a fluctuation in the power grid. All machines are running again, but jobs running on the machines were terminated. Jun 2, 06:25 - Jun 2, 06:30

Updates
