SysEleven Status and Incidents https://syseleven-status.de Get all incidents by feed http://www.rssboard.org/rss-specification python-feedgen https://www.syseleven.de/wp-content/uploads/2020/10/SysEleven_XL_Logo_quer_RGB.png SysEleven Status and Incidents https://syseleven-status.de de Thu, 01 May 2025 00:06:07 +0000 INCIDENT: SysEleven STACK API issues, region FES <p>Affected Components: <strong>SysEleven Stack API, region FES</strong></p> <p>Incident Start: <strong>2024-12-04 19:07 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-12-04 20:10 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>Accessibility of the SysEleven Stack API is not ensured.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Requests on OpenStack API may return an error code</li> <li>Spawning new virtual machines (VMs) or changing existing resources may fail</li> </ul> <hr /> <p><strong>Update: 2024-12-04 20:00 UTC+01:00 (CET)</strong></p> <p>We identified the likely root cause and a fix is being applied.</p> <hr /> <p><strong>Update: 2024-12-04 20:10 UTC+01:00 (CET)</strong></p> <p>OpenStack API is now working again as expected.</p> 546 Wed, 04 Dec 2024 17:07:00 +0000 INCIDENT: SysEleven STACK issues in region dus2 <p>Affected Components: <strong>SysEleven Stack, region dus2</strong></p> <p>Incident Start: <strong>2024-12-06 17:45 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>We are seeing some network connectivity issues in the region and are investigating.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is degraded for MetaKube Services in dus2 </li> <li>Connectivity to Database as a Service in dus2 is degraded</li> </ul> <hr /> <p><strong>Update: 2024-12-06 18:45 UTC+01:00 (CET)</strong></p> <ul> <li>We identified an issue with one of our gateways and are working on a fix</li> </ul> 547 Fri, 06 Dec 2024 15:45:00 +0000 INCIDENT: Partial outage of Control Planes in Regions FES, DBL, CBK <p>Affected Components: <strong>Control Planes in Regions 
FES, DBL, CBK</strong></p> <p>Incident Start: <strong>2024-12-09 18:30 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>Some cluster control planes are not reachable.</p> <hr /> <p>Update (22:20 CET):</p> <p>We have found a way to mitigate the issues temporarily and start applying the fix.</p> <p>Update (22:50 CET):</p> <p>We applied the fix everywhere. We don't see any broken clusters anymore.</p> <hr /> <p>Update <strong>2024-12-10 12:30 UTC+01:00 (CET)</strong>:</p> <p>We identified the root cause: An unintended side-effect of an upgrade to a kube-proxy setting changed the proxy mode. This created iptables rules which were not cleaned up and through which traffic was dropped.</p> <p>We are taking measures to prevent this in the future.</p> 548 Mon, 09 Dec 2024 16:30:00 +0000 INCIDENT: SysEleven STACK network issues in region DBL <p>Affected Components: <strong>SysEleven Stack, region DBL</strong></p> <p>Incident Start: <strong>2024-12-13 14:09 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-12-13 16:09 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>The DBL network has performance issues</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Network performance degradation </li> </ul> <hr /> <p><strong>Update: 2024-12-13 15:17 UTC+01:00 (CET)</strong></p> <p>Situation is back to normal.</p> <hr /> <p><strong>Update: 2024-12-13 15:57 UTC+01:00 (CET)</strong></p> <p>We notice performance degradation again.</p> <hr /> <p><strong>Update: 2024-12-13 16:09 UTC+01:00 (CET)</strong></p> <p>Situation is back to normal.</p> 549 Fri, 13 Dec 2024 12:09:00 +0000 INCIDENT: SysEleven STACK issues in region DBL <p>Affected Components: <strong>SysEleven Stack, region DBL</strong></p> <p>Incident Start: <strong>2024-12-18 00:47</strong> Incident Start: <strong>2024-12-18 01:27</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is 
restricted</li> </ul> <hr /> <p>Update: <strong>2024-12-18 01:27</strong></p> <ul> <li>We saw problems with cross-region connectivity outgoing from the DBL region between 00:35 and 01:10. At the moment traffic seems to have normalized again; we are still investigating</li> </ul> <hr /> <p>Update: <strong>2024-12-18 02:00</strong></p> <ul> <li>The device causing the network issues was identified; the root cause will be investigated further</li> </ul> 550 Tue, 17 Dec 2024 22:40:00 +0000 INCIDENT: SysEleven STACK issues in region CBK <p>Affected Components: <strong>SysEleven Stack, region CBK</strong></p> <p>Incident Start: <strong>2025-01-06 08:18</strong></p> <p>Incident Start: <strong>2025-01-06 08:50</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> 551 Mon, 06 Jan 2025 08:35:12 +0000 INCIDENT: SysEleven STACK issues in region CBK <p>Affected Components: <strong>SysEleven Stack, region CBK</strong></p> <p>Incident Start: <strong>2025-01-06 10:00</strong></p> <p>Incident End: <strong>2025-01-06 10:30</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> 552 Mon, 06 Jan 2025 10:14:32 +0000 INCIDENT: SysEleven STACK issues in region CBK <p>Affected Components: <strong>SysEleven Stack, region CBK</strong></p> <p>Incident Start: <strong>2025-01-07 09:16</strong></p> <p>Incident End: <strong>2025-01-07 09:45</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> <p>Update: 09:45</p> <p>The situation has stabilized again; we are currently investigating.</p> 553 Tue, 07 Jan 2025 09:25:38 +0000 INCIDENT: minor outage of Database as a Service 
<p>Affected Components: Database as a Service, all regions</p> <p>Incident Start: <strong>2025-01-21 10:30 UTC+01:00 (CET)</strong> Incident End: <strong>2025-01-21 14:45 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>Database as a Service is currently only available via the API and Terraform; the UI is not working</p> <hr /> <p>Update <strong>2025-01-21 14:45</strong></p> <p>We fixed the underlying issue, so Database as a Service is working again in the UI as well</p> 554 Tue, 21 Jan 2025 14:41:37 +0000 INCIDENT: MetaKube Control Plane issues, region FES <p>Affected Components: <strong>MetaKube Control Planes, region FES</strong></p> <p>Incident Start: <strong>2025-02-11 06:00 UTC+01:00 (CET)</strong></p> <p><strong>State: Resolved</strong></p> <hr /> <p>Description:</p> <ul> <li>Accessibility of the MetaKube API is not ensured.</li> <li>After a scheduled maintenance to the network in FES, the MetaKube control cluster (which is hosting the customer control planes) has problems reaching DNS. This is causing issues for the customer control planes.</li> <li>This also affects Database as a Service and Observability as a Service</li> <li>All times below are CET</li> </ul> <p><strong>UPDATE 2025-02-13 12:20</strong> - We consider all service disruptions of the incident to be mitigated - Although we are not expecting any more service disruptions, we are still watching all systems closely</p> <p><strong>Previous Updates in reverse chronological order</strong></p> <p><strong>UPDATE 2025-02-11 10:00</strong></p> <ul> <li>Still investigating the DNS issue. We sent out a notifier to all potentially affected customers.</li> </ul> <hr /> <p><strong>UPDATE 2025-02-11 11:15</strong> - We have used the time since the last update to narrow down the root cause of the incident. We excluded some possibilities but did not find the root cause. 
We are now preparing a partial rollback to downgrade the SDN again.</p> <hr /> <p><strong>UPDATE 2025-02-11 12:05</strong> - We completed a partial OVN/SDN downgrade, however this has not yet resolved the incident. - We are investigating further</p> <hr /> <p><strong>UPDATE 2025-02-11 12:25</strong> - We’re exploring further downgrade approaches (previous rollbacks were, as announced, partial) and are in parallel investigating further.</p> <hr /> <p><strong>UPDATE 2025-02-11 13:00</strong> - As we originally updated the SDN due to a critical security gap, it was decided that we will not perform a full OVN/SDN rollback to the initial state. - We have now activated several teams who will be developing and evaluating different solutions until 1:30 pm. An update on how we proceed will follow then.</p> <hr /> <p><strong>UPDATE 2025-02-11 13:50</strong> - Our teams will continue developing and evaluating solutions in breakout sessions, as there are further leads but no breakthrough yet. - In parallel we are preparing a failover for IAM and Alloy to DUS/HAM</p> <hr /> <p><strong>UPDATE 2025-02-11 15:30</strong> - Our teams' investigation into the SDN traffic loss is ongoing - Our teams continue developing and evaluating solutions and possible workarounds - In parallel we are evaluating a rebuild of the SDN (software defined network)</p> <hr /> <p><strong>UPDATE 2025-02-11 17:47</strong> - Some of the services are still not functional - We are still working hard to resolve the issues but we will roll back the update of the SDN if no progress is made. - The planned maintenance period begins today, 11 February 2025, at 23:00 and is expected to last until around 06:00 CET on 12 February 2025, during which time there may be repeated interruptions to services. - The maintenance is also announced via notifier. 
You will also be informed via notifier when the maintenance ends.</p> <hr /> <p><strong>UPDATE 2025-02-11 20:15</strong> - IAM, Alloy and Observability as a Service are restored to full functionality - The Database as a Service API has also been restored, but the API still has some issues which are related to the wider SDN problem. The Databases themselves were at no point affected by the incident.</p> <hr /> <p><strong>UPDATE 2025-02-11 23:00</strong> - The situation with the API improved. We will continue watching it.</p> <hr /> <p><strong>UPDATE 2025-02-12 10:15</strong> - The previous maintenance work did not achieve the desired success. - Workarounds have been implemented, so operations should be able to continue without disruptions. - Maintenance work will continue during the upcoming night to fully resolve the incident.</p> <hr /> <p><strong>UPDATE 2025-02-12 14:55</strong> - We scheduled another maintenance window for this night, February 12th, from 11:00 PM to 6:00 AM the following day. - During this maintenance window, there may be brief interruptions or limited availability of certain services. - The goal is the complete resolution of the incident caused by Monday's update. 
- An RfO will be available in our Helpdesk after the incident is mitigated.</p> <hr /> <p><strong>UPDATE 2025-02-13 12:20</strong> - We consider all service disruptions of the incident to be mitigated - Although we are not expecting any more service disruptions, we are still watching all systems closely</p> <hr /> <p><strong>UPDATE 2025-02-13 15:30</strong> - Incident is resolved</p> 555 Tue, 11 Feb 2025 08:04:01 +0000 INCIDENT: SysEleven STACK issues in region FES <p>Affected Components: <strong>SysEleven Stack, region FES</strong></p> <p>Incident Start: <strong>2025-03-08 13:00</strong></p> <p>Incident End: <strong>2025-03-08 15:30</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>IPv6 Connectivity is restricted</li> </ul> <hr /> <p><strong>Update: 16:25 </strong></p> <p>Flaky IPv6 connectivity was repaired around 15:30; a configuration mismatch was identified</p> 556 Sat, 08 Mar 2025 14:29:22 +0000 INCIDENT: SysEleven STACK API issues, region CBK <p>Affected Components: <strong>SysEleven Stack API, region CBK</strong></p> <p>Incident Start: <strong>2025-03-13 14:30</strong> Incident End: <strong>2025-03-13 17:15</strong></p> <hr /> <p>Description:</p> <ul> <li>Accessibility of the SysEleven Stack API is not ensured.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Spawning new virtual machines (VMs) or changing existing resources is not possible.</li> </ul> 557 Thu, 13 Mar 2025 14:43:01 +0000 INCIDENT: Partial degradation of SysEleven IAM services <p>We're currently investigating an issue in our IAM systems that prevents creation and deletion of projects.</p> 558 Mon, 17 Mar 2025 10:39:16 +0000 INCIDENT: all platforms <p>Affected Components: <strong>All regions</strong></p> <p>Incident Start: <strong>2025-04-10 23:40 (CEST)</strong></p> <p>Incident End: <strong>2025-04-11 01:08 (CEST)</strong></p> <hr /> <p>Description:</p> <p>We have a major outage in all 
regions</p> <hr /> <p><strong>Update: 01:01 am</strong></p> <p>Issue identified and we are working on it.</p> <hr /> <p><strong>Update: 01:08 am</strong></p> <p>The issue is resolved and we are watching it.</p> 560 Fri, 11 Apr 2025 00:22:15 +0000 INCIDENT: SysEleven STACK Object Storage issues, region FES, HAM1, DUS2 <p>Affected Components: <strong>SysEleven Stack Object Storage, region FES, DUS2, HAM1</strong></p> <p>Incident Start: <strong>2025-04-14 16:26 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>At the moment we are facing issues with the Object Storage in Region FES, DUS2, HAM1.</p> <hr /> <p>Customer Impact:</p> <ul> <li>Some objects might be corrupted due to the latest known issues in Ceph. https://docs.ceph.com/en/latest/releases/squid/#v19-2-2-squid</li> <li>We are investigating the situation and working on updates to mitigate the issues</li> </ul> <hr /> <p><strong>Update: 18:40</strong></p> <ul> <li>The Ceph S3 endpoint of the DUS2 region is currently being updated; after successful testing we will proceed with the HAM1 and FES regions</li> </ul> <p><strong>Update: 19:45</strong></p> <ul> <li>The update for the DUS2 region went through without problems; we are proceeding to update the HAM1 and FES regions</li> <li>Affected customers will be contacted individually once we have a complete overview of potentially affected buckets and objects</li> </ul> <p><strong>Update: 21:10</strong></p> <ul> <li>Update of HAM1 region also finished, FES region is in progress</li> </ul> <p><strong>Update: 22:52</strong></p> <ul> <li>Update of FES region is still ongoing</li> </ul> <p><strong>Update: 15-04-2025 09:36</strong></p> <ul> <li>Update of FES region is still ongoing and everything works as intended</li> </ul> <p><strong>Update: 16:40</strong></p> <ul> <li>Update of FES region is expected to be finished by the end of the business day</li> </ul> <p><strong>Update: 16-04-2025 03:08</strong></p> <ul> <li>Update of FES is finished</li> </ul> 561 Mon, 14 Apr 2025 
16:51:56 +0000 INCIDENT: SysEleven STACK issues in region DBL <p>Affected Components: <strong>SysEleven Stack, region DBL</strong></p> <p>Incident Start: <strong>2025-04-17 10:00</strong></p> <p>Incident End: <strong>2025-04-17 10:45</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <p>Update [2025-04-17 10:28 am]: Alloy (GUI) &amp; API Authentication HAM &amp; DUS, MetaKube API in DBL &amp; CBK are also affected</p> <p>Update [2025-04-17 10:40 am]: DBL Cloud Gateways have been identified as a possible source of the malfunction. We are restoring them and the first services are back online. Next update at 11:00 am</p> <p>Update [2025-04-17 11:00 am]: All services are up and running. We are still watching all environments; the incident is declared as ended</p> 562 Thu, 17 Apr 2025 10:11:22 +0000 INCIDENT: SysEleven STACK issues in region CBK <p>Affected Components: <strong>SysEleven Stack, region CBK</strong></p> <h2>Incident Start: 2025-04-25 15:55:00 UTC+01:00 (CET)</h2> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>We are experiencing a partial outage of the network where some connections might get dropped.</li> </ul> 563 Fri, 25 Apr 2025 17:18:12 +0000 INCIDENT: SysEleven Managed Hosting Network partial outage, region BLU1 <p>Affected Components: SysEleven Managed Hosting Network partial outage, region BLU1</p> <p>Incident Start: 2025-04-25 15:55:00 UTC+01:00 (CET)</p> <p>Description: - Occurring errors are currently being investigated.</p> <p>Customer Impact: - We are experiencing a partial outage of the network where some connections might get dropped.</p> 564 Fri, 25 Apr 2025 17:46:54 +0000 INCIDENT: SysEleven STACK issues in region CBK (Network) <p>Affected Components: <strong>SysEleven Stack Network, region CBK</strong></p> <p>Incident Start: <strong>2025-04-29 17:30 UTC+02:00 (CEST)</strong> Incident End: 
<strong>2025-04-29 17:50 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <ul> <li>Network connectivity to cloud region CBK was impaired</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> <p><strong>Update: 17:59</strong></p> <p>Network connectivity issues were resolved around 17:50</p> 565 Tue, 29 Apr 2025 18:01:27 +0000 INCIDENT: Partial outage of MetaKube control plane in DUS <p>Affected Components: <strong>MetaKube control plane, region DUS</strong></p> <p>Incident Start: <strong>2025-04-30 16:00 UTC+02:00</strong></p> <hr /> <p>Description:</p> <p>The load balancer for the Kubernetes API server does not work properly for some clusters.</p> <hr /> <p>Customer Impact: - Potential connection loss to the API server</p> <hr /> <p>Update: <strong>2025-04-30 16:30 UTC+02:00</strong></p> <p>We have recovered the affected clusters but haven't found the root cause. Right now we can't see any other affected clusters but will continue to monitor the situation.</p> <p>Update: <strong>2025-04-30 16:52 UTC+02:00</strong> We have identified and fixed the root cause of the incident. All clusters are working properly again.</p> 566 Wed, 30 Apr 2025 16:10:27 +0000