We are currently facing a service failure of the central storage appliance in our Berlin Kitzingstraße datacenter. This is causing problems with access to NFS shares. We are working on a solution with the highest priority.
The storage cluster appears to have shut down due to a power outage. The power supply has been fixed and the nodes are booting up.
The cluster has not yet been fully recovered. We are still working on a solution.
We're still working on the complete recovery of all cluster nodes.
The functionality of the Isilon storage cluster has not yet been fully restored. Because 2 of the 6 cluster nodes could not be started successfully, the cluster was initially not writable. In the meantime, we are observing successful write operations on the cluster and are taking further measures, such as remounting the shares, to stabilize it.
While we continue to work with the manufacturer's support to restore functionality, we have activated the emergency plan and are preparing to move clients to the second Isilon storage cluster at the Berlin Lützowstraße site. There, the shares are available in a daily synchronized state, i.e. up to 24 hours old. We will try to synchronize the recent changes from the original cluster or, alternatively, provide the old share as an additional read-only mount point.
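For affected clients, the read-only fallback described above could look like the following fstab fragment: the replacement share mounted read-write in its usual place, and the original share mounted read-only at an additional path. This is a minimal sketch; the server names, export paths, and mount points are placeholders, not the actual cluster addresses.

```shell
# Hypothetical /etc/fstab entries (all names and paths are placeholders):
# replacement cluster, mounted read-write at the usual location
isilon2.example.net:/ifs/data/share1  /mnt/share1      nfs  defaults  0 0
# original cluster, exposed read-only at an additional mount point
isilon1.example.net:/ifs/data/share1  /mnt/share1-old  nfs  ro        0 0
```

The read-only (`ro`) option prevents clients from writing to the degraded cluster while still allowing them to retrieve files newer than the last synchronization.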
If timely recovery of the functionality proves impossible, we will begin switching the shares over shortly, without further announcement.
Systems are beginning to recover. We are still working with the highest priority to fix the problem permanently.
We are currently preparing the announced move to the cluster in the second data center.
At 19:50, we started moving customers, one by one, to the cluster in the other data center. We do this one at a time so that the load on the cluster does not increase too quickly.
We have restored all systems. Follow-up work is still in progress.
We were able to restore functionality for the most part by migrating the shares to the replacement system. Where a share could not be migrated, we have contacted the affected customers directly. According to our monitoring systems, all services have been restored.
If you still experience problems, please contact us.
We regret the outage. We are continuing to investigate the exact cause and may contact you regarding further actions.
We monitored the situation in detail overnight and observed stable operations. We will continue to analyze the cause and the effects, and will contact affected customers with a detailed error analysis.
If you have any questions, or if unexpected behavior occurs in your setup, please contact our support directly.