ProLion’s ClusterLion solution intervenes to prevent data loss and downtime after air conditioning failure in German university’s data centre goes undetected
The University of Bonn in Germany avoided a significant IT outage earlier this year thanks to ClusterLion, the high-availability solution for managing business applications, systems and data developed by ransomware protection and data integrity solutions provider ProLion. ClusterLion detected a power outage in one of the university’s data centres and initiated an automated failover to a second data centre. The failover kept the university’s systems up and running and prevented large-scale data corruption and loss.
One Sunday in spring of this year, the air conditioning system in one of the University of Bonn’s data centres broke down. The breakdown went undetected, and the temperature quickly rose to a critical level that caused the data centre’s power supply to fail.
The shutdown directly affected the NetApp MetroCluster of servers housed in the data centre. Typically, if there is an outage in a data centre hosting a MetroCluster, a switchover needs to be immediately initiated to keep the servers and systems running. Without the switchover, the servers are unavailable.
However, even with synchronous data mirroring, there is no guarantee that an automatic switchover would prevent what is called ‘split-brain syndrome’ upon a restart. Split-brain syndrome occurs when a cluster of nodes is divided into smaller clusters of equal numbers of nodes, each of which believes it is the only active cluster. This can often result in data corruption or loss.
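The split-brain condition described above can be illustrated with a small toy model. The sketch below is not ProLion or NetApp code; the function names and the node labels are hypothetical and chosen only for illustration. It compares a naive rule, where every isolated group of nodes assumes it is the sole survivor, with a strict-majority rule, where a group may stay active only if it holds more than half of all nodes.

```python
def active_partitions_naive(partitions):
    """Naive rule: every non-empty group that has lost contact with the
    others assumes it is the only surviving cluster and stays active."""
    return [p for p in partitions if p]


def active_partitions_majority(partitions, total_nodes):
    """Majority rule: a group stays active only if it holds a strict
    majority of all nodes, so at most one group can ever qualify."""
    return [p for p in partitions if len(p) * 2 > total_nodes]


def is_split_brain(active):
    """Split-brain means more than one group believes it is active."""
    return len(active) > 1


# Hypothetical four-node cluster split evenly across two sites.
even_split = [{"a1", "a2"}, {"b1", "b2"}]

naive = active_partitions_naive(even_split)
majority = active_partitions_majority(even_split, total_nodes=4)
```

With an even split, the naive rule leaves both halves active, which is the split-brain case, while the majority rule leaves neither half active. That second outcome avoids corruption but also means no side is serving data, which is why an external arbiter is typically needed to decide which side continues.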
ProLion’s ClusterLion solution solves this problem because it is completely independent of the infrastructure, ensuring that all systems stay ‘always-on’ without any downtime.
At the University of Bonn, ClusterLion prevented a full system outage by immediately detecting the power failure and within a few seconds initiating an automatic failover to a second data centre. As a result, all data and services remained online and the university’s staff and students could continue to use the network without interruption.
ClusterLion ensured that the electricity supply to the affected cluster remained interrupted until repairs were completed, which prevented a premature reboot. Once the air conditioning system had been repaired, all systems in the affected data centre were rebooted.
Commenting on the incident, Alexander Meyer, System Integration Specialist at the University of Bonn, said, “Thanks to the fully automatic intervention of ClusterLion, a massive amount of downtime, remediation work, data loss and therefore also numerous extra hours by the storage team was avoided.”
The ClusterLion solution supports always-on availability by initiating an automatic storage switchover in the event of an outage in NetApp MetroCluster environments. In the event of a failure, ClusterLion first powers off the affected side to avoid split-brain syndrome and then automatically initiates a switchover to the other side. As a result, services switch automatically to a different data centre and data remains continuously available.
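The order of operations described above, powering off the affected side before switching over, can be sketched as follows. This is a minimal illustration under stated assumptions, not ClusterLion’s actual implementation: the `Site` class, the `power_off` helper and the `handle_outage` function are all hypothetical names, and real fencing would act on power hardware rather than on a flag.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical model of one side of a two-site cluster."""
    name: str
    powered: bool = True
    serving: bool = False


def power_off(site):
    """Hypothetical fencing step: cut power so the site cannot come
    back on its own. Returns True once the site is confirmed off."""
    site.powered = False
    site.serving = False
    return not site.powered


def handle_outage(failed, survivor, fence=power_off):
    """Fence the failed site first; switch over only if fencing is
    confirmed, so two sites can never serve data at the same time."""
    events = []
    if not fence(failed):
        events.append(f"fencing of {failed.name} unconfirmed, no switchover")
        return events
    events.append(f"fenced {failed.name}")
    survivor.serving = True
    events.append(f"switched over to {survivor.name}")
    return events


# Example: site A fails while it is serving data, site B takes over.
site_a = Site("A", serving=True)
site_b = Site("B")
events = handle_outage(site_a, site_b)
```

Making the switchover conditional on confirmed fencing is the design choice that matters here: if the failed side cannot be shown to be off, the sketch refuses to promote the other side rather than risk both sides being active.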
Robert Graf, CEO at ProLion, said, “Organisations everywhere are aware of the growing frequency and range of unexpected disruptive events that can potentially bring down their data centres and paralyse their systems – that’s why we’ve seen a rise in interest and demand recently for ClusterLion and our other continuity solutions for ONTAP storage-based data centres.
“As this incident shows, cyber attacks are not the only threat that organisations must prepare for,” continued Graf. “Power outages caused by phenomena such as heatwaves, flash flooding or energy blackouts are all risks that can cause system failure. In these increasingly uncertain times, it’s incumbent upon organisations to put in place effective, fast-acting counter-measures to mitigate the damage and disruption from these events with proactive automated data integrity solutions.”
ProLion GmbH is a developer of ransomware protection and data integrity software solutions for any ONTAP-focused storage environment and of high-availability solutions for SAP and MetroCluster environments.
ProLion was founded in Austria. Its best-of-breed CryptoSpike solution eliminates system downtime and data loss risk and ensures that an organisation’s data remains secure, compliant, manageable and accessible.