The CrowdStrike Outage: A Wake-Up Call for Data Reliability, Availability and Resiliency

Octavian Tanase
Chief Product Officer

July 26, 2024


A global IT outage, triggered by a faulty software update from cybersecurity firm CrowdStrike, brought critical infrastructure to a standstill on Friday.

I am writing today for two reasons. First, I want to assure our customers that Hitachi Vantara has not been impacted by the ongoing Microsoft Windows Blue Screen of Death (BSOD) outage. Our business operations, including our ITaaS and hybrid cloud offerings, continue to run smoothly, and we will keep monitoring them to ensure they remain uninterrupted. Second, however problematic this event may be, it can serve as a wake-up call for data reliability, availability and resiliency.

The disruption, whose costs could surpass $1 billion, affected businesses worldwide and underscored the fragility of our interconnected digital world and the risks of relying heavily on centralized cloud services. Although the root cause was a technical glitch rather than a cyberattack, the outage exposed the potential consequences of service disruptions for business operations.

The High Cost of Downtime

The impact of the CrowdStrike outage was far-reaching. Numerous organizations experienced significant operational challenges, and the event highlighted the financial implications of downtime, including lost revenue, customer churn, and damage to brand reputation. According to Pingdom, the average cost of IT downtime is $100,000 per hour. The outage also underscores the legal and regulatory risks associated with data inaccessibility, especially for industries subject to strict compliance standards, such as banking and financial services, healthcare, transportation, energy, airlines and insurance. Beyond the immediate impact, such incidents can have a lasting effect on reputation: a study by the Ponemon Institute found that the average cost of a data breach to a company's reputation is $7.2 million.
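As a rough illustration of how the hourly figure cited above adds up, the short Python sketch below multiplies an outage duration by an hourly rate. The function name and the use of a flat rate are assumptions made for illustration; real costs vary widely by organization and do not include the reputational or regulatory effects described above.

```python
def estimate_downtime_cost(hours_down: float, cost_per_hour: float = 100_000.0) -> float:
    """Return a rough downtime cost: outage duration multiplied by an hourly rate.

    The default rate is the average quoted in the article; replace it with
    your own organization's figure. A flat hourly rate is a simplification.
    """
    if hours_down < 0 or cost_per_hour < 0:
        raise ValueError("hours_down and cost_per_hour must be non-negative")
    return hours_down * cost_per_hour


if __name__ == "__main__":
    # Print the estimate for a few illustrative outage lengths.
    for hours in (1, 4, 24):
        print(f"{hours:>2} h of downtime: ${estimate_downtime_cost(hours):,.0f}")
```

At the quoted average, even a four-hour outage would come to roughly $400,000 before any longer-term effects are counted.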

Watch Our Webinar: Beyond Unbreakable? Delivering High Data Center Availability, to learn more about how to mitigate unpredictable data center failures and deliver true high availability.

The Need for Robust Data Protection Strategies

To mitigate the risks associated with service disruptions, organizations must prioritize data availability and resiliency. Here are key strategies to consider:

  • Hybrid Cloud and Multicloud Strategies: Adopting a hybrid or multicloud approach can significantly enhance resiliency and availability. By distributing workloads across multiple cloud platforms and on-premises infrastructure, organizations can reduce their reliance on any single environment. This diversification helps mitigate the impact of outages and ensures business continuity.
  • Disaster Recovery Planning: A comprehensive disaster recovery plan outlines the steps to be taken in the event of a service disruption. Rapid recovery is paramount to minimizing business impact. Detailed recovery procedures, including data restoration and system reboot, should be meticulously outlined and tested regularly.
  • Data Replication and Backup: Implementing robust data replication and backup procedures is essential for ensuring data accessibility in the event of an outage. Multiple copies of data should be stored in geographically dispersed locations to minimize the risk of data loss.
  • Cloud Service Provider Evaluation: Organizations should carefully evaluate the reliability and performance of their cloud service providers. It is essential to choose providers with a strong track record of uptime and disaster recovery capabilities.
  • Data Loss Prevention (DLP): Implementing DLP solutions can help protect sensitive data from unauthorized access, loss, or corruption. These solutions can also assist in data recovery efforts.
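The replication guidance above can be turned into a simple, automatable check. The Python sketch below flags datasets whose verified backup copies span too few geographic regions. The `BackupCopy` record, the region labels, and the two-region minimum are hypothetical and chosen only for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    dataset: str    # name of the protected dataset
    region: str     # hypothetical region label, e.g. "eu-central"
    verified: bool  # True if the last restore test of this copy succeeded


def under_protected(copies, min_regions=2):
    """Return the datasets whose verified copies span fewer than min_regions regions."""
    regions_by_dataset = {}
    for copy in copies:
        regions = regions_by_dataset.setdefault(copy.dataset, set())
        if copy.verified:  # unverified copies do not count toward dispersion
            regions.add(copy.region)
    return sorted(
        name for name, regions in regions_by_dataset.items()
        if len(regions) < min_regions
    )


if __name__ == "__main__":
    inventory = [
        BackupCopy("trading-db", "eu-central", True),
        BackupCopy("trading-db", "eu-west", True),
        BackupCopy("ledger", "eu-central", True),
        BackupCopy("ledger", "eu-west", False),  # restore test failed
    ]
    print(under_protected(inventory))
```

Counting only copies that have passed a restore test reflects the point made in the disaster recovery bullet: a backup that has never been restored successfully offers little assurance.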

Building a Resilient Data Infrastructure

While the CrowdStrike outage was a significant event, it also presents an opportunity for organizations to strengthen their data protection and recovery capabilities. By investing in robust data management strategies and building a resilient infrastructure, businesses can better withstand future disruptions and minimize the impact on operations.

AI can significantly enhance infrastructure resilience. By analyzing vast datasets, AI can predict failures, optimize resource allocation, and detect anomalies. In the case of CrowdStrike, AI could have potentially identified patterns indicating a software issue before it caused widespread disruption.
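To make the idea of anomaly detection concrete, here is a minimal Python sketch that flags a metric reading when it deviates sharply from its recent history. It uses a plain rolling z-score, a simple statistical baseline rather than a trained AI model, and the metric (crash reports per minute during an update rollout), the window size, and the threshold are all assumptions for illustration. It is not a description of how CrowdStrike or any vendor monitors releases.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices whose value deviates from the preceding `window` samples
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical crash reports per minute while an update rolls out.
    crash_reports = [2, 3, 2, 3, 2, 3, 2, 40, 3, 2]
    print(find_anomalies(crash_reports))  # prints [7]
```

In practice, a signal like this would be most useful if it gated a staged rollout, pausing distribution when the first group of systems shows an abnormal reading.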

It is important to note that data availability and resiliency are ongoing processes. Regular testing and updates to disaster recovery plans are essential to ensure their effectiveness. Additionally, organizations should stay informed about emerging threats and vulnerabilities to proactively address potential risks.

The CrowdStrike outage serves as a powerful reminder of the critical role that data plays in modern business operations. By prioritizing data availability and resiliency, organizations can build a stronger foundation for future success.

DZ BANK: Cloud-like Economics and High Availability

DZ BANK, a leading financial institution in Germany, faced the challenge of managing and scaling mission-critical data storage while optimizing costs. The bank required a solution that could handle the demands of its high-performance trading applications while providing cloud-like flexibility and cost efficiency.

To address these challenges, DZ BANK is transitioning to a hybrid cloud strategy, moving workloads dynamically as needed between on-premises infrastructure and the cloud. As part of its optimization strategy, the bank consolidated storage systems, standardized its data infrastructure, simplified the architecture, and partnered with Hitachi Vantara to ensure the highest availability and performance for its mission-critical trading applications. The result: cloud-like economics with high availability.

A Continuous Journey, Not a Destination

Building a resilient data infrastructure is an ongoing process, not a one-time achievement. It requires a holistic approach that blends technology, strategy, and human expertise. By leveraging AI and hybrid cloud infrastructure, organizations can proactively defend against evolving threats and protect their valuable data assets. True resilience lies in the constant pursuit of improvement, adaptation, and vigilance, recognizing that the threat landscape is always changing.

With a continued focus on the need for business continuity across industries, it's essential for business leaders to consider a multi-layered approach to data reliability, availability and resiliency, including on-premises, cloud, and hybrid solutions, along with robust disaster recovery planning.

Explore how you can achieve a flexible hybrid cloud ecosystem with high availability that meets your needs now – and in the future.

Octavian Tanase

Octavian is a University of California, Berkeley graduate and resides in the San Francisco Bay Area.