A programmatic approach to optimising IT resilience

The need for IT resilience is growing, so more infrastructure and operations (I&O) leaders are building resilience initiatives into their workflows. The result is a business ecosystem that can withstand and recover from disturbances such as natural disasters, cyberattacks, insider threats, and downtime.

We take a deep dive into the programmatic approach to resiliency below. With better resilience planning, you can improve preparedness in a multi-stakeholder environment and get more value from resilient infrastructure. Much of the information in this article is taken from the recent Gartner report, “7 Tips for Improving Reliability, Tolerability, and Disaster Recovery.” However, we also explain the benefits of working with a technology consultancy like Adservio when choosing the best solutions for building resilience in your business.

What is IT resiliency and its challenges?

Gartner notes that “resilience” means different things to different I&O teams. With no universal definition, it’s difficult for team leaders to initiate resilience improvement efforts.

Many teams also don’t know where to begin with a programmatic approach for optimizing IT resiliency. That can result in bottlenecks during the resilience lifecycle. For example, leaders often see resilience as solely an IT problem that doesn’t concern DevOps, development processes, sustainability, risk management, risk assessment, software supply chain management, recovery planning, variability, and other workflows.

Most teams also don’t know how much resilience is "enough." That can impact a company’s roadmap to success and how that roadmap provides value to various stakeholders in the enterprise.

7 tips for integrating the programmatic approach to resilience in your organization

To overcome the challenges above, you must take a programmatic approach to IT resilience. Gartner has compiled seven tips for implementing resilience requirements in its report, which we have summarized below.

1. Get support

The first step to resilience success is gaining support for IT resilience from different departments. There are two approaches to doing this. The first is the Carrot Approach, which purposely avoids leading with risk. Instead, I&O leaders focus on “quick wins” and help their organizations grow by enhancing quality, productivity, and effectiveness. There’s also a focus on iterative improvement and on building buy-in and credibility through co-created risk narratives, which include business value narratives (BVNs).

The second approach is the Stick Approach. That's when I&O leaders work with audit teams to compel support for IT teams that want to enhance resilience. Gartner notes that this approach is more short-lived than the Carrot Approach and involves plenty of box-ticking exercises, which slow down resilience execution.

2. Integrate adaptable production readiness into toolchains

Adaptable production readiness is the process of deploying systems and apps in production environments. Improvements in this area can provide positive results across various IT resilience elements. That leads to what Gartner describes as a “significant payoff for most organizations.”

Also, developing a high-level IT resilience taxonomy results in more tolerable, reliable, and recoverable processes. For example, degree systems can remain secure and performant and optimize service-level objectives (SLOs). Degree systems can also help you manage IT hazards in pipelines. That leads to more successful resilience actions and outputs.
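To make the SLO idea concrete, here is a minimal Python sketch of an error-budget calculation, one common way to express how much unreliability an SLO tolerates. The 99.9% target, the 30-day window, and the function names are illustrative assumptions on our part, not figures or methods from the Gartner report.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO allows within the window.

    slo_target is a fraction such as 0.999 (an assumed example target).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent.

    A negative value means the SLO has been breached for this window.
    """
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget
```

With these assumed numbers, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime. A readiness gate in a toolchain could, for example, hold back risky releases once `budget_remaining` drops below a threshold your teams agree on.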

3. Get past “disaster recovery theater”

Gartner describes disaster recovery theater as:

“Teams going through the motions in creating plans and play-acting through exercises but not truly being prepared for even basic scenarios.”

Those basic scenarios include long-term planning for resilience and choosing the right management systems for this process. That’s something we can certainly help with. For instance, our technical consultants will recommend the best resilience solutions based on your use case and mandates. That results in better multi-stakeholder engagement and a more successful resilience strategy.

4. Cybersecurity resilience

Cybercrime is still growing. Experts predicted that its cost would hit $8 trillion in 2023 and projected $10.5 trillion by 2025. So cybersecurity resilience preparedness is a must for organizations like yours. That involves using systems that can identify, withstand, and recover from cyberattacks and other critical events.

Gartner says:

“[Cybersecurity resilience preparedness] is more than the security team’s problem and requires additional levels of thoughtfulness regarding recovery. All data centers, systems, and data may be ‘up’ but not accessible, so there’s nothing to failover to. And backup systems used to restore may themselves be compromised.”

5. Performance monitoring and topology mapping

Gartner also says the importance of basic monitoring and enhancing tolerability should be obvious for I&O leaders. Faster detection of performance issues also leads to quicker responses and a lower mean time to repair (MTTR).
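As a rough illustration, MTTR can be computed from incident records as the average time between detection and resolution. The sketch below assumes each incident is a pair of timestamps and measures from detection; some teams measure from the start of the failure instead, so treat the definition and the function name as assumptions.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average repair time over a list of (detected_at, resolved_at) pairs.

    Assumption: repair time is measured from detection to resolution.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    # Start the sum from an empty timedelta so durations add up correctly.
    total = sum((resolved - detected for detected, resolved in incidents),
                timedelta())
    return total / len(incidents)


# Hypothetical incident log for illustration only.
incidents = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 30)),    # 30 minutes
    (datetime(2024, 1, 18, 14, 0), datetime(2024, 1, 18, 15, 30)),  # 90 minutes
]
mttr = mean_time_to_repair(incidents)
```

For the two hypothetical incidents above, the result is a `timedelta` of one hour. Tracking this value over time shows whether faster detection is actually shortening repairs.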

Leveraging topology mapping capabilities in your existing monitoring tools can help. That lets you build up tactical monitoring workflows in your organization and detect, investigate, analyze, and respond to IT hazards faster. If you don’t have appropriate monitoring tools or need to explore additional capabilities, we can help. Just reach out to learn more about working with one of our technology consultants.
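One thing a topology map makes possible is answering "what breaks if this component fails?" The Python sketch below assumes the map is available as a simple dictionary of service dependencies; the service names and the `impacted_services` helper are hypothetical, and real monitoring tools expose topology in their own formats.

```python
from collections import defaultdict, deque


def impacted_services(depends_on, failed):
    """Return every service that directly or transitively depends on `failed`.

    depends_on maps each service to the list of services it calls.
    """
    # Invert the edges: for each component, which services depend on it?
    dependents = defaultdict(set)
    for service, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(service)

    # Breadth-first walk upward from the failed component.
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for svc in dependents[current]:
            if svc not in seen:
                seen.add(svc)
                queue.append(svc)
    return seen - {failed}


# Hypothetical topology for illustration only.
TOPOLOGY = {
    "web": ["api"],
    "api": ["db", "cache"],
    "reports": ["db"],
    "db": [],
    "cache": [],
}
```

In this made-up topology, a database failure affects the API, the web front end, and reporting, while a cache failure affects only the API and the web front end. Knowing that up front helps responders prioritize during an incident.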

6. Address SaaS

If you rely on SaaS products, a provider's sudden bankruptcy or technical failure could shut down your operations. The problem is that the SaaS providers you work with don’t always provide recovery assurances. Unfortunately, that can result in inconsistent messaging and development plans if your department is responsible for ensuring these measures.

Gartner recommends several solutions to address these challenges. These solutions include determining minimum viable business processes (MVBPs) and aligning availability, recovery point objective (RPO), recovery time objective (RTO), and tolerable workarounds.
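A simple way to check that alignment is to compare what the business can tolerate with what your backup or export schedule and restore tests actually deliver. The Python sketch below is a minimal example under stated assumptions: the class, the function, and all the durations are hypothetical, and worst-case data loss is approximated as one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryTargets:
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable time to restore service


def recovery_gaps(targets, backup_interval, measured_restore_time):
    """List the objectives that current capabilities fail to meet."""
    gaps = []
    # Worst case, a failure hits just before the next backup,
    # so up to one full interval of data is lost.
    if backup_interval > targets.rpo:
        gaps.append(f"RPO gap: backups every {backup_interval}, target {targets.rpo}")
    if measured_restore_time > targets.rto:
        gaps.append(f"RTO gap: restore takes {measured_restore_time}, target {targets.rto}")
    return gaps


# Hypothetical targets and measurements for illustration only.
targets = RecoveryTargets(rpo=timedelta(hours=1), rto=timedelta(hours=4))
gaps = recovery_gaps(targets,
                     backup_interval=timedelta(hours=24),
                     measured_restore_time=timedelta(hours=2))
```

Here a daily export cannot satisfy a one-hour RPO, so the check reports a single RPO gap, while the two-hour restore stays within the four-hour RTO. Any gap is a prompt to negotiate with the provider, add your own exports, or agree on a tolerable workaround.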

7. Use the right resilience metrics

Determining and implementing various metrics for resilience can enhance transparency and accountability in different areas. These areas include active IT hazard recovery, resilience prioritization, resilience risk assessment, and decision-making.

You can also generate valuable insights about telecommunications, sustainable development, information systems, disaster risk, risk reduction, and mitigation with an effective program incorporating resilience metrics. It doesn’t matter if you work in the public or private sector. The right metrics ultimately improve all facets of resilience, including community resilience, resilience enhancement, and resource allocation.

How we can help

IT resilience is critical. It’s that simple. You can improve the end-user experience and ensure systems withstand and recover from disturbances just by integrating resiliency workflows in your organization. Follow Gartner’s seven tips above to successfully incorporate a programmatic approach to resilience.

You’ll need the right tools and technologies to successfully execute these tips, including an observability platform for system monitoring. That’s where Adservio comes in. Our experience and expertise will help you find the right tools and vendors for resilience, improving the overall well-being of your business.

Looking for the best resilience platforms? Contact us right now to learn more about our technology consulting solutions.

Published on
February 28, 2024
