How to Mitigate Risk in Hybrid IT – and Why We Must

At Gartner’s recent Infrastructure, Operations and Cloud Strategies Summit, there were 125 sessions on cloud capability, the edge, improvements in network capability and management, and deeply technical sessions on artificial intelligence and quantum computing. Amidst all of this, analyst Jerry Roseman opened a session entitled Tips to Enhance Your IT Disaster Recovery Program by asking “the real question:”

Are you really able to recover – not – “do you have an environment?”

Gartner’s 2019 Security and Risk Management Survey tells us that:

  • 76% of respondents reported at least one incident requiring the activation of a DR plan
  • 50% of respondents had two or more such incidents
  • 33% had serious issues recovering
  • Only 12% recovered with the right expectations.

As infrastructure consultants, we appreciate the way in which Jerry grounded this session in timeless themes updated for today’s cloud capability. His main points (with some of our notes):

Business alignment to IT DR is a challenge

  • Analyze business functions: the business impact analysis is essential
  • Plan for loss categories: internal infrastructure, regional issues, data corruption, security issues
  • Standardize your criticality tiers, and share them with the business
  • Standardize your recovery strategies

GTSG notes: one of our favorite engagements, a number of years back, was for a large brokerage firm that asked us to validate that the Recovery Time Objectives (RTOs) for its mission-critical applications were supported by the RTOs of the applications, databases and services on which they depend. GTSG performed the analysis and identified

  • 10 major applications with 1-Hour RTOs, and
  • 3 major applications with 4-Hour RTOs

that were necessary for 5-Minute RTO Business Activities. A business activity cannot recover faster than the slowest component it depends on, so those 5-minute objectives were not achievable as documented.
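The kind of check described above can be sketched in a few lines of Python. This is a minimal illustration, not the method GTSG used: the `Component` class, the `find_rto_gaps` function, and the inventory entries are all hypothetical names and values. It assumes each component has a single RTO in minutes and a list of the components it depends on. It walks the dependency chain and reports anything with a longer RTO than the business activity at the top.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A business activity, application, database, or service with a target RTO."""
    name: str
    rto_minutes: int
    depends_on: list[str] = field(default_factory=list)


def find_rto_gaps(components: dict[str, Component], root: str) -> list[tuple[str, int]]:
    """Return (name, rto_minutes) for every direct or transitive dependency
    of `root` whose RTO is longer than the RTO of `root` itself."""
    target = components[root].rto_minutes
    gaps = []
    seen = set()
    stack = list(components[root].depends_on)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already checked; also guards against circular dependencies
        seen.add(name)
        dep = components[name]
        if dep.rto_minutes > target:
            gaps.append((name, dep.rto_minutes))
        stack.extend(dep.depends_on)
    return sorted(gaps)


# Hypothetical inventory for illustration only; not data from the engagement.
inventory = {
    "trade-execution": Component("trade-execution", 5, ["order-router", "market-data"]),
    "order-router": Component("order-router", 60, ["positions-db"]),
    "market-data": Component("market-data", 5),
    "positions-db": Component("positions-db", 240),
}

gaps = find_rto_gaps(inventory, "trade-execution")
for name, rto in gaps:
    print(f"{name}: RTO {rto} min exceeds the 5 min target")
```

The sketch only compares each dependency against the top-level target. It does not model recovery sequencing: if dependencies must be restored one after another, the effective recovery time is longer still, so a clean result here is necessary but not sufficient.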

Missing details limit successful DR execution

  • Ensure understanding of roles and responsibilities (a RACI or similar construct)
  • Plan for unavailability of key members
  • Enhance access control
  • Detail procedures

GTSG notes: another engagement involved DR plans rendered obsolete by extensive data center consolidation. Remediation required updating the Application Recovery Designs, then writing detailed Application Recovery Procedures. Only then could this board-level audit exposure be closed.

Limited exercising masks executive insight

We must exercise, not simply test: vary the scope and depth of each exercise, then update processes and review the lessons learned.

Finally, Jerry reminds us that this is a multi-year effort with a moving target, driven by changes in both workloads and the available recovery technology.

Planning – and exercising – for effective recovery is a passion at GTSG, one for which Gartner has recognized us. Please reach out to talk further. We look forward to it.