What Is It, Why Do You Need It, and More.

Break free from lockdowns with an effective disaster recovery plan!

The ensuing lockdowns following the COVID-19 pandemic have highlighted the importance of business continuity and disaster recovery (BCDR) initiatives. If a proper BCDR strategy is absent, companies take too long to get back into action. Worse, some even struggle to recover at all. 

BCDR is a very broad topic and I want to go a little deeper in this post, so I’m going to zoom in on one major component of BCDR: disaster recovery (DR). I’ll cover the salient points of effective disaster recovery initiatives, from the elements of disaster recovery to the types of DR. I’ll also give some tips you can use when creating a disaster recovery plan.

At the heart of every disaster recovery initiative is a disaster recovery plan, so let’s define what that is first. 

What Is a Disaster Recovery Plan?

A disaster recovery (DR) plan is a documented plan that details what a company must do to recover from a disaster and resume operations. It involves, among other things:

  • Designating members of the disaster recovery team
  • Performing risk analysis
  • Identifying business-critical assets
  • Outlining the backup plan
  • Testing and optimizing the DR plan

We’ll go over the details of these items later. First, let’s get a common source of confusion out of the way. 

The Main Difference Between Business Continuity and Disaster Recovery

People often confuse business continuity with disaster recovery. While the two are closely intertwined, they’re not the same. So, what is business continuity, and what is disaster recovery?

Business continuity is concerned with the overall capability of a company to continue doing business after and even during a disaster or minor disruption. The main goal of business continuity is to keep the business running as close to normal operations as possible at all times. That means it may include efforts to prevent business operations from suffering any amount of downtime in the first place.

Disaster recovery, on the other hand, is one part (albeit a major one) of business continuity. It’s more focused on the possibility of the business suffering downtime and how to quickly recover from it.

So, what are the benefits of implementing disaster recovery?

4 Major Benefits of Effective Disaster Recovery

Anyone could just perform regular data backups and call it disaster recovery. Unfortunately, backups are just one component of an effective disaster recovery initiative. In the next section, I’ll discuss several other elements. In the meantime, let’s discuss the 4 major benefits of striving for effective disaster recovery.

1. Reduce Downtimes

Downtimes, unplanned ones to be exact, refer to when some or all of your business processes are unavailable due to unforeseen events. Whenever this happens, you’ll lose productivity, revenue, and—in severe cases—customer satisfaction. Once you can bring down the frequency and length of downtimes, you’ll unlock other benefits. 

2. Limit Potential Losses

Reduced downtime, which as we mentioned is an offshoot of effective disaster recovery, can limit potential losses. Instead of losing a week’s worth of revenue and business opportunities, you might incur losses for just a day or a few hours. That’s a huge difference from a financial standpoint, and it can save you a lot of money.

3. Avoid Reputational Damage

The longer an outage lasts, the more it will impact your business processes. It’ll also impact the products/services of customers depending on these processes. Any extended outage can force customers to look for other options. Worse, if word gets out that you’re unable to recover quickly, that can turn off potential customers as well. Thus, if you can bring down outages through effective disaster recovery, you can avoid reputational damage.

4. Avert Lawsuits

Many companies that incur losses due to your outages won’t simply absorb those losses. They’ll take legal action instead. That’s why, if you can limit or prevent lengthy outages, you’ll avoid messy and costly court battles.

Now that you’re in a better position to answer the question “What is disaster recovery?”, it’s time for you to be familiar with the key elements of DR. 

5 Key Elements of Disaster Recovery

Disaster recovery initiatives involve several elements, but I consider these, arranged in no particular order, the top 5:

1. Disaster Recovery Team

Your disaster recovery team (DRT) is the one responsible for putting together, testing, and updating your disaster recovery plan. The team will also execute it if the need arises. As much as possible, assign members representing various departments to your DR team. For example, you may let your CFO lead the group and assign representatives from your IT, legal, accounting, and communications departments. 

This is to ensure the team can take into account all possible risks surrounding your company. It also helps the team develop a comprehensive plan that benefits everyone. Give each member a clearly defined role and involve them in risk assessment (see next subsection) before developing the DR plan.

2. Risk Assessment

Every company is unique in terms of the risks it’s exposed to and the impact of those risks on its assets and operations. A risk assessment, and a corresponding business impact analysis that comes after an assessment, will help you identify both. It’ll also help you prioritize as you devise your plan and keep it within budget. For instance, you shouldn’t prioritize provisions for flooding if you’re in an area that hasn’t experienced a serious flood in the last 100 years. 

Involve all members of your DR team in risk assessment activities and make it an ongoing practice. This will enable you to identify current and emerging risks across all sectors of your company. It also ensures your DR plan is up-to-date.

Flooded office? I hope you have a disaster recovery plan in place

3. Inventory of Business-Critical Assets

One important byproduct of a risk assessment and business impact analysis is a list of your business-critical assets. You can determine which applications, parts of IT infrastructure, data, etc. you’ll need to recover as soon as possible.

Come up with an inventory of your business-critical assets and make them the top priority in your disaster recovery plan. Wherever appropriate, identify each asset’s recovery time objective (RTO) and recovery point objective (RPO). See FAQ for details. 

4. Backup Plan

Backups are the number one prerequisite of the recovery phase of every DR initiative. You can’t recover if you have nothing to recover from. Normally, backups apply to data. That said, you can also back up virtual machines and virtual desktops if, for example, you use virtualization in your IT infrastructure. To make backups as efficient and effective as possible, you should lay out a backup plan.

A backup plan would typically involve provisions for:

  • What you need to back up (e.g., you should give priority to your business-critical data and virtual machines)
  • Where you should back them up (e.g., tape, offsite backup, cloud backups)
  • What RTOs and RPOs are associated with those backups
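
To make those three provisions concrete, here is a minimal sketch of a backup plan captured as structured data. The asset names, destinations, and RTO/RPO values are hypothetical placeholders, and the ordering rule (business-critical first, then tightest RTO) is just one reasonable assumption.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BackupPlanEntry:
    asset: str                       # what to back up
    destination: str                 # where to back it up (tape, offsite, cloud, ...)
    rto: timedelta                   # recovery time objective
    rpo: timedelta                   # recovery point objective
    business_critical: bool = False  # flagged during the asset inventory


def prioritize(entries):
    """Order entries: business-critical assets first, then the tightest RTO."""
    return sorted(entries, key=lambda e: (not e.business_critical, e.rto))


# Hypothetical example entries.
plan = [
    BackupPlanEntry("file-share", "tape", timedelta(hours=48), timedelta(hours=24)),
    BackupPlanEntry("orders-db", "cloud", timedelta(hours=1), timedelta(minutes=30), True),
    BackupPlanEntry("erp-vm", "offsite", timedelta(hours=4), timedelta(hours=12), True),
]

for entry in prioritize(plan):
    print(entry.asset, entry.destination, entry.rto, entry.rpo)
```

Keeping the plan in a structured form like this makes it easier to review during testing and to update when assets change.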

5. Disaster Recovery Testing and Optimization

You can’t determine the effectiveness of your disaster recovery plan unless you put it to the test. Testing can help you identify flaws in your plan and in its implementation. For example, you might expect to recover from an offsite backup. After testing, however, you might discover that certain databases aren’t being backed up to that offsite system.

Testing can also help you identify provisions in your plan that don’t apply any more and need updating. Risks evolve, so your DR plan should also change accordingly.  

Now that you know the most important elements of disaster recovery, let’s see the products and services that can help you implement DR. 

6 Types of Disaster Recovery Solutions

In this section, I’ll discuss the 6 types of disaster recovery solutions you’ll find in the market. 

Please note that these aren’t complete DR solutions that single-handedly address your entire DR needs. Rather, these are point solutions that address certain challenges or needs in disaster recovery. 

1. Backup Solutions

As you might have guessed, these are solutions that enable you to perform backups for your data and/or virtualized applications and desktops. In general, backups are divided into 3 types:

  1. Full backups – Make copies of all your data and store them in a separate storage device
  2. Incremental backups – Copy only the portions that have changed since the last backup of any type
  3. Differential backups – Copy the portions that have changed since the last full backup

In a perfect world, you’d want to perform full backups all the time. Unfortunately, we don’t live in a perfect world. Due to the amount of data you have to back up, full backups consume a lot of network bandwidth. If you need to perform backups during office hours, choose incremental or differential backups. 
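
The difference between the three types is easiest to see with a small example. The sketch below is only an illustration, not how any particular backup product works: it models each file by the day it was last modified and picks what each backup type would copy. The file names and day numbers are made up.

```python
def select_for_backup(files, backup_type, last_full, last_backup):
    """Return the names of files a given backup type would copy.

    files: dict mapping file name -> time of last modification
    last_full: time of the last full backup
    last_backup: time of the most recent backup of any type
    """
    if backup_type == "full":
        return sorted(files)          # everything, regardless of changes
    if backup_type == "incremental":
        cutoff = last_backup          # changes since the last backup of any type
    elif backup_type == "differential":
        cutoff = last_full            # changes since the last full backup
    else:
        raise ValueError(f"unknown backup type: {backup_type}")
    return sorted(name for name, mtime in files.items() if mtime > cutoff)


# Hypothetical data: modification times expressed as day numbers.
files = {"a.doc": 1, "b.xls": 3, "c.db": 5}
last_full = 2      # full backup ran on day 2
last_backup = 4    # an incremental backup ran on day 4

print(select_for_backup(files, "full", last_full, last_backup))          # ['a.doc', 'b.xls', 'c.db']
print(select_for_backup(files, "incremental", last_full, last_backup))   # ['c.db']
print(select_for_backup(files, "differential", last_full, last_backup))  # ['b.xls', 'c.db']
```

Notice that the incremental backup copies the least, which is why it suits office hours, while the differential backup grows until the next full backup resets it.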

2. Cold Site

If you can afford it, it’s best to have a dedicated disaster recovery site to house your alternate IT infrastructure. That’s in case a disaster incapacitates your main site. This DR site should be offsite, i.e., in a separate geographical location. The most basic DR site is called a cold site. It provides power and physical space to house your backup IT infrastructure. That said, it doesn’t include the IT infrastructure itself.

You still need to set up your IT equipment, servers, operating systems, applications, data, etc. Setting up an IT environment requires a great deal of time, so you shouldn’t use a cold site for mission-critical assets. 

A cold site for disaster recovery.

3. Hot Site

Unlike a cold site, a hot site has all the necessary equipment to support your business operations. Well, at least, it includes the necessary equipment on the IT side of things. Your employees can simply walk into your hot site, and they’ll be ready to continue business as usual right away.

Of course, a hot site is much more expensive than a cold site. Unless you have a really big budget, you should only use a hot site for mission-critical assets.

4. Disaster Recovery as a Service (DRaaS)

Disaster Recovery as a Service (DRaaS) is a third-party cloud disaster recovery service. It enables you to replicate certain portions of your IT infrastructure (data, virtual machines, applications, desktops, etc.) in the DRaaS provider’s cloud infrastructure.

One thing you should remember about DRaaS is that, if the DRaaS provider’s data center is located far from where you operate, you’ll experience latency. This means your applications will be slow to respond. Choosing a DRaaS provider that’s in the same location as your main site isn’t good either, because a disaster that takes down your operations could take down the provider’s site too. If you do decide to go with a DRaaS solution, be sure to test for latency.
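
As a rough starting point for that latency test, the sketch below times TCP connections to an endpoint using only Python’s standard library. TCP connect time is only a proxy for how responsive your applications will feel, and the host and port you pass in would be whatever test endpoint your DRaaS provider gives you; the function name is my own.

```python
import socket
import statistics
import time


def measure_connect_latency(host, port, samples=5, timeout=3.0):
    """Return the median TCP connect time to host:port, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open and immediately close a connection; only the handshake is timed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


# Example call (hypothetical endpoint, replace with your provider's):
# print(measure_connect_latency("dr.example.com", 443))
```

Run it from the locations where your users actually work, at different times of day, and compare the results against what your applications can tolerate.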

5. Backup as a Service

Backup as a Service (BaaS) is also a cloud-based service with almost the same functions as cloud disaster recovery or DRaaS. Some BaaS and DRaaS offerings even have very similar features, like protecting your business-critical data and minimizing the risk of data loss.

In general, BaaS providers commit to less stringent service level agreements (SLAs) compared to DRaaS providers. Thus, you’d normally pick BaaS if your business continuity requirements aren’t very high. For instance, you may use BaaS for non-mission-critical workloads. 

6. Virtualization

Virtualization can play a significant role in improving disaster recovery capabilities. For example, if you virtualize your applications, desktops, and data, and deliver them through VDI, you can simply back up your entire VDI environment. When a disaster strikes, you can then have your users connect to that backup VDI environment from their devices.

You’ll get 2 major advantages here: 

  1. Virtualized environments are much cheaper to deploy, back up, and maintain than their physical counterparts. You only deal with software and files, not replicas of your physical hardware.
  2. Backups of virtual applications and desktops are better than just data backups. If you only back up data, you’ll still need to install applications and operating systems to get back into the action. Virtual applications and desktops delivered via VDI don’t require any installation.

The one downside? Virtualization is a completely different beast. It requires people with the appropriate skillset. If you don’t have in-house talent who can handle your virtualization environment, this option might not be for you.  

This list of solutions should be enough to get you started. Speaking of getting started, let me now give you an overview of the steps needed to develop and maintain a disaster recovery plan.

9 Tips to Create and Maintain a Disaster Recovery Plan

Your disaster recovery plan is the heart and soul of your DR program. It provides a blueprint and documented guidance for all members of your company to follow. In the succeeding sections, I’ll provide tips on how to create and maintain your disaster recovery plan. 

1. Identify Threats

To start with, you should identify the threats that are likely to impact your business. Include all sorts of threats, e.g., natural calamities (earthquakes, tsunamis, hurricanes, floods, wildfires, etc.) or cyber threats (DDoS, ransomware, data breaches, etc.). Once you have a list, rate each threat’s likelihood of occurring and its corresponding business impact, each on a scale of 1 to 5.

You should then plot their values in a risk matrix similar to the one shown below. Each threat’s corresponding value on the chart is its risk rating. This is one way of ranking threats. The higher a threat’s risk rating, the higher its priority should be as you factor in threats when formulating your disaster recovery plan.

A matrix showing risk ratings. The x-axis represents likelihood, while the y-axis represents impact. Both axes run from 1 to 5. Each cell’s value is calculated by multiplying its likelihood value by its impact value. The tiles are colored green, yellow, orange, and red, with green denoting the least risky values and red denoting the most risky values.
Calculate your threats’ risk ratings using this chart.
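
If you’d rather compute the ratings than read them off a chart, the same likelihood-times-impact calculation can be sketched in a few lines. The threats and scores below are made-up examples; plug in your own.

```python
def risk_rating(likelihood, impact):
    """Risk rating = likelihood x impact, each scored on a scale of 1 to 5."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must each be between 1 and 5")
    return likelihood * impact


# Hypothetical threats: name -> (likelihood, impact)
threats = {
    "ransomware": (4, 5),
    "flood": (1, 4),
    "power outage": (3, 3),
}

# Highest risk rating first, so the top of the list gets top priority.
ranked = sorted(threats, key=lambda name: risk_rating(*threats[name]), reverse=True)

for name in ranked:
    print(name, risk_rating(*threats[name]))
# ransomware 20
# power outage 9
# flood 4
```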

2. Outline Your Emergency Response Protocol

Members of your company, especially your Emergency Response Team (ERT), should know the specific triggers to activate your DR plan. More importantly, they should know what to do once those triggers are met. Possible triggers may include complete loss of communications or power, flooding inside your building, etc. Provide all employees with the contact numbers of your ERT. That way they’ll know who to call if they’re the first to notice a brewing emergency.

Once those triggers are met, your ERT must notify other members of your company, contact emergency services, and determine which portions of your DR plan to put into play. If your ERT is separate from your DR team, your ERT must contact your DR team. They can then proceed with disaster recovery activities. You must specify all these in your DR plan.

3. Identify Members of Your Recovery Team

This is different from the DR team we talked about earlier. Your recovery teams will be responsible for performing recoveries in various areas of your IT infrastructure. This requires different technical skill sets. For example, you need a team for network operations recovery, another for server recovery, yet another for application recovery, and so on. 

You should include these individuals and their contact numbers in your DR plan document. Remember, though, your company will undergo personnel changes now and then. That’s why, when you update your DR plan, you should update recovery team memberships as well.

4. Identify Members of Your Media Relations Team

Your media relations team will be in charge of developing guidelines for crafting appropriate messaging. You’ll use these messages for public disclosures during or after a disaster. If you already have a communications department, you should pick members from there. Ensure only members of this team have the authority to issue statements on electronic (TV, Web, radio) and print media. Again, include a list of these individuals and their contact information in your DR plan. 

5. Incorporate Insurance-Related Information

A key part of disaster preparedness is ensuring you have sufficient insurance coverage for the potential costs of recovering from a disaster. Your company will likely have invested in various liability insurance policies. These policies could be errors & omissions (E&O), directors & officers (D&O), and general liability, among others.  

Include relevant information about these policies in your DR plan. Have a section where you list policy names and their corresponding coverage types, coverage periods, amount of coverage, etc. Specify the contact person for any insurance-related issues. 

6. Incorporate Financial and Legal Issues

Your DR plan should consider financial and legal issues that may stem from a disaster. Depending on the extent of the disaster, you’d likely face loss of revenue, loss of cash, or theft of valuable items/equipment. You may also face loss of financial documents, a tight cash flow, legal actions or claims, etc.

Your disaster recovery team should be able to provide an initial assessment of the financial impact as well as possible legal obligations. It should also be able to address initial needs relevant to these two sets of issues. That’s the reason why you should include people from finance and legal in your DRT. 

7. Produce and Distribute Documentation

You must produce electronic and hard copies of your disaster recovery plan. More importantly, you should distribute these copies to concerned parties. These could be members of your emergency response team, disaster recovery team, and senior management. 

You should also instruct them all to store copies of these documents onsite and in their respective homes. Regardless of how many copies you produce, you should also store a master copy in a specific location.

8. Test Your Disaster Recovery Plan 

When you run tests on your DR plan, check to see if the results align with your RTOs and RPOs. Your business processes and various elements of those processes should be able to recover within their prescribed RTOs. 

As for RPOs, you can check relevant logs to verify that data is being backed up within the prescribed RPOs. See, for example, if data is being transmitted to a designated backup system or site within RPO. 
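
Here’s a minimal sketch of both checks, assuming you can pull backup completion times and outage/restore times out of your logs. The timestamps and objectives below are invented for illustration, and the function names are my own.

```python
from datetime import datetime, timedelta


def rpo_violations(backup_times, rpo):
    """Return (previous, next) pairs of consecutive backups spaced further apart than the RPO."""
    ordered = sorted(backup_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > rpo]


def meets_rto(outage_start, service_restored, rto):
    """True if the recovery measured in a test finished within the RTO."""
    return service_restored - outage_start <= rto


# Hypothetical backup log for one database with a 30-minute RPO.
backups = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 30),
    datetime(2024, 1, 1, 1, 45),   # 75 minutes after the previous backup
    datetime(2024, 1, 1, 2, 15),
]
for previous, following in rpo_violations(backups, timedelta(minutes=30)):
    print("RPO exceeded between", previous, "and", following)

# Hypothetical DR test: outage simulated at 09:00, service back at 10:20, RTO of 2 hours.
print(meets_rto(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 20), timedelta(hours=2)))  # True
```

Any gap the first function reports is a window in which a disaster would have cost you more data than your RPO allows.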

9. Update Your Disaster Recovery Plan

Consider your DR plan a living document. That means you have to update it from time to time. An outdated document can cause confusion or, worse, lead to costly, business-impacting missteps.

You’re supposed to provide copies to members of your ERT, DRT, and senior management. That’s why you should ensure those individuals have the latest version at all times. 
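
For electronic copies, one simple way to spot stale versions is to compare each copy’s checksum against the master copy’s. The sketch below assumes you can read each copy as bytes; the holders and document contents are placeholders.

```python
import hashlib


def digest(document_bytes):
    """SHA-256 fingerprint of a document's contents."""
    return hashlib.sha256(document_bytes).hexdigest()


def outdated_copies(master, copies):
    """Return the holders whose copy doesn't match the master copy.

    copies: dict mapping holder name -> that holder's copy as bytes
    """
    expected = digest(master)
    return sorted(holder for holder, data in copies.items() if digest(data) != expected)


# Hypothetical contents standing in for real files.
master_copy = b"DR plan, revision 3"
distributed = {
    "ERT lead": b"DR plan, revision 3",
    "CFO": b"DR plan, revision 2",
}
print(outdated_copies(master_copy, distributed))  # ['CFO']
```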

Final Words

In conclusion, effective disaster recovery (DR) requires a good DR team, a well-thought-out plan, the right choice of solutions, and more. You now know how to distinguish disaster recovery from business continuity. You also have a clear idea about the 5 key elements of disaster recovery, some types of disaster recovery solutions, and tips to create a DR plan. 

I hope you gained enough information to jump-start your disaster recovery initiatives. You can then eliminate the risk of suffering extended or permanent downtime after a disaster. If you have more questions, check out the FAQ and Resources sections below.

FAQ

What is RTO?

The recovery time objective, RTO, is the maximum period over which a mission-critical asset can be unavailable due to failure or disruption. For example, these assets could be applications, servers, data, network devices, etc. In your disaster recovery plan, you should aim to make that asset available again within the prescribed RTO.

What is RPO?

The recovery point objective, RPO, normally applies to data, specifically backups of that data. It’s the maximum interval allowed between backups of a piece of data, and therefore the most recent changes you could lose in a disaster. For example, if a record gets updated once a week, an RPO of 12 hours may be acceptable. That said, if a record gets updated every 30 minutes, that 12-hour RPO won’t suffice. A 30-minute RPO or less would be better. In your DR plan, strive to run backups at least as often as your critical data’s RPO requires.

What is Errors & Omissions insurance?

Errors & Omissions (E&O) insurance is a type of liability insurance that protects against claims of inadequate work or negligent actions. Let’s say a client suffers financial losses as a result of a disaster that impacted your business and decides to sue. E&O insurance can help you cover court and settlement costs. 

What is Directors & Officers insurance?

Just like E&O, Directors & Officers (D&O) insurance is a type of liability insurance that protects against allegations of negligent practices and substandard work. That said, it’s tailored to company directors, board members, and senior managers. Should a client suffer financial losses due to a disaster that impacted your business and decide to sue, D&O can help you pay legal fees and compensation claims.

What is the communications department?

You shouldn’t confuse this term with communications technology. The communications department, as used in the context of this article, refers to the department responsible for media and public relations, customer communications and marketing, internal communications, and crisis communications. Aside from other ‘communications’ roles, this department handles coming up with appropriate messaging as well as conveying that message through various media.

Resources

TechGenix: Article on Cloud Disaster Recovery Options

Discover various options for cloud disaster recovery solutions. 

TechGenix: Article on DRaaS Options

Examine these options for DRaaS.

TechGenix: Guide on How to Protect Data Backups

Learn about the risks and what you need to do to protect data backups.

TechGenix: Guide on How to Survive IT Emergencies

Enhance your capabilities for surviving IT emergencies.

TechGenix: Article on Network Security and Its Business Benefits

Take a deep dive into the essential elements of network security.