This article introduces Disaster Recovery (DR) and how to plan for it.
Disaster Recovery (DR) is a critical plan that implements backup services to ensure the continuity of the business services provided by the data center in case of a disaster at the primary data center site.
Disasters range from natural catastrophes such as earthquakes to unplanned system crashes.
Minor problems such as system crashes, single hardware failures, and connectivity disruptions are not considered disasters, because recovering from them is not very costly.
A fire in a data center, by contrast, is considered a disaster.
If the cost of recovery relative to the cost of damages is very high, the issue is considered a disaster.
An example is a fire in a data center that damages 10 physical servers. The cost of purchasing new servers and deploying them in the data center is very high.
On the other hand, losing one router in a data center is not considered a disaster, because a new router is cheap relative to the cost of a recovery solution.
It is largely a matter of cost, and what one company considers a disaster might be a minor loss to another, depending on its finances and the budget provisioned for IT infrastructure.
It also depends on the nature of the service and the business.
Disruption of email service in a company whose business depends mainly on email is considered a disaster.
Such companies cannot accept a long email outage, because every minute translates to money, and one minute of email disruption may cause the loss of millions of dollars.
Other companies, depending on their business needs, may tolerate email being down for more than a day without any financial loss. In that case, losing email service is not a disaster but a service failure.
If the time required to recover from a failure is long, the case is considered a disaster.
As a general guideline rather than a rule, losing a service for more than 3 days is considered a disaster (that is the time required to bring the service back up, not the time passed since it went down).
In certain cases, a one-hour recovery time is enough to shift a case from the Service Failure zone to the Disaster zone. Again, it depends on the nature of the business.
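The cost and recovery-time criteria above can be sketched as a small classifier. This is only an illustration: the `Incident` and `Thresholds` types, the `classify` function, and every threshold value are hypothetical and would need to be set from your own business nature and budget.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    recovery_cost: float   # estimated cost to restore the service
    recovery_hours: float  # estimated time needed to bring the service back up


@dataclass
class Thresholds:
    # All values are organization-specific assumptions, not fixed rules.
    disaster_cost: float       # recovery cost at or above this is a disaster
    max_downtime_hours: float  # recovery time at or above this is a disaster
    minor_cost: float          # recovery cost at or below this may be minor
    minor_hours: float         # recovery time at or below this may be minor


def classify(incident: Incident, t: Thresholds) -> str:
    """Return 'disaster', 'service failure', or 'minor problem'."""
    if (incident.recovery_cost >= t.disaster_cost
            or incident.recovery_hours >= t.max_downtime_hours):
        return "disaster"
    if (incident.recovery_cost <= t.minor_cost
            and incident.recovery_hours <= t.minor_hours):
        return "minor problem"
    return "service failure"


# 72 hours roughly mirrors the 3-day guideline; the other numbers are made up.
general = Thresholds(disaster_cost=100_000, max_downtime_hours=72,
                     minor_cost=5_000, minor_hours=8)
# A business that depends mainly on email tolerates far less downtime.
email_dependent = Thresholds(disaster_cost=100_000, max_downtime_hours=1,
                             minor_cost=5_000, minor_hours=0.25)

fire = Incident("fire damages 10 servers", 150_000, 240)
router = Incident("one router lost", 2_000, 4)
email_outage = Incident("email service down", 10_000, 2)

for incident in (fire, router, email_outage):
    print(incident.name, "->", classify(incident, general))
print(email_outage.name, "(email-dependent business) ->",
      classify(email_outage, email_dependent))
```

The same two-hour email outage lands in different zones depending on the thresholds, which is the point made above: the classification depends on the business, not on the incident alone.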
The following table explains the differences between three common types of problems (Disaster, Service Disruption (failure), and Minor Problem) along with business continuity solutions for each.
Differences between three common types of problems: Disaster, Service Disruption (failure), and Minor Problem, accompanied by business continuity solutions.
Planning Disaster Recovery is a critical task and is the foundation of a successful business.
Infrastructure admins must plan the Disaster Recovery Site (DR Site) during the infrastructure and data center planning phase, not after it.
The procedure is not a complement to main site planning but runs parallel to it. Adding a DR Site to an existing primary data center site is a complex and costly task.
Here are 4 steps to follow when you plan the DR Site during the Primary Site planning phase.
Step 1: I recommend you create the following table during the planning phase of your primary (main) data center, based on the nature of your organization’s business and the budget provisioned for the IT department.
Consider two scales for both Downtime Cost and Replication Bandwidth, based on your business. I put some examples in the table for better understanding.
Downtime Cost and Replication Bandwidth examples.
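As a rough sketch of how such a planning table could be kept in code, the example below stores one row per service and derives an average replication bandwidth from an assumed daily volume of changed data. The `ServicePlan` type, the `replication_mbps` helper, the column names, and all figures are hypothetical placeholders, not values from the table.

```python
from dataclasses import dataclass


@dataclass
class ServicePlan:
    service: str
    downtime_cost_per_hour: float    # assumed currency units per hour of outage
    max_allowable_downtime_h: float  # hours the business can tolerate
    changed_data_gb_per_day: float   # assumed daily data change to replicate
    disaster_solution: str


def replication_mbps(changed_gb: float, window_hours: float) -> float:
    """Average bandwidth needed to copy changed_gb within window_hours.

    Uses decimal units: 1 GB = 8000 megabits.
    """
    return changed_gb * 8000 / (window_hours * 3600)


# Hypothetical rows; replace them with your own services and figures.
plan = [
    ServicePlan("file shares", 2_000, 24, 450, "nightly replication"),
    ServicePlan("email", 50_000, 1, 90, "hot standby at DR Site"),
    ServicePlan("intranet wiki", 100, 72, 9, "restore from backup"),
]

# List the most downtime-sensitive services first.
plan.sort(key=lambda p: p.max_allowable_downtime_h)

for p in plan:
    mbps = replication_mbps(p.changed_data_gb_per_day, 24)
    print(f"{p.service:15} {p.downtime_cost_per_hour:>10,.0f}/h  "
          f"{p.max_allowable_downtime_h:>4}h  {mbps:6.1f} Mbps  "
          f"{p.disaster_solution}")
```

The bandwidth figure is an average over the replication window. Real links also need headroom for peaks and protocol overhead, which this sketch ignores.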
Step 2: Based on Step 1, plan the logical and physical infrastructure required for the Primary Site and define its resources.
It is very important to estimate the average workload of each service.
You can use the following table to note down workloads.
Table to note down workloads.
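If workload measurements are already being collected, the averages for that table can be computed rather than estimated by hand. The sketch below assumes a few hypothetical samples per service; the metric names (`cpu_pct`, `ram_gb`) and the numbers are illustrative only.

```python
from statistics import mean

# Hypothetical samples taken at different times of day for each service.
samples = {
    "email": {"cpu_pct": [35, 50, 41], "ram_gb": [12, 14, 13]},
    "file shares": {"cpu_pct": [10, 20, 15], "ram_gb": [6, 8, 7]},
}


def average_workload(samples):
    """Return the mean of every metric for every service."""
    return {
        service: {metric: mean(values) for metric, values in metrics.items()}
        for service, metrics in samples.items()
    }


for service, averages in average_workload(samples).items():
    print(service, averages)
```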
Step 3: Based on the information defined in Step 1, start planning the logical infrastructure for the DR Site, employing the solutions suggested under “Disaster Solution”, and consider using the DR Site for high availability (HA) under certain conditions.
Step 4: Based on the estimated average workloads from Step 2, the Maximum Allowable Downtime information from Step 1, and the DR Site logical infrastructure planned in Step 3, define the hardware required to build the DR Site.
In this phase, try to minimize the required hardware as much as possible to decrease the initial setup cost, maintenance effort, and complexity.
You can reuse old hardware marked for upgrade, such as servers due for replacement and outdated routers or switches.
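One possible way to turn the workloads and Maximum Allowable Downtime into a minimal hardware count is sketched below. It rests on a simplifying assumption that is not from the article: only services whose allowable downtime is at or below a chosen cutoff need running capacity at the DR Site, while the rest are restored from backup later. The `dr_servers_needed` function and all figures are hypothetical.

```python
import math

# Hypothetical per-service figures: (average CPU cores, average RAM in GB,
# maximum allowable downtime in hours).
services = {
    "email": (8, 32, 1),
    "file shares": (4, 16, 24),
    "intranet wiki": (2, 4, 72),
}


def dr_servers_needed(services, standby_cutoff_h, cores_per_server,
                      ram_gb_per_server):
    """Count servers needed for services that must be running at the DR Site.

    Assumption: services that tolerate more than standby_cutoff_h hours of
    downtime are restored from backup later and need no standby capacity.
    """
    cores = sum(c for c, _, h in services.values() if h <= standby_cutoff_h)
    ram = sum(r for _, r, h in services.values() if h <= standby_cutoff_h)
    # Whichever resource runs out first determines the server count.
    return max(math.ceil(cores / cores_per_server),
               math.ceil(ram / ram_gb_per_server))


# Sizing against older servers with 8 cores and 32 GB of RAM each.
print(dr_servers_needed(services, standby_cutoff_h=24,
                        cores_per_server=8, ram_gb_per_server=32))
```

Raising or lowering the cutoff shows how much hardware each extra level of protection costs, which helps keep the DR Site as small as the business allows.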
Keep in mind that the DR Site is very critical, yet it may never carry real workloads over its life span.
Frankly, if the infrastructure administrator plans the Primary Site perfectly, the DR Site will not be active except in case of a natural catastrophe or during scheduled simulations.
Be very careful and invest enough time in planning the DR Site.
The steps are summarized in the flowchart.
Disaster Recovery and Planning site flowchart
© 2021 Aten Technologies for Enterprise Content - All rights reserved.