Druva Documentation

How disaster recovery works

Overview

Druva Phoenix DRaaS lets you recover the virtual machines you have configured for disaster recovery and spin them up in the Amazon Web Services (AWS) public cloud. In a disaster, Phoenix DRaaS keeps the business running without additional dedicated on-premises software, storage, or hardware, which reduces cost and improves agility. Because it uses a pay-as-you-go model, Phoenix DRaaS is more cost-effective than a traditional DR solution. To learn more about the benefits of Phoenix DRaaS, see About Phoenix disaster recovery as a service.

Phoenix DRaaS is available on demand in the cloud. This document gives you a high-level overview of the Phoenix DRaaS solution, explains the pieces that go into setting up disaster recovery with Phoenix, helps you define your BC/DR strategy, and shares our recommendations and best practices.


How DRaaS works

Druva’s DRaaS solution builds on the backups of your data center that are stored in the Phoenix Cloud. The DR backup site is deployed in your AWS account. Because Phoenix is also deployed on AWS, data transfer between the two accounts in the same region is seamless.

First, the latest backup snapshot is copied into your AWS account and stored as EBS snapshots. The temporary EBS volume created for the data transfer is deleted once the snapshot has been taken successfully.

The same process is repeated for all VMDK disks attached to the protected VMs. 

The Phoenix AWS proxy carries out all of these tasks. It moves the data from Druva's S3 storage to your account, and to do so, the EC2 instance running the Phoenix AWS proxy needs a set of permissions. These permissions are defined in the Druva role and policy that are added to your account during proxy deployment.

In short, this process is the data path from the primary data center to the backup site in the cloud. The update frequency is configurable in Phoenix and should be set according to your required RPO.
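To make the relationship between update frequency and RPO concrete, here is a minimal Python sketch. It assumes that the worst-case data loss is roughly the interval between DR copy updates plus the time one update takes to complete. The function names and the example numbers are illustrative only and are not part of Phoenix.

```python
from datetime import timedelta


def worst_case_data_loss(update_interval: timedelta, copy_duration: timedelta) -> timedelta:
    """Age of the newest usable DR copy just before the next update completes.

    Assumption: an update only becomes usable once its copy has finished.
    """
    return update_interval + copy_duration


def meets_rpo(update_interval: timedelta, copy_duration: timedelta, rpo: timedelta) -> bool:
    """Return True if the worst-case data loss stays within the required RPO."""
    return worst_case_data_loss(update_interval, copy_duration) <= rpo


# Hypothetical numbers: the DR copy is updated every 12 hours and takes about 1 hour.
daily_ok = meets_rpo(timedelta(hours=12), timedelta(hours=1), rpo=timedelta(hours=24))
tight_ok = meets_rpo(timedelta(hours=12), timedelta(hours=1), rpo=timedelta(hours=4))
print(daily_ok, tight_ok)
```

Under these assumed numbers, a 12-hour update interval satisfies a 24-hour RPO but not a 4-hour one, so a tighter RPO calls for more frequent updates.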

For more information on DR plans and creating DR plans, see Disaster Recovery Plan.

Restore workflow

Let's briefly walk through the DR restore workflow and see what happens while data is copied from the Druva account to the customer account.

DR_overview.jpg

  1. The Phoenix Cloud tells the Phoenix AWS proxy to copy the virtual machine backup data from the S3 bucket in Phoenix.
  2. The Phoenix AWS proxy then makes an API call to execute this command.
  3. The data from S3 passes through the Phoenix AWS proxy to the EBS volume. The data path for the Phoenix AWS proxy is the Internet gateway.
    One more entity is involved here: the S3 endpoint. It ensures that data moving from one account to the other does not leave the AWS network.
  4. Once the data has been copied to the EBS volume, the Phoenix AWS proxy makes another API call to create an EBS snapshot of the volume and then deletes the volume.
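The sequence above (write to a temporary volume, snapshot it, delete the volume) can be sketched in Python. This is a simplified simulation, not Druva's implementation: `FakeAwsAccount` is a hypothetical in-memory stand-in for the customer's AWS account, and none of its methods are real AWS or Phoenix API calls.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeAwsAccount:
    """In-memory stand-in for the customer's AWS account (hypothetical, not an AWS API)."""
    volumes: dict = field(default_factory=dict)    # volume id -> bytes written so far
    snapshots: dict = field(default_factory=dict)  # snapshot id -> bytes captured
    _ids: count = field(default_factory=count, repr=False)

    def create_volume(self) -> str:
        vol_id = f"vol-{next(self._ids)}"
        self.volumes[vol_id] = b""
        return vol_id

    def write(self, vol_id: str, data: bytes) -> None:
        self.volumes[vol_id] += data

    def create_snapshot(self, vol_id: str) -> str:
        snap_id = f"snap-{next(self._ids)}"
        self.snapshots[snap_id] = self.volumes[vol_id]
        return snap_id

    def delete_volume(self, vol_id: str) -> None:
        del self.volumes[vol_id]


def copy_disk_to_snapshot(account: FakeAwsAccount, backup_chunks) -> str:
    """Copy one disk's backup data to a temporary volume, snapshot it, delete the volume."""
    vol_id = account.create_volume()
    for chunk in backup_chunks:              # step 3: data flows through the proxy
        account.write(vol_id, chunk)
    snap_id = account.create_snapshot(vol_id)  # step 4: snapshot the filled volume
    account.delete_volume(vol_id)            # the temporary volume is removed afterwards
    return snap_id


# Hypothetical VM with two disks; the same routine runs once per disk.
account = FakeAwsAccount()
disks = {"disk-0.vmdk": [b"boot", b"data"], "disk-1.vmdk": [b"logs"]}
snapshots = {name: copy_disk_to_snapshot(account, chunks) for name, chunks in disks.items()}
print(snapshots, account.volumes)
```

After the run, the simulated account holds one snapshot per disk and no leftover volumes, which mirrors the end state the workflow describes.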

Failover workflow

A failover is what happens when disaster strikes and the primary data center becomes unavailable. The process works as follows.

DR_failover.jpg

  1. First, an administrator triggers the failover from the Phoenix Management Console, and the Phoenix Cloud instructs the Phoenix AWS proxy to initiate the failover.
  2. The EBS snapshot is used to create an EBS volume. The Phoenix AWS proxy creates an EC2 instance in your AWS account. You can choose the EC2 instance type from the Phoenix Management Console when you configure the VM for DR. The EBS volume is attached to the EC2 instance.
  3. Once the process has completed, the EC2 instance is restarted and is ready to support its workload. The same process is repeated for all protected VMs.
  4. The last step is to update the DNS servers to redirect the traffic to the IP addresses of the new EC2 instances.

Phoenix provides advanced orchestration capabilities that let you decide on the failover sequence, create dependencies between EC2 instances, and add scripts to run during EC2 boot. An example of a dependency is an e-commerce web server that stores its data in a database: it would not make sense to start the web server's EC2 instance before the database is available.
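One way to reason about such dependencies is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to group VMs into boot waves. The VM names and the dependency map are hypothetical, and this is not how Phoenix itself computes the sequence; it only illustrates the ordering idea.

```python
from graphlib import TopologicalSorter

# Hypothetical DR plan: each VM maps to the set of VMs that must be running first.
dependencies = {
    "web-server": {"database"},
    "database": set(),
    "report-service": {"database"},
}


def failover_waves(deps):
    """Group VMs into waves; every VM in a wave can boot once earlier waves are up."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = failover_waves(dependencies)
print(waves)  # [['database'], ['report-service', 'web-server']]
```

The database boots in the first wave, and both services that depend on it boot in the second.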

Druva DRaaS also allows for DR testing in a different VPC or subnet. 

Failback workflow

The starting point for failback is a freshly rebuilt primary data center and a backup DR site containing data that needs to be transferred back to the primary site.

DR_failback.jpg

  1. The first step is to redeploy the Phoenix backup proxy VM to establish communication with the Phoenix Cloud and get the backup configuration from the cloud. 
  2. In the next step, the backup proxy launches a template VM, which reaches out to the EC2 instance and gets its configuration information: number of CPUs, memory size, number of disks, and so on.
  3. Based on the information received, it attaches one or more VMDK disks and starts copying data from the EBS volumes to the VMDK disks.
  4. Once the download is completed, the template VM is rebooted with the configuration parameters read from the EC2 instance. After that, the server is operational. The same process is repeated for all protected VMs. 
  5. The last step is to update the DNS to redirect the traffic back to the primary site.
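The configuration hand-off in steps 2 to 4 can be pictured as a small data mapping. The following Python sketch is illustrative only: the class and field names are hypothetical, and it assumes one VMDK disk per EBS volume with the same size as its source volume, which the workflow above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceConfig:
    """Configuration read from the EC2 instance (field names are hypothetical)."""
    cpus: int
    memory_gib: int
    volume_sizes_gib: tuple  # one entry per attached EBS volume


@dataclass(frozen=True)
class TemplateVmPlan:
    """Settings the template VM is rebooted with after the download completes."""
    cpus: int
    memory_gib: int
    vmdk_sizes_gib: tuple


def plan_template_vm(cfg: InstanceConfig) -> TemplateVmPlan:
    """Mirror the instance's CPU, memory, and disk layout onto the template VM.

    Assumes one VMDK per EBS volume, sized the same as its source volume.
    """
    if not cfg.volume_sizes_gib:
        raise ValueError("instance has no volumes to fail back")
    return TemplateVmPlan(cfg.cpus, cfg.memory_gib, tuple(cfg.volume_sizes_gib))


# Hypothetical instance with 4 CPUs, 16 GiB of memory, and two volumes.
plan = plan_template_vm(InstanceConfig(cpus=4, memory_gib=16, volume_sizes_gib=(100, 500)))
print(plan)
```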