This topic describes how to plan a disaster recovery site that is accessible only from the corporate intranet.
How to plan for a disaster recovery site accessible over the corporate intranet
Watch the following video to learn more about the scenario of a disaster recovery site accessible over the corporate intranet.
Review the following tasks for planning your disaster recovery site.

| Task | Description |
| --- | --- |
| Design of a disaster recovery site accessible over the corporate intranet | Understand the design of a disaster recovery site accessible over the corporate intranet. |
| AWS components required | Understand the AWS components required for setting up a disaster recovery site. |
| Failover flow | Understand the failover flow and how disaster recovery happens. |
| **Preparing the disaster recovery site** | |
| Step 1: DRaaS support matrix | Understand the operating systems and AWS storage regions that Phoenix supports, and all the prerequisites and limitations. |
| Step 2: Create a VPC for AWS proxy deployment | Understand how to create the VPC used to deploy the AWS proxy that orchestrates the Phoenix DRaaS solution. |
| Step 3: Deploy the Phoenix AWS proxy | Understand how to deploy a Phoenix AWS proxy in the Phoenix Cloud account. |
| Step 4: Create the disaster recovery site | Understand the prerequisites and the process of creating a disaster recovery site. |
| **Proceeding with disaster recovery** | |
| Create a disaster recovery plan | Understand how to create a disaster recovery plan for virtual machines. |
Design of a disaster recovery site accessible over the corporate intranet
In this scenario, the Phoenix DRaaS solution protects a data center inside the corporate intranet. If the data center fails, the servers are restored in the AWS cloud by the Phoenix DRaaS solution, and users do not experience a service interruption. A key assumption of this design is that the replicated servers are accessible only from the intranet, through a VPN connection between the intranet and the AWS cloud.
AWS components required
The following AWS components are required for setting up a disaster recovery site accessible over the intranet.
| Component | Description |
| --- | --- |
| Two AWS accounts | Druva’s account, which contains the Phoenix Cloud, and the customer’s account. |
| VPCs | The first building block of the disaster recovery site infrastructure is a VPC. The failover site is created in a VPC separate from the Phoenix AWS proxy to isolate the proxies from the failover servers. It also separates the disaster recovery site from the intranet. |
| Internet gateway | The Phoenix AWS proxy requires access to the Internet, so the proxy VPC has an Internet gateway attached. The disaster recovery site, however, needs only intranet access. The connection to the intranet is implemented using an encrypted VPN connection. |
| Customer gateway and VPN gateway | The customer gateway and the VPN gateway are the AWS components created by the CloudFormation template. Note that you also need to configure the on-premises router, firewall rules, and other intranet routers to correctly route traffic to the disaster recovery site. |
| Subnet | DR servers must be deployed in a subnet, so a subnet is added to the VPC. This is a private subnet because it does not need Internet access. A subnet can exist in only one availability zone, so it cannot be geographically distributed. This creates a single point of failure, which is why the design includes a second subnet in another availability zone. |
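As an illustration of how these components fit together, the failover VPC could be provisioned with the AWS CLI as sketched below. This is a sketch only: in practice the CloudFormation template creates these resources, and all CIDR ranges, availability zones, and IP addresses shown here are hypothetical.

```shell
# Illustrative sketch only -- in practice the CloudFormation template
# creates these resources. All CIDR ranges, AZs, and IPs are hypothetical.

# Failover VPC, kept separate from the Phoenix AWS proxy VPC
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.20.0.0/16 \
  --query 'Vpc.VpcId' --output text)

# Two private subnets in different availability zones to avoid a
# single point of failure
SUBNET_A=$(aws ec2 create-subnet --vpc-id "$VPC_ID" \
  --cidr-block 10.20.1.0/24 --availability-zone us-east-1a \
  --query 'Subnet.SubnetId' --output text)
SUBNET_B=$(aws ec2 create-subnet --vpc-id "$VPC_ID" \
  --cidr-block 10.20.2.0/24 --availability-zone us-east-1b \
  --query 'Subnet.SubnetId' --output text)

# VPN gateway attached to the VPC, plus a customer gateway representing
# the on-premises router
VGW_ID=$(aws ec2 create-vpn-gateway --type ipsec.1 \
  --query 'VpnGateway.VpnGatewayId' --output text)
aws ec2 attach-vpn-gateway --vpn-gateway-id "$VGW_ID" --vpc-id "$VPC_ID"
CGW_ID=$(aws ec2 create-customer-gateway --type ipsec.1 \
  --public-ip 198.51.100.10 --bgp-asn 65000 \
  --query 'CustomerGateway.CustomerGatewayId' --output text)

# Encrypted site-to-site VPN connection between the intranet and AWS
aws ec2 create-vpn-connection --type ipsec.1 \
  --customer-gateway-id "$CGW_ID" --vpn-gateway-id "$VGW_ID"
```

Note that the Internet gateway for the proxy VPC is created separately; the failover VPC above deliberately has none, since the disaster recovery site is reachable only over the VPN.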
The following diagram illustrates how the components are deployed.
Failover flow
The following diagram illustrates the failover workflow.
- The failover is triggered by an administrator from the Phoenix Management Console, which instructs the Phoenix AWS proxy to start the failover process.
- The Phoenix AWS proxy creates an EBS volume from the VM snapshot stored in the S3 bucket and, at the same time, launches an EC2 instance from the Druva AMI.
- Next, it mounts the EBS volume to the EC2 instance, and the Druva software running on the EC2 instance adapts the restored operating system to the AWS hardware.
- The EC2 instance reports its progress to Phoenix through Amazon Simple Queue Service (SQS). It sends updates to the SQS queue, which are polled by the Phoenix AWS proxy and reported to the Phoenix Management Console.
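The AWS-side operations in the restore steps above can be sketched with the AWS CLI. This is an illustration only, not Druva's actual implementation; all snapshot, AMI, subnet, and device identifiers below are hypothetical placeholders.

```shell
# Illustration only -- not Druva's implementation. All IDs are
# hypothetical placeholders.

# Create an EBS volume from the snapshot of the protected VM
VOL_ID=$(aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-1a \
  --query 'VolumeId' --output text)

# Launch an EC2 instance from the recovery AMI into the private
# failover subnet
INSTANCE_ID=$(aws ec2 run-instances --image-id ami-0123456789abcdef0 \
  --instance-type m5.large --subnet-id subnet-0123456789abcdef0 \
  --query 'Instances[0].InstanceId' --output text)

# Wait until the instance is running, then attach the restored volume
aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
aws ec2 attach-volume --volume-id "$VOL_ID" \
  --instance-id "$INSTANCE_ID" --device /dev/sdf
```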
How does this communication happen?
- The SQS service is outside the user’s account and is accessible over the Internet. Because the VPC has no Internet connectivity, the only possible connection would be through the VPN tunnel and then through the corporate intranet, which may or may not work depending on the firewall settings of the routers. To avoid this dependency, the design includes an SQS endpoint deployed inside the VPC. The SQS endpoint allows SQS traffic from a private subnet to reach the SQS service over the AWS network. One endpoint is needed per availability zone, which is why there are two endpoints in the template.
- During the failover, the EC2 instance stores metadata into an S3 bucket in the customer account. The S3 service is also a public service accessible over the Internet. The S3 endpoint deployed inside the failover VPC ensures that the EC2 instance can reach the S3 bucket over the AWS network instead of the Internet.
- The same flow applies to all the servers. However, users on the corporate intranet cannot reach the restored servers without one additional step: a DNS update.
- The DNS names of the restored servers are updated with the new IP addresses of the disaster recovery site.
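The VPC endpoints and the DNS update described above can be sketched with the AWS CLI. This is illustrative only: the endpoints are created by the CloudFormation template in practice, and all IDs, zone names, and addresses below are hypothetical. The DNS step assumes name resolution through a private Amazon Route 53 hosted zone; an on-premises DNS server would be updated through its own tooling instead.

```shell
# Illustration only -- in practice the CloudFormation template creates
# the endpoints. All IDs, names, and addresses are hypothetical.

# Gateway endpoint so EC2 instances reach S3 over the AWS network
aws ec2 create-vpc-endpoint --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.s3 \
  --vpc-endpoint-type Gateway \
  --route-table-ids rtb-0123456789abcdef0

# Interface endpoint for SQS, with a network interface in the subnet
# of each availability zone
aws ec2 create-vpc-endpoint --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.sqs \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-0aaaaaaaaaaaaaaaa subnet-0bbbbbbbbbbbbbbbb

# After failover, point the server's DNS name at its new IP address
# (assumption: a private Route 53 hosted zone serves the intranet)
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEFGHIJ \
  --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{
    "Name":"app01.corp.example.com","Type":"A","TTL":60,
    "ResourceRecords":[{"Value":"10.20.1.15"}]}}]}'
```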
An important requirement of the disaster recovery site design is the ability to test a DR failover. We recommend that you create separate test subnets, where you can run a test failover to check that everything works as expected. The VPC deployment also requires several other components, such as route tables, network access control lists, and security groups.
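As a sketch of those supporting components (all names, IDs, and CIDR ranges below are hypothetical), a test subnet with its own route table and a restrictive security group could be created as follows:

```shell
# Illustration only; the VPC ID, CIDR ranges, and names are hypothetical.
VPC_ID=vpc-0123456789abcdef0

# Separate test subnet for running test failovers
TEST_SUBNET=$(aws ec2 create-subnet --vpc-id "$VPC_ID" \
  --cidr-block 10.20.100.0/24 --availability-zone us-east-1a \
  --query 'Subnet.SubnetId' --output text)

# Dedicated route table so test traffic stays isolated from production
RT_ID=$(aws ec2 create-route-table --vpc-id "$VPC_ID" \
  --query 'RouteTable.RouteTableId' --output text)
aws ec2 associate-route-table --route-table-id "$RT_ID" \
  --subnet-id "$TEST_SUBNET"

# Security group allowing inbound traffic only from the intranet range
SG_ID=$(aws ec2 create-security-group --group-name dr-test-sg \
  --description "DR test failover" --vpc-id "$VPC_ID" \
  --query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
  --protocol tcp --port 443 --cidr 192.168.0.0/16
```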