Druva Documentation

Best Practices for Phoenix Disaster Recovery as a Service

Phoenix Editions: Business (not available)    Enterprise (available)    Elite (available)
(Purchase Separately)

This article describes the best practices that must be followed while using Phoenix Disaster Recovery as a Service (DRaaS).


Phoenix AWS Proxy

The Phoenix AWS proxy, also referred to as the DR proxy, is an EC2 instance that runs in the customer’s AWS account. The Phoenix AWS proxy runs the Phoenix Disaster Recovery service and is responsible for orchestrating DR restore, DR failback, and DR failover. The DR proxy is deployed using an AWS CloudFormation template. The deployment takes less than 10 minutes.

  • Druva recommends that you deploy at least two DR proxies in separate availability zones for high availability.

Note: Each DR proxy can run three DR restore jobs concurrently.

  • The recommended EC2 instance size for the Phoenix AWS proxy is c5.2xlarge.
Instance type   vCPU   Memory (GiB)   Instance storage (GiB)   Network bandwidth (Gbps)   EBS bandwidth (Mbps)
c5.2xlarge      8      16             EBS-only                 Up to 10                   Up to 4,750
  • The DR proxy must have access to the following services:

    • S3,
    • EC2-API, and
    • SQS

The Druva CloudFormation template creates endpoints that provide connectivity to these services over the AWS private network.

AWS services.png

  •  Ensure that the EC2 key pair assigned to the Phoenix AWS Proxy is stored in a secure location. The key pair is used to access the Phoenix AWS Proxy for troubleshooting only.    
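The proxy guidance above (at least two proxies for high availability, three concurrent DR restore jobs per proxy) can be turned into a quick sizing calculation. The following is a minimal sketch, not a Druva tool; the function name proxies_needed is hypothetical, and it assumes that restore capacity scales linearly with the number of proxies.

```python
import math

# From the guidance above: each DR proxy runs three DR restore jobs concurrently.
JOBS_PER_PROXY = 3
# Recommended minimum for high availability (one proxy per availability zone).
MIN_PROXIES_FOR_HA = 2


def proxies_needed(concurrent_restore_jobs: int) -> int:
    """Return a proxy count that covers the requested concurrent DR restores.

    Assumption: capacity scales linearly with the number of proxies.
    """
    if concurrent_restore_jobs < 0:
        raise ValueError("concurrent_restore_jobs must not be negative")
    by_capacity = math.ceil(concurrent_restore_jobs / JOBS_PER_PROXY)
    return max(MIN_PROXIES_FOR_HA, by_capacity)
```

For example, seven concurrent restores need three proxies under this assumption, while anything up to six fits within the recommended minimum of two.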

VPC

While defining network mappings in a DR plan, you must map the vCenter source network to a VPC and subnet in the target AWS account.

  • If you create a new Amazon VPC, you don’t need to attach an Internet Gateway (IGW) to it, because the Phoenix AWS Proxy uses AWS PrivateLink for all communication.
  • Ensure that DNS hostnames and DNS resolution are enabled within the VPC.

    AWS DNS hostname and resolution.png
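The two VPC DNS settings above can also be verified programmatically. The sketch below assumes an EC2 client object that exposes describe_vpc_attribute the way the boto3 EC2 client does; the function name vpc_dns_ready is hypothetical, and the client is passed in so that the snippet itself has no third-party imports.

```python
def vpc_dns_ready(ec2_client, vpc_id: str) -> bool:
    """Return True only if DNS resolution and DNS hostnames are both enabled.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    support = ec2_client.describe_vpc_attribute(
        VpcId=vpc_id, Attribute="enableDnsSupport"
    )
    hostnames = ec2_client.describe_vpc_attribute(
        VpcId=vpc_id, Attribute="enableDnsHostnames"
    )
    return bool(
        support["EnableDnsSupport"]["Value"]
        and hostnames["EnableDnsHostnames"]["Value"]
    )
```

With boto3 installed and AWS credentials configured, you could call vpc_dns_ready(boto3.client("ec2"), "vpc-0123456789abcdef0"), where the VPC ID is a placeholder.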

DR prerequisite checks    

DR prerequisite checks run while the VM backup is in progress and ensure that the VM meets all the DR failover and failback requirements. All the prerequisite checks must pass for a DR failback or failover to succeed.

When the DR prerequisite checks do not execute

The DR prerequisite checks may not execute at all for one or more of the following reasons:

  • The VMware Backup proxy is unable to communicate with the ESX host on port 443. Enable communication between the backup proxy and the ESX host on port 443.
  • The VMware Backup proxy is on a version older than 4.8.11. Upgrade the VMware backup proxy to the latest version, and ensure that the first VM backup after the proxy upgrade is successful.
  • The VM cannot connect to the Druva download portal at https://downloads.druva.com/phoenix/ to download the prerequisite check executables while the VM backup is in progress. If the VM cannot connect to the portal and download the executables, ensure that the DR prerequisite check executables are excluded from any antivirus software running on the VM.

The following table lists the DR prerequisite check executables that must be excluded, depending on the VM operating system.
 

Operating system   Prerequisite check executable
Windows            PhoenixPreflight_<version number>.exe
Linux              PhoenixPreflight_<version number>
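Two of the causes listed above come down to network reachability on port 443: from the backup proxy to the ESX host, and from the VM to the Druva download portal. The following minimal sketch tests whether a TCP connection can be opened. The function name can_connect is hypothetical, and a successful connection shows only that the port is reachable, not that a download or API call would succeed.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run from the VMware backup proxy, can_connect("esx01.example.internal") would probe an ESX host (the hostname is a placeholder). Run from the VM, can_connect("downloads.druva.com") would probe the download portal.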

Resolving prerequisite check errors

If the prerequisite checks fail or pass with warnings, resolve the errors or warnings before re-running the backup job.

Resolve Linux prerequisite check errors

Resolve Windows prerequisite check errors

  1. Credentials: Ensure that the VMs for which you want to perform disaster recovery have credentials assigned to them. If credentials are not assigned to the virtual machines or are invalid, Phoenix does not perform the prerequisite checks. You can assign credentials to the VMs from either the VMware page or the Disaster Recovery page.
    The user account must have the following privileges:

Windows virtual machines

  • The account must have local administrative privileges.
  • UAC must be disabled on the virtual machine. See disabling UAC on Windows server for more information.

Linux virtual machines

  • A non-root user must have sudo rights and must have the NOPASSWD: ALL tag enabled in the sudoers file. Edit the sudoers file and ensure that the non-root user has the following entry at the end:

username ALL=(ALL) NOPASSWD: ALL

Here, username is the user that can execute all commands without being prompted for a password.

Verifying permissions

Log in to the Linux machine using the user account that needs to be tested.

Run the sudo -l command. If the user has sudo privileges and the NOPASSWD: ALL tag is enabled in the sudoers file, the command generates the following output without prompting for a password.

Sudo privilege with nopassword.png

If the user does not have sudo privileges or does not have the NOPASSWD: ALL tag enabled in the sudoers file, the command prompts for a password and generates the following output.

No sudo privilege.png
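If you capture the text printed by sudo -l, the check described above can be automated. The sketch below looks for a line of the form (ALL) NOPASSWD: ALL or (ALL : ALL) NOPASSWD: ALL. The function name has_nopasswd_all is hypothetical, and the exact layout of sudo -l output varies between sudo versions and locales, so treat the pattern as an assumption to adjust for your systems.

```python
import re

# Matches an output line such as "(ALL) NOPASSWD: ALL" or "(ALL : ALL) NOPASSWD: ALL".
# Assumption: the entry appears on its own line, as in typical sudo -l output.
_NOPASSWD_ALL = re.compile(
    r"^\s*\(ALL(\s*:\s*ALL)?\)\s+NOPASSWD:\s*ALL\s*$", re.MULTILINE
)


def has_nopasswd_all(sudo_l_output: str) -> bool:
    """Return True if the captured sudo -l output grants NOPASSWD: ALL."""
    return _NOPASSWD_ALL.search(sudo_l_output) is not None
```

A rule limited to specific commands, such as (ALL) NOPASSWD: /usr/bin/systemctl, does not match, because the prerequisite checks need the unrestricted entry.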

  • The directory /home/{username} must exist, and the non-root user must have read, write, and execute (RWX) permissions over this directory.

While a VM backup is in progress, the prerequisite checks use the working directory /home/{username}/Druva/Phoenix/Preflight for non-root users and the directory /home/{PreflightBinaryName}/Druva/Phoenix/Preflight for root users. Once the prerequisite checks are complete, Phoenix deletes the directories that it created under /home/{username} for non-root users or /home/{PreflightBinaryName} for root users.
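The home directory requirement above can be checked with a few lines of Python run as the non-root user in question. This is a minimal sketch; the function name home_dir_usable and the home_root parameter are hypothetical, and os.access reports permissions for the user running the script, so running it as root will not give a meaningful answer.

```python
import os


def home_dir_usable(username: str, home_root: str = "/home") -> bool:
    """Return True if <home_root>/<username> exists with read, write, and execute access.

    Run this as the user whose directory is being checked.
    """
    path = os.path.join(home_root, username)
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)
```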

  2. Virtual Machines
    1. The VM must be running for the prerequisite check to work.
    2. The VM must have VMware tools installed on it.
    3. The VM must have at least 1 GB of free space on the boot partition.
    4. Ensure that all Druva processes are whitelisted in any antivirus software running on the virtual machine.

Here are all the 14 Windows files that must be whitelisted:

C:\Windows\System32\systeminfo.exe
C:\Druva\Vmtools\Ec2Install\Ec2Install.exe
C:\Druva\Vmtools\Citrix_xensetup.exe
C:\Druva\Vmtools\dotnetfx45.exe
C:\Druva\Vmtools\AWSPVDriverSetup8.2.1.msi
C:\Druva\Vmtools\dotNetFx40_Full_x86_x64.exe
C:\Druva\Vmtools\Ec2Install\AmazonSSMAgentSetup.exe
C:\Druva\Vmtools\XenGuestAgent.exe
C:\Druva\Vmtools\wic_x86_enu.exe
C:\Druva\Vmtools\wic_x64_enu.exe
C:\Druva\Vmtools\WiXEC2ConfigSetup_64.msi
C:\Druva\Model\cli.exe
C:\Druva\Model\run_model.bat
C:\Druva\Service\rmservice.exe

Here are all the Linux files that must be whitelisted: (The /opt/druva files are installed by Druva as part of the DR Failover operation)

/opt/druva/rm_startup.sh
/opt/druva/cli
/opt/druva/run_model.sh
/opt/druva/upload_logs.sh
/etc/rc.local
/etc/init.d/after.local

Add virtual machines to DR plan

A DR plan includes a group of virtual machines, the DR restore frequency, and all the disaster recovery settings that help you perform a single-click failover.

  1. When a VM is added to a DR plan, Phoenix automatically assigns a few default failover settings. The default settings are:
    1. instance_type = t2.micro
    2. public_ip = None
    3. private_ip = Auto Assign

      These settings can be used to spin up the VM from the DR copy in case of a failover. You can update these settings based on source VM configuration for optimum failover times.
  2. While configuring failover settings for VMs added to the DR plan, ensure that the instance type is not smaller than the virtual machine you are trying to fail over. You can also use the auto-suggest instance type feature to let Phoenix choose the appropriate instance type.
  3. Ensure that the Recovery Point Actual (RPA) does not exceed the backup frequency duration. RPA is the time elapsed since the last successful VM snapshot that is available for failover. For more information, see Managing Recovery Point Actual.
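The checks in this list can be expressed as a short validation sketch. Everything named here is an illustration rather than a Phoenix API: FailoverSettings mirrors the default values listed above, INSTANCE_CATALOG is a small hand-picked subset of EC2 instance types with their vCPU and memory sizes, and smallest_fitting_type is a simple stand-in that does not represent how the Phoenix auto-suggest feature works.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class FailoverSettings:
    """Default failover settings assigned when a VM is added to a DR plan."""
    instance_type: str = "t2.micro"
    public_ip: Optional[str] = None
    private_ip: str = "Auto Assign"


# Illustrative subset only: instance type -> (vCPUs, memory in GiB), smallest first.
INSTANCE_CATALOG = {
    "t2.micro": (1, 1),
    "t3.medium": (2, 4),
    "m5.large": (2, 8),
    "m5.xlarge": (4, 16),
    "c5.2xlarge": (8, 16),
}


def is_large_enough(instance_type: str, vm_vcpus: int, vm_memory_gib: float) -> bool:
    """Return True if the instance type is not smaller than the source VM."""
    vcpus, memory_gib = INSTANCE_CATALOG[instance_type]
    return vcpus >= vm_vcpus and memory_gib >= vm_memory_gib


def smallest_fitting_type(vm_vcpus: int, vm_memory_gib: float) -> Optional[str]:
    """Return the first catalog entry that fits the VM, or None if none does."""
    for name in INSTANCE_CATALOG:
        if is_large_enough(name, vm_vcpus, vm_memory_gib):
            return name
    return None


def rpa_within_limit(rpa: timedelta, backup_frequency: timedelta) -> bool:
    """Return True if the Recovery Point Actual does not exceed the backup frequency."""
    return rpa <= backup_frequency
```

For a source VM with 4 vCPUs and 8 GiB of memory, the default t2.micro fails is_large_enough, and the first fitting entry in this sample catalog is m5.xlarge.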

DR restore 

DR restore (also referred to as DR copy) is the process where the Phoenix AWS proxy reads the VM backup data from Druva cloud, replicates it to an EBS volume in the customer's AWS account, and creates an EBS snapshot of the EBS volume. The frequency with which the data is replicated is defined in the DR plan.

  • Ensure that the retention period for backups of large virtual machines is longer than the time it can take to create the first full DR copy, that is, transfer the VM backup data from Druva cloud to the customer AWS account. The first DR restore can take longer. Subsequent incremental DR restores are faster.
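A rough estimate helps when comparing the retention period with the duration of the first full DR copy. The sketch below is plain arithmetic, not a Druva formula. The function names are hypothetical, and the sustained throughput is a value you must measure in your own environment; the calculation also ignores deduplication, compression, and snapshot creation time.

```python
def first_copy_hours(vm_size_gib: float, throughput_mib_per_s: float) -> float:
    """Estimate the hours needed to transfer vm_size_gib at a sustained throughput."""
    if throughput_mib_per_s <= 0:
        raise ValueError("throughput_mib_per_s must be positive")
    seconds = vm_size_gib * 1024 / throughput_mib_per_s  # GiB -> MiB, then MiB / (MiB/s)
    return seconds / 3600


def retention_covers_first_copy(
    retention_days: float, vm_size_gib: float, throughput_mib_per_s: float
) -> bool:
    """Return True if the retention period is longer than the estimated first copy."""
    return retention_days * 24 > first_copy_hours(vm_size_gib, throughput_mib_per_s)
```

For example, a 10,240 GiB VM at an assumed 50 MiB/s needs roughly 58 hours, so a one-day retention period would be too short under these assumptions.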

DR Failover

Failover is the process where the DR proxy creates an EC2 instance in the customer’s AWS account, creates an EBS volume from the EBS snapshot, attaches it to the EC2 instance, and finally spins up the instance after redirecting the network traffic to the IP addresses of the EC2 servers. On average, a Linux VM failover takes between 15 and 30 minutes, while a Windows VM failover takes between 45 and 75 minutes. A failover can complete within this time provided the EC2 instance type that is spawned from the EBS snapshot is the same type and size as the source virtual machine.

Test Failover

Druva recommends using the Test Failover option to periodically test VM failovers. You specify the production and test failover settings while creating the DR plan. As part of Test Failover Settings, you specify the instance type, the IAM role, Volume Type and Instance Tags. You can also use the same failover settings as used in Production.

On the Disaster Recovery page, select the DR Plan. On the Overview Page, click Failover > Test Failover. For more information, see Manage disaster recovery failover.

Failback

When you initiate a DR failback, the VMware backup proxy creates a target VM in the on-premise infrastructure. This target VM connects to the failed over EC2 instance and copies the data onto itself. Phoenix then boots up this VM.

  • Ensure that the target virtual machine in your on-premise environment to which you will fail back has connectivity to the EC2 instance.
  • Ensure that the target virtual machine in your on-premise environment used for failback is reachable from the VMware backup proxy.
  • Ensure that the following ports are open on the target virtual machine:
    • Linux: Port 22 for SSH
    • Windows: Ports 445 (Used for preflight checks and control messaging) and 50000 (Used for actual data transfer in failback operation). 

Note: You must manually enable the SMB port for communication. See DR8263.

  • Ensure that the administrative shares of the source EC2 instance are reachable before attempting a failback. For more information, see error DR8263 and its resolution.
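The port requirements above can be checked before starting a failback. The sketch below maps each operating system to the ports listed in this section and reports which ones do not accept a TCP connection. The names FAILBACK_PORTS and closed_failback_ports are hypothetical, and the probe must be run from the machine that needs to reach the target VM for the result to be meaningful.

```python
import socket
from typing import Callable, Dict, List

# Ports from the list above: SSH for Linux; SMB (445) and data transfer (50000) for Windows.
FAILBACK_PORTS: Dict[str, List[int]] = {
    "linux": [22],
    "windows": [445, 50000],
}


def _tcp_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_failback_ports(
    host: str, os_name: str, probe: Callable[[str, int], bool] = _tcp_open
) -> List[int]:
    """Return the required failback ports on host that the probe could not reach."""
    ports = FAILBACK_PORTS[os_name.lower()]
    return [port for port in ports if not probe(host, port)]
```

An empty list means that every required port answered. The probe parameter exists so that the reachability test can be swapped out, for example in a dry run.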

Billable AWS services

The following AWS services are deployed in your AWS account during the Phoenix AWS proxy deployment and are billable.

  1. The Amazon EC2 instance type (c5.2xlarge - recommended) used for the Phoenix AWS proxy.
  2. The following AWS VPC endpoints that are configured as part of proxy deployment:

    1. Druva Backup Service Endpoint

    2. Druva Node Service Endpoint

    3. S3 Endpoint

    4. SQS Endpoint

    5. EC2 Endpoint

    6. CloudFormation Endpoint

The AWS service costs are to be paid to AWS. For more information on the service costs, refer to Amazon EC2 pricing and AWS PrivateLink pricing.
