Druva Documentation

About recovery workflow

Phoenix Editions: Business (not available), Enterprise (available), Elite (available)
(Purchase Separately)

Phoenix DRaaS provides a disaster recovery system that keeps your data protected and, should the primary system fail, gives you a well-planned failover mechanism. With the orchestration capabilities of Phoenix DRaaS, you have more control over the failover process of your virtual machines in both production and test environments. This article describes how it works.

Orchestration workflow

The recovery workflow lets you customize the failover process to your preferences, which helps make failover faster and more reliable. The following diagram depicts how to orchestrate a failover process.

[Figure: Recovery workflow diagram]

  • Step 1: Create a DR plan
    Create a DR plan to protect an application that comprises multiple virtual machines, for example, a web application with database servers, application servers, and web servers (front-end servers).
  • Step 2: Edit the recovery workflow
    The recovery workflow works as a runbook that is executed according to your application’s requirements. It lets you define steps that logically group the virtual machines in a DR plan so that operations run in a defined order. When you create a DR plan, Phoenix DRaaS automatically creates a default recovery workflow for it and adds all the virtual machines in the DR plan to a single VM boot step. You can review the default recovery workflow in the Recovery Workflow section on the Recovery tab of the DR plan details page in the Phoenix Management Console, and edit it to match your failover requirements by adding recovery steps, such as virtual machine (VM) boots and time delays:
    • VM Boot: Enables you to define a boot order for a group of virtual machines in the DR plan and add scripts in the recovery path to perform critical operations, such as changing hostnames of the EC2 instances, modifying service configurations, and adding or removing network routes and gateways.
    • Time Delay: Enables you to add a time delay between the execution of two steps during the recovery operation.
      For more information about what you can configure using the recovery workflow, see Customizing the recovery workflow.
  • Step 3: Launch failover
    After you have customized the recovery workflow, trigger the failover operation to recover your application's virtual machines in the AWS account based on the configuration and failover settings specified in the DR plan. For more information about how to launch a failover in the production or test setup, see Launch disaster recovery failover.
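To make the default behavior described in Step 2 concrete, the following is a minimal Python sketch of a recovery workflow as an ordered list of steps. It is an illustrative model only and not a Phoenix API: the class names `VMBootStep` and `TimeDelayStep`, the `default_workflow` helper, and the use of minutes as the delay unit are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class VMBootStep:
    """Hypothetical model of a step that boots a group of virtual machines."""
    vms: List[str] = field(default_factory=list)


@dataclass
class TimeDelayStep:
    """Hypothetical model of a pause between two steps (minutes are an assumed unit)."""
    minutes: int


Step = Union[VMBootStep, TimeDelayStep]


def default_workflow(plan_vms: List[str]) -> List[Step]:
    """Mirror the documented default: all VMs in the DR plan go into one VM boot step."""
    return [VMBootStep(vms=list(plan_vms))]


plan_vms = ["Database_Server_1", "Application_Server_1", "Web_Server_1"]
workflow = default_workflow(plan_vms)
print(workflow)
```

Editing the workflow then amounts to splitting that single step into several `VMBootStep` entries and inserting `TimeDelayStep` entries between them.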

Customizing the recovery workflow

The recovery workflow enables you to perform the following tasks:

  • Add VM boot steps to the recovery workflow and specify their sequence of execution.   
  • Add on-boot scripts for execution after virtual machines boot up.
  • Specify a timeout for the execution of the script.
  • Enable abort settings for a step if you want to cancel the failover job when the failover of any virtual machine fails during that step.
  • Add time delays between the execution of two steps.
  • Move virtual machines from one boot step to another boot step, as required.
  • Change the order of the execution of the steps. You can simply drag and drop a step to change the order of the execution.
  • Remove a VM boot step from the recovery workflow. Before removing a VM boot step from the workflow, ensure that you first move all virtual machines from that step to another VM boot step.
  • Remove a time delay step from the recovery workflow.

For more information about how to edit a recovery workflow of a DR plan, see Edit failover recovery workflow.
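The editing rules listed above can be sketched as operations on an ordered list. The following Python example is a hypothetical in-memory model, not the Phoenix Management Console or any Druva API; the step names, dictionary keys, and function names are assumptions chosen for illustration. It shows moving a virtual machine between boot steps, reordering steps, and the rule that a VM boot step must be empty before it is removed.

```python
from typing import Dict, List


def find_step(workflow: List[Dict], name: str) -> Dict:
    """Return the step with the given (hypothetical) name."""
    for step in workflow:
        if step["name"] == name:
            return step
    raise KeyError(name)


def move_vm(workflow: List[Dict], vm: str, source: str, target: str) -> None:
    """Move a VM from one VM boot step to another."""
    src, dst = find_step(workflow, source), find_step(workflow, target)
    if src["type"] != "vm_boot" or dst["type"] != "vm_boot":
        raise ValueError("VMs can only be moved between VM boot steps")
    src["vms"].remove(vm)
    dst["vms"].append(vm)


def move_step(workflow: List[Dict], name: str, new_index: int) -> None:
    """Change the execution order, similar in effect to drag and drop."""
    step = find_step(workflow, name)
    workflow.remove(step)
    workflow.insert(new_index, step)


def remove_step(workflow: List[Dict], name: str) -> None:
    """Remove a step; a VM boot step must have no VMs left in it."""
    step = find_step(workflow, name)
    if step["type"] == "vm_boot" and step["vms"]:
        raise ValueError("Move all virtual machines to another VM boot step first")
    workflow.remove(step)


workflow = [
    {"name": "boot-all", "type": "vm_boot", "vms": ["db-1", "web-1"]},
    {"name": "wait", "type": "time_delay", "minutes": 5},
    {"name": "boot-web", "type": "vm_boot", "vms": []},
]
move_vm(workflow, "web-1", "boot-all", "boot-web")
```

After the call to `move_vm`, removing `boot-all` would still raise an error because `db-1` remains in it, which matches the removal rule in the list above.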

Reference configuration

Consider an example. Your Web application comprises virtual machines such as database servers, application servers, and Web servers (front-end servers). The Web application requires the database servers to be online and running before its application servers and Web servers start up. You also want to add a post-boot script to one of the Web servers that updates the DNS records to point to the failover EC2 instance, and to continue the failover operation only if the script executes successfully.

Use the following steps to implement the recovery workflow for your Web application:

  1. Create a DR plan and add the following virtual machines to the DR plan:
    • Database servers: Database_Server_1 and Database_Server_2
    • Application servers: Application_Server_1 and Application_Server_2
    • Web servers: Web_Server_1 and Web_Server_2
    For more information, see Create a disaster recovery plan.

  2. Edit the disaster recovery workflow based on your Web application requirements.
    1. Define Step 1: VM Boot to boot your two database servers.
    2. Define Step 2: VM Boot to boot the two application servers.
    3. Define Step 3: VM Boot to boot the two Web servers. Add a post-boot script to the Web_Server_2 virtual machine to update the DNS record, and abort failover for this virtual machine if the script execution fails.
  3. Launch failover for the DR plan. For more information about how to launch a failover in the production or test setup, see Launch disaster recovery failover.
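The reference configuration above can be sketched as three ordered boot steps with a post-boot script on Web_Server_2. This Python example is a simplified simulation under stated assumptions, not Phoenix DRaaS code: `BootStep`, `run_failover`, the `abort_on_script_failure` flag, and the `update_dns` placeholder are hypothetical names, and the real product's abort handling may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BootStep:
    name: str
    vms: List[str]
    # Hypothetical: VM name -> post-boot script that returns True on success.
    post_boot: Dict[str, Callable[[], bool]] = field(default_factory=dict)
    # Hypothetical flag standing in for the abort setting.
    abort_on_script_failure: bool = False


def run_failover(steps: List[BootStep]) -> List[str]:
    """Walk the steps in order and return a log of what happened."""
    log: List[str] = []
    for step in steps:
        for vm in step.vms:
            log.append(f"{step.name}: booted {vm}")
            script = step.post_boot.get(vm)
            if script is None or script():
                continue
            log.append(f"{step.name}: post-boot script failed on {vm}")
            if step.abort_on_script_failure:
                log.append("failover aborted")
                return log
    log.append("failover completed")
    return log


def update_dns() -> bool:
    """Placeholder for a script that repoints DNS at the failover EC2 instance."""
    return True


steps = [
    BootStep("Step 1", ["Database_Server_1", "Database_Server_2"]),
    BootStep("Step 2", ["Application_Server_1", "Application_Server_2"]),
    BootStep("Step 3", ["Web_Server_1", "Web_Server_2"],
             post_boot={"Web_Server_2": update_dns},
             abort_on_script_failure=True),
]
log = run_failover(steps)
print("\n".join(log))
```

Because the steps run strictly in list order, the database servers are always booted before the application servers and Web servers, which is the ordering requirement of the example application.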

After you launch failover, you can track the status of failover in the Recovery Workflow tab of the Job Details page. The following screenshot depicts the orchestration of the virtual machines in your application.

Recovery workflow behavior 

When you trigger a failover for a DR plan, Phoenix DRaaS converts the virtual machines in all the VM boot steps defined in the recovery workflow in parallel, which reduces the recovery time objective (RTO). As a result, you may find some EC2 instances in the stopped state until their corresponding step starts executing.

The following screenshot depicts the parallel conversion of virtual machines in a DR recovery workflow.


The [EARLY] tag before each virtual machine name in the above screenshot indicates the parallel conversion of all the virtual machines configured in the DR plan.
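The behavior described in this section, parallel conversion followed by ordered boot steps, can be illustrated with a small Python simulation. This is a conceptual sketch only: the `convert` function is a stand-in for the real conversion, the state names `stopped` and `running` follow the EC2 instance states mentioned above, and nothing here reflects how Phoenix DRaaS is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def convert(vm: str) -> str:
    """Stand-in for converting a VM; the resulting instance exists but is not started."""
    return vm


def failover(boot_steps: List[List[str]]) -> List[Dict[str, str]]:
    """Return a snapshot of instance states after conversion and after each boot step."""
    all_vms = [vm for step in boot_steps for vm in step]
    state: Dict[str, str] = {}

    # Phase 1: convert every VM in the plan in parallel, regardless of its step.
    with ThreadPoolExecutor() as pool:
        for vm in pool.map(convert, all_vms):
            state[vm] = "stopped"
    snapshots = [dict(state)]

    # Phase 2: boot the VMs one step at a time, in workflow order.
    for step in boot_steps:
        for vm in step:
            state[vm] = "running"
        snapshots.append(dict(state))
    return snapshots


snapshots = failover([["Database_Server_1"], ["Web_Server_1"]])
for snapshot in snapshots:
    print(snapshot)
```

In the second snapshot the Web server is already converted but still stopped, which corresponds to the stopped EC2 instances you may observe before their step starts executing.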