The recovery workflow acts as a runbook that you can execute according to your application's requirements. It lets you define steps that logically group the virtual machines in a DR plan so that operations run in a defined order. When you create a DR plan, Phoenix DRaaS automatically creates a default recovery workflow for the plan and adds all the virtual machines in the DR plan to a single VM boot step. You can review the default recovery workflow in the Recovery Workflow section on the Recovery tab of the DR plan details page in the Phoenix Management Console. You can edit the default recovery workflow to suit your failover requirements by adding recovery steps, such as virtual machine (VM) boots and time delays. The following sections describe VM boots and time delays.
Steps in a recovery workflow
|Add Recovery Step|Adds a recovery step, such as a virtual machine (VM) boot or a time delay, to the default recovery workflow so that you can tailor the workflow to your failover requirements.|
|Time Delay step|Adds a time delay between the execution of two steps during the recovery operation.|
|VM Boot Step|Defines a boot order for a group of virtual machines in the DR plan and lets you add scripts in the recovery path to perform critical operations, such as changing the hostnames of the EC2 instances, modifying service configurations, and adding or removing network routes and gateways.|
Customizing the recovery workflow
The recovery workflow enables you to perform the following tasks:
- Add VM boot steps to the recovery workflow and specify their sequence of execution.
- Add on-boot scripts for execution after virtual machines boot up.
- Specify a timeout for the execution of the script.
- Enable abort settings for a step if you want to cancel the failover job when the failover of any virtual machine fails during that step.
- Add time delays between the execution of two steps.
- Move virtual machines from one boot step to another boot step, as required.
- Change the order of the execution of the steps. You can simply drag and drop a step to change the order of the execution.
- Remove a VM boot step from the recovery workflow. Before removing a VM boot step from the workflow, ensure that you first move all virtual machines from that step to another VM boot step.
- Remove a time delay step from the recovery workflow.
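The editing tasks listed above can be pictured as operations on an ordered list of steps. The following is a minimal sketch of that idea, not the Phoenix DRaaS implementation or API: the class names `VMBootStep`, `TimeDelayStep`, and `RecoveryWorkflow` and all of their methods are hypothetical and exist only for illustration. The sketch assumes that a VM boot step must be empty before it can be removed, as described in the list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class VMBootStep:
    """Hypothetical model of a VM boot step (not a Phoenix DRaaS type)."""
    name: str
    vms: List[str] = field(default_factory=list)
    abort_on_failure: bool = False


@dataclass(eq=False)
class TimeDelayStep:
    """Hypothetical model of a time delay step."""
    name: str
    minutes: int


class RecoveryWorkflow:
    """Illustrative container for an ordered list of recovery steps."""

    def __init__(self, vms):
        # Mirrors the default workflow: every VM sits in a single boot step.
        self.steps = [VMBootStep("VM Boot 1", list(vms))]

    def add_step(self, step, position=None):
        # Append by default, or insert at a given position in the order.
        if position is None:
            self.steps.append(step)
        else:
            self.steps.insert(position, step)

    def move_vm(self, vm, source, target):
        # Raises ValueError if the VM is not in the source step.
        source.vms.remove(vm)
        target.vms.append(vm)

    def reorder(self, step, new_position):
        # Equivalent of dragging a step to a new place in the sequence.
        self.steps.remove(step)
        self.steps.insert(new_position, step)

    def remove_step(self, step):
        # A boot step can be removed only after its VMs are moved elsewhere.
        if isinstance(step, VMBootStep) and step.vms:
            raise ValueError("move all virtual machines to another VM boot step first")
        self.steps.remove(step)
```

For example, `wf = RecoveryWorkflow(["db1", "app1"])` starts with one boot step holding both virtual machines; adding a second `VMBootStep` and calling `wf.move_vm("app1", wf.steps[0], second_step)` splits them into two steps that run in sequence.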
Consider the following example of a recovery workflow. Your Web application comprises database servers, application servers, and Web servers (front-end servers), each running on virtual machines. The Web application requires the database servers to be online and running before its application servers and Web servers start up. You also want to add a post-boot script to one of the Web servers that updates the DNS records to point to the failover EC2 instance, and you want the failover operation to continue only if the script executes successfully.
Use the following steps to implement the recovery workflow for your Web application:
- Create a DR plan and add the following virtual machines to the DR plan:
  - Database servers: Database_Server_1 and Database_Server_2
  - Application servers: Application_Server_1 and Application_Server_2
  - Web servers: Web_Server_1 and Web_Server_2
- Edit the recovery workflow based on your Web application requirements:
  - Define Step 1: VM Boot to boot the two database servers.
  - Define Step 2: VM Boot to boot the two application servers.
  - Define Step 3: VM Boot to boot the two Web servers. Add a post-boot script to the Web_Server_2 virtual machine to update the DNS record, and enable the abort setting so that the failover is canceled if the script execution fails.
- Launch failover for the DR plan. For more information about how to launch a failover in the production or test setup, see Launch disaster recovery failover.
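The ordering and abort behavior in this example can be sketched in a few lines. This is an illustration under stated assumptions, not how Phoenix DRaaS executes a failover: the function `run_failover`, the dictionary layout, the callbacks `boot_vm` and `run_script`, and the script name `update_dns.sh` are all hypothetical. For simplicity, the sketch boots the virtual machines within a step one after another.

```python
def run_failover(steps, boot_vm, run_script):
    """Run boot steps in order and return (status, log).

    boot_vm(vm) and run_script(vm, script) are caller-supplied callbacks;
    run_script returns True on success and False on failure.
    """
    log = []
    for step in steps:
        for vm in step["vms"]:
            boot_vm(vm)
            log.append(f"booted {vm}")
            script = step.get("scripts", {}).get(vm)
            if script is None:
                continue
            ok = run_script(vm, script)
            log.append(f"{script} on {vm}: {'ok' if ok else 'failed'}")
            # With the abort setting enabled, a failed script stops the job.
            if not ok and step.get("abort_on_failure"):
                return "aborted", log
    return "completed", log


# The three boot steps from the example; the script name is an assumption.
WORKFLOW = [
    {"name": "Step 1: VM Boot",
     "vms": ["Database_Server_1", "Database_Server_2"]},
    {"name": "Step 2: VM Boot",
     "vms": ["Application_Server_1", "Application_Server_2"]},
    {"name": "Step 3: VM Boot",
     "vms": ["Web_Server_1", "Web_Server_2"],
     "scripts": {"Web_Server_2": "update_dns.sh"},
     "abort_on_failure": True},
]
```

Calling `run_failover(WORKFLOW, boot_vm, run_script)` with a `run_script` callback that returns `False` yields the status `"aborted"`, while a callback that returns `True` lets the run finish with `"completed"`. In both cases the database servers boot before the application servers, which boot before the Web servers.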
After you launch failover, you can track the status of failover in the Recovery Workflow tab of the Job Details page. The following screenshot depicts the orchestration of the virtual machines in your application.
Recovery workflow behavior
During failover, Phoenix DRaaS converts all the virtual machines configured in a DR plan in parallel. When you trigger a failover for your DR plan, the virtual machines in every VM boot step of the recovery workflow are converted at the same time, which helps reduce the recovery time objective (RTO). As a result, you may find some EC2 instances in the stopped state until their corresponding steps start executing.
The following screenshot depicts the parallel conversion of virtual machines in a DR recovery workflow.
The [EARLY] tag before each virtual machine name in the screenshot above indicates the parallel conversion of all the virtual machines configured in the DR plan.
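The behavior described in this section, where conversion happens for all virtual machines at once while booting still follows the step order, can be sketched as follows. This is a toy model under stated assumptions, not the Phoenix DRaaS implementation: `convert` and `failover` are hypothetical functions, and the conversion itself is reduced to a placeholder that returns an instance in the stopped state.

```python
from concurrent.futures import ThreadPoolExecutor


def convert(vm):
    # Placeholder for the real conversion work; the resulting
    # instance stays stopped until its boot step runs.
    return {"vm": vm, "state": "stopped"}


def failover(boot_steps):
    """Convert every VM in parallel, then start instances step by step.

    boot_steps is a list of lists of VM names, in execution order.
    Returns a snapshot of instance states taken after each step.
    """
    all_vms = [vm for step in boot_steps for vm in step]

    # Phase 1: conversion runs for all VMs at once, whatever their step.
    with ThreadPoolExecutor() as pool:
        instances = {inst["vm"]: inst for inst in pool.map(convert, all_vms)}

    # Phase 2: instances are started strictly in step order.
    timeline = []
    for step in boot_steps:
        for vm in step:
            instances[vm]["state"] = "running"
        timeline.append({vm: inst["state"] for vm, inst in instances.items()})
    return timeline
```

With `failover([["db1"], ["web1"]])`, the first snapshot shows `db1` running and `web1` still stopped, which matches the stopped EC2 instances you may see while earlier steps are executing.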