Troubleshooting Instant Restore issues



This topic describes common workarounds for the issues that you might encounter while performing the following tasks:
- Restoring virtual machines instantly
- Migrating instantly restored VMs to production
- Deleting instantly restored VMs
- Other issues
Commands for debugging instant restore jobs
You can use the following commands to debug issues related to instant restore jobs:
Command | Description |
---|---|
ps -ef \| grep PhoenixIRAgent | Use to check if the IRAgent process is spawned on the Backup proxy. |
ps -ef \| grep PhoenixIRService | Use to check if the IRService is running on Phoenix CloudCache. |
ps -ef \| grep PhoenixIRFS | Use to check if the IRFS process is spawned on Phoenix CloudCache. |
/mnt/instantrestore | Location of the NFS share mount on the Backup proxy. |
sqlite3 CCBmap.db | Use to view the SQLite database CCBmap.db, located on CloudCache at /mnt/data/instantrestore/{internal_job id}/bmap. |
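The process checks in the table can be wrapped in a small helper. The following is a minimal sketch and not part of the product: report_process is a hypothetical function name, and it only filters ps -ef output passed on standard input, dropping the line that belongs to the grep command itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Phoenix): reads `ps -ef` output on stdin
# and reports whether a process with the given name appears in it.
report_process() {
    name="$1"
    # Drop lines for grep itself so the search never matches its own command.
    if grep -v 'grep' | grep -qF -- "$name"; then
        echo "$name: running"
    else
        echo "$name: not found"
    fi
}

# Usage on the Backup proxy or CloudCache (uncomment to run):
# ps -ef | report_process PhoenixIRAgent
# ps -ef | report_process PhoenixIRService
# ps -ef | report_process PhoenixIRFS
```

Filtering out the grep line avoids a false positive, because a plain ps -ef piped to grep always lists the grep command that is doing the search.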
Common issues
The following are some of the common issues that you might face while instantly restoring VMs, or while migrating or deleting instantly restored VMs:
Issue | Resolution |
---|---|
Migration of an instantly restored VM fails if you manually migrate the instantly restored VM to another datastore. | On the Instant Restored VMs page, select the VM for which migration failed and perform the manual cleanup steps. For more information, see Steps for cleaning the datastore manually. |
Migration of an instantly restored VM fails if another datastore is attached to the instantly restored VM apart from the instantly restored datastore. | Do either of the following:
|
Instant restore or migration to production fails when operating system buffers consume a large portion of RAM, which prevents the NFS servers from starting. The following traceback is shown on the terminal:
|
Run the following command to verify and fix the issue: root@cloudcache:~# free -m
|
The instant restore or migration job fails if the PhoenixIRFS (FUSE) process cannot start, and the log file shows entries similar to the following:
|
|
Instant restore fails while exporting the NFS share |
|
Delete custom command fails if the instantly restored datastore is not attached to the instantly restored VM. This indicates that the VM is already migrated manually or by the migration job. | In case of manual migration, the instantly restored datastore is detached but not deleted. You must manually clean up the datastore. For more information, see Steps for cleaning the datastore manually. |
If the CloudCache service is restarted during instant restore or migration, the ongoing job might fail. For running instantly restored VMs that are not yet migrated or deleted, stopping or restarting the Phoenix Cache Server service kills all IRFS processes running on CloudCache, which leaves the instantly restored VM and datastore in an inaccessible state. |
|
After installing the latest rpm manually, you get an error while migrating credentials during a client upgrade or Phoenix service restart. | Set the credentials again using the vCenterDetail set command. |
To validate communication between the Backup proxy, IRAgent, and the CloudCache (CC) IRService, a token is generated in cacheconfig and passed in each request triggered by IRAgent. IRService decrypts this token and validates the request based on the cache ID and time within the token. The request might fail if token decryption fails or the token has expired. | Retrigger the instant restore or migration job. |
Unable to mount the datastore because the maximum NFS datastore limit is reached. The following error is shown: Failed to mount to server 10.x.x.x mount point /mnt/nfs-share/subdir/subdir. NFS has reached the maximum number of supported volumes. |
This is due to a VMware/ESXi configuration limit. For more details, see the following VMware documentation: Maximum supported volumes reached; Increasing the default value that defines the maximum number of NFS mounts on an ESXi/ESX host. |
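For the buffer-related issue in the table above, the free -m output can be reduced to a single number. This is a minimal sketch, assuming the newer procps layout in which the sixth field of the Mem: row is buff/cache (older versions print separate buffers and cached columns). The name buffcache_percent is a hypothetical helper, and no threshold is implied.

```shell
#!/bin/sh
# Hypothetical helper: reads `free -m` output on stdin and prints the share of
# total RAM held by buffers/cache as a whole-number percentage.
# Assumes the layout: Mem: total used free shared buff/cache available
buffcache_percent() {
    awk '/^Mem:/ && $2 > 0 { printf "%d\n", ($6 * 100) / $2 }'
}

# Usage on CloudCache (uncomment to run):
# free -m | buffcache_percent
```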
Set custom export path on CloudCache
Perform the following steps to set the custom export path on Phoenix CloudCache:
- Open the /etc/PhoenixCloudCache/PhoenixCloudCache.cfg file.
- Set your desired path against the variable IR_CUSTOM_EXPORT_PATH.
- Save the PhoenixCloudCache.cfg file and trigger the instant restore job.
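Before triggering the job, it can help to confirm what the configuration file currently contains. The following is a minimal read-only sketch. The name show_export_path is a hypothetical helper, and because the syntax of PhoenixCloudCache.cfg is not described here, the helper only prints matching lines and does not edit the file.

```shell
#!/bin/sh
# Hypothetical helper: prints every line of a config file that mentions
# IR_CUSTOM_EXPORT_PATH (with line numbers), or a notice when it is absent.
# It only reads the file and makes no assumption about the file's syntax.
show_export_path() {
    cfg="$1"
    if ! grep -n 'IR_CUSTOM_EXPORT_PATH' "$cfg"; then
        echo "IR_CUSTOM_EXPORT_PATH not found in $cfg"
    fi
}

# Usage on CloudCache (uncomment to run):
# show_export_path /etc/PhoenixCloudCache/PhoenixCloudCache.cfg
```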
Steps for cleaning the datastore manually
This topic covers the actions that you need to perform while cleaning the datastore manually.
Migration is successful with some errors
The following issues may occur after migrating an instantly restored VM to production:
Issue | Resolution |
---|---|
Deletion of datastore fails |
|
Clean up of the CloudCache fails | Perform CloudCache clean up manually. For more information, see Steps to cleanup the CloudCache machine. |
Deletion of instantly restored VM fails
The deletion of an instantly restored VM can fail in the following scenarios:
- The instantly restored VM is already migrated to a different datastore.
- The customer has attached a disk from a different datastore.
Resolution
- Delete the instantly restored VM from the vCenter if it is not migrated to the production environment.
- Delete the respective datastore from the datastore listing page.
- Migrate the VM if it is present on the instantly restored datastore.
Steps to cleanup the CloudCache machine
After the instantly restored VM is deleted or migrated, perform the following steps to remove the respective entries from the CloudCache machine:
Step | Action |
---|---|
Remove specific export path from the /etc/exports file. | If the instantly restored datastore exists in the vCenter, check the path in datastore summary page: |
Unmount the export path by killing the IRFS process. |
|
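The first cleanup step, removing one export path from /etc/exports, can be sketched as follows. This is a minimal sketch under stated assumptions: remove_export is a hypothetical helper, an entry is matched when its first whitespace-separated field equals the export path, and a backup copy is kept. The exportfs -ra command is the standard NFS utility for re-reading the exports file; whether it is required on CloudCache is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: removes the entry for one export path from an
# exports-style file, keeping a .bak copy of the original.
# An entry matches when its first whitespace-separated field equals the path.
remove_export() {
    exports_file="$1"
    export_path="$2"
    cp "$exports_file" "$exports_file.bak" || return 1
    awk -v p="$export_path" '$1 != p' "$exports_file.bak" > "$exports_file"
}

# Usage on CloudCache as root (uncomment to run; the path is a placeholder):
# remove_export /etc/exports "<export path from the datastore summary page>"
# exportfs -ra   # standard NFS utility: re-reads /etc/exports (assumed step)
```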
Steps to recover a virtual machine
- Spawn the IRFS process manually.
Search the IRService log file for the FUSE command and its command-line parameters.
export ROOT_DIR="<path from log>"
export BMAP_DIR="<path from log>"
Execute the command from the log directly.
Sample command:
nohup /usr/bin/PhoenixIRFS -f -o allow_other -o ccstoremapstr='{\"1\":\"/mnt/data/PhoenixCacheStore\"}' -o cachekey='MSK3AgSvLg91AQ/SErTniPuyv/+B4AzzOuLPNfktXn9mqIWmcYEZDf/u1VRjfv597x1uPvBGFjl3W7ELsRInBWuDz1eeFtLNu4wj96BJFhLYQisY+cpRMoffXtx9m+rHJ4PTC6BSq4dW+ygOka+cX4eC297IWxBtuM1kU47lKjLiHvAwv4EWSo7BVTT/ek8r' -o cacheid=1 -o storageid=3 -o csetid=1 -o jobid=5 -o cache_disabled=0 -o fips_enabled=0 -o auto_unmount /mnt/data/instantrestore/5/mnt &> /var/log/PhoenixCloudCache/irfs/1/libphoenixfs-5.log &
- Perform the following steps depending on the VM power state:
  - OFF: No action to be taken.
  - ON:
    - Power OFF the VM from the vSphere console.
    - If the VM goes into an orphaned or invalid state, remove the VM and add it back to the inventory by performing the following steps:
      - Check the datastore and VM folder that contain the vmx file of the instantly restored VM.
      - Remove this VM by clicking Actions > Remove from Inventory.
      - From the datastore view, go to the respective VM folder.
      - Select the vmx file and click Register VM.
    - Power ON the VM.
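Locating the FUSE command in the log, as described in the first recovery step, can be sketched with a short helper. This is a minimal sketch: last_irfs_command is a hypothetical name, and because the IRService log location and line format are not given here, the caller supplies the log path and reviews the printed line before running anything from it.

```shell
#!/bin/sh
# Hypothetical helper: prints the most recent line in a log file that
# mentions the PhoenixIRFS binary. The log location and line format are not
# specified here, so the caller supplies the path and inspects the result.
last_irfs_command() {
    grep -- '/usr/bin/PhoenixIRFS' "$1" | tail -n 1
}

# Usage (uncomment to run; the log path is a placeholder):
# last_irfs_command "<path to the IRService log file>"
```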