Druva Documentation

Storage Compaction failed alert for inSync On-Premises

 

Problem description

This article explains how to resolve a Storage Compaction Failed alert that an inSync administrator keeps receiving for a deleted user.

Cause

When an inSync administrator deletes a user from the inSync Management Console in an inSync On-Premises environment, compaction for that user can keep running and keep failing because the user’s entries are still present in the database. As a result, the inSync administrator receives a daily alert for that user saying: ‘Storage Compaction Failed’.

The alert email would look like this:

[Image: sample ‘Storage Compaction Failed’ alert email]

Traceback

{code}
[2020-04-15 00:12:37,891] [INFO] Starting compaction of device=10 user=12 session=49254570 in worker
[2020-04-15 00:12:37,905] [ERROR] Exception in compaction of session=49254570
[2020-04-15 00:12:37,905] [INFO] Picking up the StorageWorker_1:2732 from DEFAULT pool for DeletedDeviceCompaction taskid=49254571.
[2020-04-15 00:12:37,905] [INFO] Compacting device=6206 user=12 task=49254571 on worker=2732
[2020-04-15 00:12:37,921] [INFO] Starting compaction of device=6206 user=12 session=49254571 in worker
[2020-04-15 00:12:37,937] [ERROR] Exception in compaction of session=49254571
[2020-04-15 00:12:37,937] [INFO] Picking up the StorageWorker_1:6328 from DEFAULT pool for DeletedDeviceCompaction taskid=49254572.
[2020-04-15 00:12:37,953] [INFO] Compacting device=135618 user=12 task=49254572 on worker=6328
[2020-04-15 00:12:37,953] [INFO] Starting compaction of device=135618 user=12 session=49254572 in worker
[2020-04-15 00:12:37,969] [ERROR] Error <type 'exceptions.Exception'>:Device did=10 uid=12 is live. Traceback -Traceback (most recent call last):
File "inSyncWorker.py", line 1018, in __device_compact
File "drst\afs\acompact.pyc", line 434, in compact_cset
Exception: Device did=10 uid=12 is live
[2020-04-15 00:12:37,969] [INFO] Finished compaction of session=49254570
[2020-04-15 00:12:37,984] [ERROR] Exception in compaction of session=49254572
[2020-04-15 00:12:38,000] [INFO] Picking up the StorageWorker_1:5928 from DEFAULT pool for DeletedDeviceCompaction taskid=49254573.
[2020-04-15 00:12:38,000] [INFO] Deleting session=49254570, soid=10
[2020-04-15 00:12:38,000] [ERROR] Error <type 'exceptions.Exception'>:Device did=6206 uid=12 is live. Traceback -Traceback (most recent call last):
File "inSyncWorker.py", line 1018, in __device_compact
File "drst\afs\acompact.pyc", line 434, in compact_cset
Exception: Device did=6206 uid=12 is live
{code}

Note:

  1. The above log appears in inSyncStorageServer.log only if the storage is on a dedicated storage node. If it is a local storage node, check inSyncCloudServer.log and inSyncSyncServer.log instead.
  2. Search the log for the time frame mentioned in the alert.
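
The log search described in the notes can be sketched in Python. This is a minimal illustration, not a Druva tool: the function name `find_live_device_errors` is hypothetical, and the pattern assumes the log lines look exactly like the sample traceback above.

```python
import re
from datetime import datetime

# Assumes the "[timestamp] [ERROR] ... Device did=N uid=N is live" format
# shown in the sample traceback; other log formats will not match.
LIVE_ERROR = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\] \[ERROR\] "
    r".*Device did=(?P<did>\d+) uid=(?P<uid>\d+) is live"
)


def find_live_device_errors(lines, start, end):
    """Return sorted, unique (did, uid) pairs logged as 'is live' between start and end."""
    hits = set()
    for line in lines:
        match = LIVE_ERROR.match(line)
        if match is None:
            continue  # not a timestamped "is live" error line
        logged_at = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        if start <= logged_at <= end:
            hits.add((int(match.group("did")), int(match.group("uid"))))
    return sorted(hits)
```

To use it, open the relevant log file (for example inSyncStorageServer.log on a dedicated storage node) and pass the file object as `lines`, with `start` and `end` set around the time frame from the alert email.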

Resolution

You can perform these steps even while the inSync services are online (both the inSync Master server and the storage node).

Step 1: Complete the following steps on inSync Master

  1. Open the command prompt with administrator rights, and make sure that the Druva service is up and running on the inSync Master server.
  2. Go to C:\Program Files\Druva\inSync Server in the command prompt.
  3. Run this command: inSyncAFSDB.exe --name "<Storage name>" --dumpcfg

    Example: inSyncAFSDB.exe --name "StorageonNode" --dumpcfg
  4. A .cfg file named storage-<SID>.cfg, where <SID> is the storage ID, is created in C:\Program Files\Druva\inSync Server.

    Example: If the SID of the concerned storage is 7, the file name is storage-7.cfg. If the SID is 61, the file name is storage-61.cfg.

  5. Copy storage-<SID>.cfg from the inSync Master server to the dedicated storage node that hosts the impacted storage.
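
If you script Step 1, the command and the expected file name can be built as shown in this minimal Python sketch. The helper names `dumpcfg_args` and `cfg_file_name` are hypothetical; only the executable name, the `--name` and `--dumpcfg` flags, and the storage-<SID>.cfg naming come from the steps above. The sketch only builds the argument list and does not run anything.

```python
def dumpcfg_args(storage_name):
    """Argument list for the Step 1 command; flags are taken from the article."""
    return ["inSyncAFSDB.exe", "--name", storage_name, "--dumpcfg"]


def cfg_file_name(sid):
    """File name the article says --dumpcfg creates for storage ID `sid`."""
    return "storage-{}.cfg".format(int(sid))


if __name__ == "__main__":
    print(dumpcfg_args("StorageonNode"))
    print(cfg_file_name(7))
```

Passing the storage name as its own list element means a name containing spaces needs no manual quoting if the list is later handed to `subprocess.run` from the inSync Server directory.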

Step 2: Complete the following steps on inSync Storage Node

  1. Stop the Druva inSync Storage node service from services.msc. Then use Task Manager to confirm that the related inSync processes are no longer running. If they are still running after you have waited for some time, end them.
  2. Open the command prompt with administrator rights.
  3. Go to C:\Program Files\Druva\inSyncStorageNode in the command prompt.
  4. Run this command: inSyncAFSDB.exe --remote --storeid=<SID of Storage> --loadcfg=<path to the storage-<SID>.cfg file copied from the Master server in Step 1> --startbynamo

    Example: inSyncAFSDB.exe --remote --storeid=7 --loadcfg="C:\Users\<username>\Desktop\storage-7.cfg"
  5. Hit Enter.
  6. Once the command prompt shows the “>>>” prompt, get the script (did_is_live.txt) from the following location.
    Example: \\172.16.53.46\Store1\Storage Script for uid is live
  7. Copy the content of the did_is_live.txt file and paste it at the “>>>” prompt to avoid any indentation issues.

    Note: As an alternative to steps 6 and 7, copy the script file (did_is_live.txt) to the concerned storage node and pass it to the command in step 4 with the “--script” parameter.

    Example: inSyncAFSDB.exe --remote --storeid=7 --loadcfg=<path to storage-7.cfg> --script <path to did_is_live.txt>

  8. Hit Enter.

  9. Run the following commands one at a time:
    >>> fs.flush()
    >>> os._exit(0)

  10. Start the Druva inSync Storage node service on the storage node.
    Make sure that the storage is in a healthy state and keep monitoring it. The alert should no longer be generated.
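
Copying the wrong .cfg file to the storage node is an easy mistake in Step 2. The following Python sketch is a pre-flight check under stated assumptions: the helper `loadcfg_args` is hypothetical, it assumes the file keeps the storage-<SID>.cfg name from Step 1, and it uses only the flags that appear in the article's examples (`--remote`, `--storeid`, `--loadcfg`, `--script`). The `--startbynamo` flag from the step 4 template is left out because the article's examples omit it. The sketch builds the argument list and does not run the command.

```python
import re
from pathlib import PureWindowsPath

# Assumes the config file keeps the storage-<SID>.cfg name from Step 1.
CFG_NAME = re.compile(r"^storage-(\d+)\.cfg$", re.IGNORECASE)


def loadcfg_args(sid, cfg_path, script_path=None):
    """Build the Step 2 command after checking that the cfg file matches the SID."""
    name = PureWindowsPath(cfg_path).name
    match = CFG_NAME.match(name)
    if match is None or int(match.group(1)) != sid:
        raise ValueError(
            "{!r} does not look like the config file for storage ID {}".format(name, sid)
        )
    args = [
        "inSyncAFSDB.exe",
        "--remote",
        "--storeid={}".format(sid),
        "--loadcfg={}".format(cfg_path),
    ]
    if script_path is not None:
        # Alternative from the note: pass did_is_live.txt instead of pasting it.
        args += ["--script", script_path]
    return args
```

For example, `loadcfg_args(7, r"C:\Temp\storage-7.cfg")` returns the argument list, while `loadcfg_args(61, r"C:\Temp\storage-7.cfg")` raises `ValueError` because the file name and the storage ID disagree. The `C:\Temp` path is a placeholder.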

Verification

Once the above steps complete successfully, the storage should come up healthy, and the inSync administrator should stop receiving the ‘Storage Compaction Failed’ alert with the “uid=XXXX is live” error.