Druva Documentation

Scanner CLI utility

Use the Scanner CLI utility

The Scanner CLI utility analyzes file system objects and gives you insight into the file and directory structure. The utility is bundled with the Phoenix agent, so installing the latest version of the Phoenix agent gives you access to it. The utility reports information such as the number of folders and files present, the directory and file levels, and the data change rate. When you run the Scanner CLI utility for the first time, it performs a full scan. Each subsequent scan is either incremental or full, depending on the parameters specified in the configuration file. You will notice a significant improvement in the incremental scans performed after the first full scan.
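The choice between a full and an incremental scan can be sketched as a small decision function. This is an illustrative sketch only, not the utility's implementation: the function name `choose_scan_mode` and the `has_previous_state` flag are hypothetical, and the sketch assumes that `force_scan` (described in the configuration parameters) is the setting that requests a full scan.

```python
def choose_scan_mode(has_previous_state: bool, force_scan: bool) -> str:
    """Return "full" or "incremental" for a scan run.

    Hypothetical helper: it models only the behavior described in the text,
    not the Scanner CLI utility's actual internals.
    """
    # The first run has no saved state to compare against, so it is a full scan.
    if not has_previous_state:
        return "full"
    # force_scan: true requests a full scan even when earlier state exists.
    if force_scan:
        return "full"
    return "incremental"


if __name__ == "__main__":
    print(choose_scan_mode(has_previous_state=False, force_scan=False))
    print(choose_scan_mode(has_previous_state=True, force_scan=False))
    print(choose_scan_mode(has_previous_state=True, force_scan=True))
```

With `force_scan` left at the recommended value of 'false', only the first run falls into the "full" branch in this model.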

Procedure

  1. Create a configuration file by copying the following snippet to a text file and saving the file in the YAML format.
    root_paths: [<root path1>, <root path2>]
    fset_dir: <Specify fset directory path>
    scan_worker_count: 50
    sqlite_n_conns: 8
    results_threshold: 10000
    results_file: <Specify the location where the result file will be saved>
    ignore_usn: false
    smart_scan: false
    force_scan: false
    ss_age_threshold: 0
    skip_acl: true
    log_file: <Specify the location where the log file will be saved>
    db_file_path: <Specify the database file path>
    filters: 
        exclude_folders: [<folder name>, <folder name>]
        exclude_extensions: ""
        include_extensions: ""
    
    Parameter          Description
    root_paths         Specify the absolute (full) paths of the directories that you want to scan.
    fset_dir           Specify the fset directory. On Windows, this is the drive letter ('<Drive letter:\>'); on Linux, it is '/'.
    scan_worker_count  The number of threads to use for scanning. The default and recommended value is 50.
    sqlite_n_conns     The number of connections to establish with SQLite. The recommended value is 8.
    results_threshold  Set this value to 10000 (recommended).
    results_file       Specify the location of the results file, which contains information about the changed data. A timestamp is appended to the results file name after each Scanner CLI utility run.
    ignore_usn         Set to 'false'.
    smart_scan         Set to 'true' to enable smart scan, which optimizes the scanning duration for backup. To know more about smart scan, refer to Enable Smart Scan.
    force_scan         Set to 'true' to force a full scan instead of an incremental scan. The recommended value is 'false'.
    ss_age_threshold   Age threshold (in days) for a directory to be eligible for smart scan. To know more about smart scan, refer to Enable Smart Scan.
    skip_acl           Set to 'true' to skip detecting Access Control List (ACL) changes.
    log_file           Specify the location where you want to save the scanner log files.
    db_file_path       Specify the location of the file that stores the persistent state of the scanner.

    exclude_folders

    List of folders to be excluded from the scan. For example,

    exclude_folders: [dev, /proc, /etc, Phoenix]

    exclude_extensions

    List of file extensions to be excluded from the scan. The extensions must be separated by a semicolon. For example,

    exclude_extensions: "*.log;*.bat"

    include_extensions

    List of file extensions to be included in the scan. The extensions must be separated by a semicolon. For example, 

    include_extensions: "*.doc;*.pdf;*.bat"

    Note: If a file extension is added to both the include list and the exclude list, the extension is excluded, because exclusion takes precedence over inclusion.

  2. Download and install the latest version of the Phoenix agent from the Downloads page.
  3. On Linux, increase the file descriptor (FD) limit by using the following command:
    ulimit -n 65000
  4. Run the following command:
    scanner-cli.exe <Configuration file path>
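The note in step 1 about the include and exclude lists can be modeled with a short sketch. It is not the scanner's own matching code: the helper names `split_patterns` and `is_file_scanned` are hypothetical, and the sketch assumes shell-style wildcard matching, case-sensitive comparison, and that an empty include list means "include everything".

```python
from fnmatch import fnmatchcase


def split_patterns(value: str) -> list[str]:
    """Split a semicolon-separated pattern string such as "*.log;*.bat"."""
    return [part.strip() for part in value.split(";") if part.strip()]


def is_file_scanned(name: str, include_extensions: str, exclude_extensions: str) -> bool:
    """Apply the documented precedence rule: exclusion beats inclusion.

    Assumptions (not confirmed by the documentation): patterns are
    shell-style wildcards, matching is case-sensitive, and an empty
    include list places no restriction on which files are scanned.
    """
    includes = split_patterns(include_extensions)
    excludes = split_patterns(exclude_extensions)
    # Check the exclude list first so that it always wins over the include list.
    if any(fnmatchcase(name, pattern) for pattern in excludes):
        return False
    if not includes:
        return True
    return any(fnmatchcase(name, pattern) for pattern in includes)


if __name__ == "__main__":
    include = "*.doc;*.pdf;*.bat"
    exclude = "*.log;*.bat"
    for file_name in ("report.pdf", "run.bat", "notes.txt"):
        print(file_name, is_file_scanned(file_name, include, exclude))
```

In this model, `run.bat` matches both lists and is therefore excluded, which mirrors the precedence rule stated in the note.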

Review scan result

Once the scan is complete, 

  • A results file is generated at the location specified in the configuration file. The results file contains the following information about the changed data:

    [Image: Changed_data_info.png]

    ChangeType - Indicates the type of change, such as file added, file modified, or file deleted.
    ItemType - Indicates the type of item: 'F' indicates a file, 'D' indicates a directory, and 'L' indicates a link.
    Mode - Indicates the Standard OS File Mode (uint32).
    MTime - Indicates the modification time of the file or the folder.
    Size - Indicates the size of the file in bytes.
    Path - Indicates the full path of the file.

  • A log file is generated at the location specified in the configuration file. The log file contains the following telemetry information:

    [Image: ScannerCLI_New_Output.png]
     
    Parameter                Description
    FileIneligible           Shows the count of files that were excluded from the scan.
    FolderIneligible         Shows the count of folders that were excluded from the scan.
    FileDistribution         Shows the size distribution of files in a backup set. For example, [0-1KB : 1] indicates that the scan encountered exactly one file with a size between 0 and 1 KB.
    FileDistributionChanged  Shows the size distribution of files modified since the previous run.
    FolderAdded              Shows the count of folders that were added since the previous run.
    FolderDeleted            Shows the count of folders that were deleted since the previous run.
    FolderMissed             Shows the count of folders that could not be scanned for any reason.
    MaxDepth                 Shows the maximum depth of the directory tree.
    AvgDepth                 Shows the average depth of the directory tree.
    ReparsePointFiltered     Shows the count of unsupported reparse points.
    FileDeleted              Shows the count of files that were deleted since the previous run.
    FolderScanned            Shows the count of folders encountered during the scan.
    FolderSmartScanned       Shows the count of directories smart scanned during backup.
    ChangedData              Shows the size (in bytes) of data changed since the previous run.
    MaxWidth                 Shows the maximum count of files and folders in a single directory among all the directories.
    AvgWidth                 Shows the average count of files and folders in a single directory among all the directories.
    FileModified             Shows the count of files that were modified since the previous run.
    ScanDuration             Shows the total scan duration (in seconds).
    FileExtension            Shows the distribution and count of files based on the extensions available in the backup set.
    FileExtensionChanged     Shows the distribution of files, based on extension, that were added or modified since the previous run.
    FileAdded                Shows the count of new files that were added since the previous run.
    FileMissed               Shows the count of files that could not be scanned for any reason.
    FolderModified           Shows the count of folders that were modified since the previous run.
    FileScanned              Shows the count of files encountered during the scan.
    ScanRate                 Shows the rate (in files per second) at which the files were scanned.
    ACLChangeOnly            Shows the count of files for which only the ACL or permissions have changed since the previous run.
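A few of the telemetry counters above can be combined into summary ratios, as in the following sketch. It makes several assumptions: the helper `summarize_scan` is hypothetical, the counter values are invented sample numbers entered by hand (the sketch does not parse the log file, whose exact layout is not described here), `total_bytes` is a figure you supply yourself, and the files-per-second calculation is only an assumed relationship between FileScanned and ScanDuration, not a confirmed definition of ScanRate.

```python
def summarize_scan(telemetry: dict, total_bytes: int) -> dict:
    """Derive summary ratios from hand-entered telemetry counters.

    Hypothetical helper: key names follow the telemetry table, but the
    formulas are assumptions made for illustration.
    """
    duration = telemetry["ScanDuration"]        # seconds
    files_scanned = telemetry["FileScanned"]    # files seen in this run
    changed_files = telemetry["FileAdded"] + telemetry["FileModified"]
    return {
        # Assumed relationship: files scanned divided by scan duration.
        "files_per_second": files_scanned / duration if duration else 0.0,
        # Share of scanned files that are new or modified.
        "changed_file_ratio": changed_files / files_scanned if files_scanned else 0.0,
        # Share of the caller-supplied data size that changed.
        "changed_data_ratio": telemetry["ChangedData"] / total_bytes if total_bytes else 0.0,
    }


if __name__ == "__main__":
    # Invented sample values, not output from a real scan.
    sample = {
        "ScanDuration": 20,
        "FileScanned": 1000,
        "FileAdded": 30,
        "FileModified": 20,
        "ChangedData": 5_000_000,
    }
    print(summarize_scan(sample, total_bytes=100_000_000))
```

Comparing these ratios across runs is one way to track how quickly a data set changes between scans.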