Asset Migration Script

This script migrates assets (documents/files) from old storage providers to new storage providers based on topic category configuration.

Purpose

When you update a topic category's document field configuration to use a new storage provider (e.g., migrating from GridFS to SharePoint), existing assets in topics of that category will still be stored in the old provider. This script migrates those existing assets to the new storage provider.

Usage

Dry Run

Before performing the actual migration, run the dry-run script to see what would be migrated:

./bin/cla repl -c <config> --script repl-scripts/migrate_assets_dryrun.pl <category_name>

Actual Migration

After reviewing the dry-run output, perform the actual migration:

./bin/cla repl -c <config> --script repl-scripts/migrate_assets.pl <category_name> [limit]

Parameters

  • <category_name>: The name of the topic category whose assets you want to migrate
  • [limit]: Optional. Maximum number of topics to process. Use this to test with a small number of topics first.

Example

# First, check what would be migrated
./bin/cla repl -c clarive --script repl-scripts/migrate_assets_dryrun.pl "Change Request"

# Test with just 1 topic first
./bin/cla repl -c clarive --script repl-scripts/migrate_assets.pl "Change Request" 1

# If successful, migrate all topics
./bin/cla repl -c clarive --script repl-scripts/migrate_assets.pl "Change Request"
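
The staged sequence above (dry run, one-topic trial, full run) can be generated rather than typed by hand. The following is a minimal sketch that only builds the command lines shown in this section; the helper names `build_command` and `staged_plan` are hypothetical and not part of Clarive.

```python
import shlex

CLA = "./bin/cla"  # path used in the examples above


def build_command(script, category, config="clarive", limit=None):
    """Return the argument list for one `cla repl --script` invocation."""
    argv = [CLA, "repl", "-c", config, "--script",
            f"repl-scripts/{script}", category]
    if limit is not None:
        argv.append(str(limit))  # optional [limit] parameter
    return argv


def staged_plan(category, config="clarive"):
    """Dry run first, then a one-topic trial, then the full migration."""
    return [
        build_command("migrate_assets_dryrun.pl", category, config),
        build_command("migrate_assets.pl", category, config, limit=1),
        build_command("migrate_assets.pl", category, config),
    ]


if __name__ == "__main__":
    # Print shell-quoted commands for review; nothing is executed here.
    for argv in staged_plan("Change Request"):
        print(shlex.join(argv))
```

Printing the commands instead of running them keeps a manual review step between stages, which matches the "test with 1 topic first" advice.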

What the Scripts Do

Dry Run Script (migrate_assets_dryrun.pl)

  1. Analyzes Without Changes: Examines all assets without modifying anything
  2. Shows Migration Plan: Lists which assets would be migrated
  3. Identifies Issues: Highlights assets with no target provider configured
  4. Safe to Run: Can be run multiple times with no side effects
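
The read-only classification a dry run performs can be sketched as follows. This is an illustration under assumptions, not the script's actual code: each asset is reduced to a hypothetical `(name, current_mid, target_mid)` tuple, and the status strings are chosen for the example.

```python
from collections import Counter


def classify(current_mid, target_mid):
    """Decide what a migration would do with one asset, without touching it."""
    if target_mid is None:
        return "no target provider"   # field has no storage provider configured
    if current_mid == target_mid:
        return "already migrated"
    return "needs migration"


def dry_run(rows):
    """rows: iterable of (name, current_mid, target_mid) tuples.

    Returns a per-asset report and a count per status. The input is only
    read, so repeated calls give the same result.
    """
    report = [(name, classify(cur, tgt)) for name, cur, tgt in rows]
    counts = Counter(status for _, status in report)
    return report, counts


if __name__ == "__main__":
    rows = [
        ("design_doc.pdf", None, "mssharepoint-site-1"),
        ("screenshot.png", "mssharepoint-site-1", "mssharepoint-site-1"),
        ("notes.txt", None, None),
    ]
    report, counts = dry_run(rows)
    for name, status in report:
        print(f"{name}: {status}")
    print(dict(counts))
```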

Migration Script (migrate_assets.pl)

  1. Finds Topics: Locates all topics in the specified category
  2. Finds Assets: For each topic, finds all attached documents/assets
  3. Checks Configuration: Determines the target storage provider from the category's document field configuration
  4. Skips if Already Migrated: If an asset is already in the target provider, it skips it
  5. Migrates Files: For assets that need migration:
       • Downloads the file from the old storage provider
       • Uploads to the new storage provider
       • Updates the asset CI with the new storage location and provider
       • Removes the file from the old storage provider
  6. Reports Results: Provides a summary of:
       • Total assets found
       • Already migrated assets
       • Successfully migrated assets
       • Failed migrations with error details
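
The ordering of the per-asset steps matters: the old copy is removed last, so a failure earlier in the sequence leaves the original file in place. A minimal sketch of that ordering, using an in-memory `MemoryProvider` class and dictionary-based assets that are hypothetical stand-ins for the real storage providers and asset CIs:

```python
class MemoryProvider:
    """In-memory stand-in for a storage provider (hypothetical interface)."""

    def __init__(self, mid):
        self.mid = mid
        self.files = {}
        self._counter = 0

    def put(self, data):
        self._counter += 1
        storage_id = f"{self.mid}-{self._counter}"
        self.files[storage_id] = data
        return storage_id

    def get(self, storage_id):
        return self.files.get(storage_id)

    def delete(self, storage_id):
        self.files.pop(storage_id, None)


def migrate_asset(asset, old, new):
    """Move one asset from `old` to `new` and return a status string."""
    if asset["provider_mid"] == new.mid:
        return "already migrated"          # skip: nothing to do
    old_id = asset["storage_id"]
    data = old.get(old_id)                 # 1. download from old provider
    if data is None:
        raise RuntimeError("Could not retrieve file data from old provider")
    new_id = new.put(data)                 # 2. upload to new provider
    asset["storage_id"] = new_id           # 3. update the asset record
    asset["provider_mid"] = new.mid
    old.delete(old_id)                     # 4. remove the old copy last
    return "migrated"
```

If the download step fails, the function raises before the asset record is changed, so the asset still points at its original location.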

Requirements

  • The category must have at least one document field configured with a storage_provider
  • The target storage provider must be properly configured and accessible
  • The script requires access to the Clarive database and storage providers

Output Example

======================================================================
Asset Migration Script
Category: Change Request
======================================================================
Found category: Change Request (ID: category-12345)
Found 10 topics in category 'Change Request'

----------------------------------------------------------------------
Processing topic: CR-001: Update Authentication (topic-98765)
  Found 3 asset(s)
    Asset: design_doc.pdf (MID: asset-11111)
      Current provider: GridFS (default)
      Field: documents
      Target provider MID: mssharepoint-site-1
      Target folder: ChangeRequests/{title}
      Status: Needs migration
      Starting migration...
        Downloading from old provider...
        Uploading to new provider...
        New storage ID: 01ABCDEFGHIJKLMNOPQRSTUVWXYZ
        Updated asset CI
        Removing from old provider...
        Migration completed ✓
    Asset: screenshot.png (MID: asset-22222)
      Current provider MID: mssharepoint-site-1
      Field: documents
      Target provider MID: mssharepoint-site-1
      Status: Already migrated ✓
    ...

======================================================================
Migration Summary
======================================================================
Total assets found:        30
Already migrated:          5
Successfully migrated:     23
Failed migrations:         2

Errors encountered:
  - Asset old_file.doc (asset-99999): Could not retrieve file data from old provider

Done!

Safety Features

  • The script checks if assets are already in the target provider before migrating
  • It provides detailed progress output for each asset
  • Failed migrations are reported but don't stop the script
  • It removes files from the old provider only after a successful upload to the new provider
  • Optional limit parameter allows testing with a small number of topics (e.g., 1) before processing all topics
  • The script can safely be run multiple times; assets that are already migrated are skipped
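
Two of the properties above, continuing past failures and being safe to re-run, can be illustrated with a small batch loop. This is a sketch under assumptions: `run_batch` and `demo_migrate` are hypothetical names, and `demo_migrate` is a toy stand-in for the real per-asset migration.

```python
def run_batch(assets, migrate_one):
    """Apply migrate_one to every asset, recording failures instead of aborting."""
    summary = {"total": 0, "already": 0, "migrated": 0, "failed": 0, "errors": []}
    for asset in assets:
        summary["total"] += 1
        try:
            status = migrate_one(asset)
        except Exception as exc:
            # A failed asset is reported and the loop moves on to the next one.
            summary["failed"] += 1
            summary["errors"].append(f"Asset {asset['name']} ({asset['mid']}): {exc}")
            continue
        key = "already" if status == "already migrated" else "migrated"
        summary[key] += 1
    return summary


def demo_migrate(asset, target="mssharepoint-site-1"):
    """Toy migration: flips the provider unless the source file is missing."""
    if asset["provider_mid"] == target:
        return "already migrated"
    if asset.get("missing"):
        raise RuntimeError("Could not retrieve file data from old provider")
    asset["provider_mid"] = target
    return "migrated"


if __name__ == "__main__":
    assets = [
        {"name": "design_doc.pdf", "mid": "asset-11111", "provider_mid": None},
        {"name": "screenshot.png", "mid": "asset-22222",
         "provider_mid": "mssharepoint-site-1"},
        {"name": "old_file.doc", "mid": "asset-99999",
         "provider_mid": None, "missing": True},
    ]
    print(run_batch(assets, demo_migrate))  # first run migrates what it can
    print(run_batch(assets, demo_migrate))  # second run skips migrated assets
```

On the second run, the asset migrated in the first run is counted as already migrated, while the failing asset is attempted and reported again.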

Troubleshooting

Error: "Category 'X' not found"

  • Verify the category name is spelled correctly
  • Category names are case-sensitive

Error: "Could not instantiate new provider"

  • Check that the storage provider CI is properly configured
  • Verify the provider MID exists in the database

Error: "Could not retrieve file data from old provider"

  • The file may have been manually deleted from the old provider
  • The old provider may be misconfigured or unreachable

Error: "Could not upload file to new provider"

  • Check the target storage provider configuration and credentials
  • Verify network connectivity to the storage provider (e.g., SharePoint)
  • Check storage provider permissions and quotas

Notes

  • The script matches each asset to its specific field configuration, ensuring the correct storage provider and folder are used
  • If a topic has multiple document fields with different providers, each asset will be migrated to its corresponding field's provider
  • GridFS is considered the "default" provider when storage_provider_mid is not set on an asset
  • Use the optional limit parameter to test with a small number of topics before running on all topics
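
The field-matching notes above can be made concrete with a short sketch. It assumes a hypothetical in-memory shape for the category's document fields (a dictionary keyed by field name, with a `storage_provider` entry and an optional `folder` entry); the real configuration structure may differ.

```python
DEFAULT_PROVIDER = "GridFS (default)"


def current_provider(asset):
    """GridFS is assumed when storage_provider_mid is not set on the asset."""
    return asset.get("storage_provider_mid") or DEFAULT_PROVIDER


def resolve_target(asset, document_fields):
    """Return (provider_mid, folder) for the field the asset belongs to.

    Returns None when the field is unknown or has no storage provider,
    which corresponds to "no target provider configured".
    """
    field = document_fields.get(asset["field"])
    if not field or not field.get("storage_provider"):
        return None
    return field["storage_provider"], field.get("folder")


if __name__ == "__main__":
    # Two document fields with different providers; names are illustrative.
    fields = {
        "documents": {"storage_provider": "mssharepoint-site-1",
                      "folder": "ChangeRequests/{title}"},
        "attachments": {"storage_provider": "mssharepoint-site-2"},
    }
    asset = {"name": "design_doc.pdf", "field": "documents"}
    print(current_provider(asset), "->", resolve_target(asset, fields))
```

Resolving the target per asset, rather than once per category, is what lets assets from different fields of the same topic land in different providers.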