Asset Migration Script
This script migrates assets (documents/files) from old storage providers to new storage providers based on topic category configuration.
Purpose
When you update a topic category's document field configuration to use a new storage provider (e.g., migrating from GridFS to SharePoint), existing assets in topics of that category will still be stored in the old provider. This script migrates those existing assets to the new storage provider.
Usage
Dry Run (Recommended First)
Before performing the actual migration, run the dry-run script to see what would be migrated:
./bin/cla repl -c <config> --script repl-scripts/migrate_assets_dryrun.pl <category_name>
Actual Migration
After reviewing the dry-run output, perform the actual migration:
./bin/cla repl -c <config> --script repl-scripts/migrate_assets.pl <category_name> [limit]
Parameters
- <category_name>: The name of the topic category whose assets you want to migrate
- [limit]: Optional. Number of topics to process. Use this to test with a small number of topics first.
Example
# First, check what would be migrated
./bin/cla repl -c clarive --script repl-scripts/migrate_assets_dryrun.pl "Change Request"
# Test with just 1 topic first
./bin/cla repl -c clarive --script repl-scripts/migrate_assets.pl "Change Request" 1
# If successful, migrate all topics
./bin/cla repl -c clarive --script repl-scripts/migrate_assets.pl "Change Request"
What the Scripts Do
Dry Run Script (migrate_assets_dryrun.pl)
- Analyzes Without Changes: Examines all assets without modifying anything
- Shows Migration Plan: Lists which assets would be migrated
- Identifies Issues: Highlights assets with no target provider configured
- Safe to Run: Can be run multiple times with no side effects
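The dry-run behaviour listed above amounts to a read-only classification of assets. The following Python sketch illustrates the idea only; it is not the Perl script itself. The `Asset` fields, the `plan_migration` function, the `"gridfs"` label, and the sample asset `asset-33333` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    # Hypothetical, simplified view of an asset CI.
    mid: str
    name: str
    current_provider: str           # provider holding the file now
    target_provider: Optional[str]  # provider configured on the field, None if unset


def plan_migration(assets):
    """Classify assets without modifying anything (dry run)."""
    plan = {"migrate": [], "already_migrated": [], "no_target": []}
    for asset in assets:
        if asset.target_provider is None:
            plan["no_target"].append(asset.mid)          # flagged as an issue
        elif asset.current_provider == asset.target_provider:
            plan["already_migrated"].append(asset.mid)   # nothing to do
        else:
            plan["migrate"].append(asset.mid)            # would be migrated
    return plan


assets = [
    Asset("asset-11111", "design_doc.pdf", "gridfs", "mssharepoint-site-1"),
    Asset("asset-22222", "screenshot.png", "mssharepoint-site-1", "mssharepoint-site-1"),
    Asset("asset-33333", "notes.txt", "gridfs", None),  # hypothetical: no target set
]
print(plan_migration(assets))
```

Because the function only reads its input and builds a new dictionary, calling it repeatedly gives the same plan, which is the property that makes a dry run safe to repeat.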
Migration Script (migrate_assets.pl)
- Finds Topics: Locates all topics in the specified category
- Finds Assets: For each topic, finds all attached documents/assets
- Checks Configuration: Determines the target storage provider from the category's document field configuration
- Skips if Already Migrated: Skips any asset that is already in the target provider
- Migrates Files: For assets that need migration:
    - Downloads the file from the old storage provider
    - Uploads to the new storage provider
    - Updates the asset CI with the new storage location and provider
    - Removes the file from the old storage provider
- Reports Results: Provides a summary of:
    - Total assets found
    - Already migrated assets
    - Successfully migrated assets
    - Failed migrations with error details
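The order of the migration steps matters: the old copy is removed only after the upload and the asset update have succeeded. A minimal Python sketch of that ordering follows, using in-memory dictionaries as stand-ins for storage providers. `InMemoryProvider`, `migrate_asset`, and the asset dictionary keys are hypothetical and do not reflect Clarive's internal API.

```python
import uuid


class InMemoryProvider:
    """Hypothetical stand-in for a storage provider; keeps files in a dict."""

    def __init__(self, mid):
        self.mid = mid
        self.files = {}

    def download(self, storage_id):
        return self.files.get(storage_id)  # None if the file is missing

    def upload(self, data):
        storage_id = uuid.uuid4().hex
        self.files[storage_id] = data
        return storage_id

    def remove(self, storage_id):
        self.files.pop(storage_id, None)


def migrate_asset(asset, old, new):
    """Move one asset from `old` to `new`; returns 'skipped' or 'migrated'."""
    if asset["provider_mid"] == new.mid:
        return "skipped"  # already in the target provider

    data = old.download(asset["storage_id"])
    if data is None:
        raise RuntimeError("Could not retrieve file data from old provider")

    new_id = new.upload(data)      # if this raises, the old copy is untouched
    old_id = asset["storage_id"]
    asset["storage_id"] = new_id   # update the asset record first...
    asset["provider_mid"] = new.mid
    old.remove(old_id)             # ...and only then delete the old copy
    return "migrated"


old = InMemoryProvider("gridfs")
new = InMemoryProvider("mssharepoint-site-1")
asset = {"mid": "asset-11111", "provider_mid": "gridfs",
         "storage_id": old.upload(b"%PDF-1.4 ...")}

print(migrate_asset(asset, old, new))  # first call moves the file
print(migrate_asset(asset, old, new))  # second call finds nothing to do
```

Deleting last means a failure at any earlier step leaves the original file in place, so the asset can simply be retried on the next run.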
Requirements
- The category must have at least one document field configured with a storage_provider
- The target storage provider must be properly configured and accessible
- The script requires access to the Clarive database and storage providers
Output Example
======================================================================
Asset Migration Script
Category: Change Request
======================================================================
Found category: Change Request (ID: category-12345)
Found 10 topics in category 'Change Request'
----------------------------------------------------------------------
Processing topic: CR-001: Update Authentication (topic-98765)
Found 3 asset(s)
Asset: design_doc.pdf (MID: asset-11111)
Current provider: GridFS (default)
Field: documents
Target provider MID: mssharepoint-site-1
Target folder: ChangeRequests/{title}
Status: Needs migration
Starting migration...
Downloading from old provider...
Uploading to new provider...
New storage ID: 01ABCDEFGHIJKLMNOPQRSTUVWXYZ
Updated asset CI
Removing from old provider...
Migration completed ✓
Asset: screenshot.png (MID: asset-22222)
Current provider MID: mssharepoint-site-1
Field: documents
Target provider MID: mssharepoint-site-1
Status: Already migrated ✓
...
======================================================================
Migration Summary
======================================================================
Total assets found: 30
Already migrated: 5
Successfully migrated: 23
Failed migrations: 2
Errors encountered:
- Asset old_file.doc (asset-99999): Could not retrieve file data from old provider
Done!
Safety Features
- The script checks if assets are already in the target provider before migrating
- It provides detailed progress output for each asset
- Failed migrations are reported but don't stop the script
- Files are removed from the old provider only after a successful upload to the new provider
- The optional limit parameter allows testing with a small number of topics (e.g., 1) before processing all topics
- The script can be safely run multiple times; already migrated assets will be skipped
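The "report failures but keep going" behaviour, the optional limit, and safe re-runs can be illustrated with a small batch loop. This is a Python sketch under stated assumptions, not the script's actual code: `run_migration`, the shape of `topics`, the `fake_migrate` callback, and the topic ID `topic-98766` are hypothetical, and the counters simply mirror the summary in the output example.

```python
def run_migration(topics, migrate_one, limit=None):
    """Process up to `limit` topics, collecting errors instead of aborting.

    `topics` maps a topic id to a list of asset ids; `migrate_one` is a
    callable returning "migrated" or "skipped", or raising on failure.
    """
    summary = {"total": 0, "already_migrated": 0, "migrated": 0,
               "failed": 0, "errors": []}
    selected = list(topics.items())
    if limit is not None:
        selected = selected[:limit]  # test on a few topics first

    for topic_id, asset_ids in selected:
        for asset_id in asset_ids:
            summary["total"] += 1
            try:
                status = migrate_one(asset_id)
            except Exception as exc:  # a failed asset must not stop the run
                summary["failed"] += 1
                summary["errors"].append(f"{asset_id}: {exc}")
                continue
            if status == "skipped":
                summary["already_migrated"] += 1
            else:
                summary["migrated"] += 1
    return summary


done = {"asset-22222"}  # assets already in the target provider


def fake_migrate(asset_id):
    # Hypothetical callback; fails for one asset to exercise error handling.
    if asset_id == "asset-99999":
        raise RuntimeError("Could not retrieve file data from old provider")
    if asset_id in done:
        return "skipped"
    done.add(asset_id)
    return "migrated"


topics = {"topic-98765": ["asset-11111", "asset-22222"],
          "topic-98766": ["asset-99999"]}  # second topic id is made up

print(run_migration(topics, fake_migrate, limit=1))  # trial run on one topic
print(run_migration(topics, fake_migrate))           # full run, safe to repeat
```

In the full run, the asset migrated during the trial run is counted as already migrated rather than processed again, and the failing asset is recorded in `errors` while the loop continues.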
Troubleshooting
Error: "Category 'X' not found"
- Verify the category name is spelled correctly
- Category names are case-sensitive
Error: "Could not instantiate new provider"
- Check that the storage provider CI is properly configured
- Verify the provider MID exists in the database
Error: "Could not retrieve file data from old provider"
- The file may have been manually deleted from the old provider
- The old provider may be misconfigured or unreachable
Error: "Could not upload file to new provider"
- Check the target storage provider configuration and credentials
- Verify network connectivity to the storage provider (e.g., SharePoint)
- Check storage provider permissions and quotas
Notes
- The script matches each asset to its specific field configuration, ensuring the correct storage provider and folder are used
- If a topic has multiple document fields with different providers, each asset will be migrated to its corresponding field's provider
- GridFS is considered the "default" provider when storage_provider_mid is not set on an asset
- Use the optional limit parameter to test with a small number of topics before running on all topics
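The per-field matching and the GridFS default described in these notes can be sketched as a small lookup. This is an illustrative Python sketch, not Clarive's data model: the `field` key on the asset, the `field_config` mapping, the `attachments` field, the provider `mssharepoint-site-2`, and the `"gridfs"` label are assumptions standing in for the real category configuration.

```python
DEFAULT_PROVIDER = "gridfs"  # assumed label for the implicit default provider


def current_provider(asset):
    """An asset without storage_provider_mid is treated as stored in GridFS."""
    return asset.get("storage_provider_mid") or DEFAULT_PROVIDER


def resolve_target(asset, field_config):
    """Return the provider configured for the asset's own field, or None."""
    return field_config.get(asset["field"], {}).get("storage_provider")


# Hypothetical category with two document fields using different providers.
field_config = {
    "documents": {"storage_provider": "mssharepoint-site-1",
                  "folder": "ChangeRequests/{title}"},
    "attachments": {"storage_provider": "mssharepoint-site-2"},
}

assets = [
    {"mid": "asset-11111", "field": "documents"},
    {"mid": "asset-22222", "field": "documents",
     "storage_provider_mid": "mssharepoint-site-1"},
    {"mid": "asset-33333", "field": "attachments"},
]

for asset in assets:
    print(asset["mid"], current_provider(asset), "->",
          resolve_target(asset, field_config))
```

Looking the target up by the asset's own field, rather than once per topic, is what lets assets in the same topic end up in different providers.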