Introduction
With every new CX release, there might come a need to upgrade the data in the source MongoDB, which involves creating JavaScript files to be executed in the Mongo shell. As we transition from running data migrations directly in the Mongo shell to using the Expertflow Data Platform, teams now need the ability to configure and utilize the platform to perform their respective operations. This guide outlines all the essential steps required to set up the data platform for each data migration activity.
Configurations
Below is the `data_migration_config.yaml` file, which controls the migration activity:
```yaml
source:
  type: "mongodb"
  ## Connection string: mongodb://{username}:{password}@{host}:{port}/?authSource=admin
  connection_string: "mongodb://root:Expertflow123@192.168.2.202:31545/?authSource=admin"
  ## For batch processing, use this template
  batch_processing:
    conversation-manager:                 # migration to run
      js_file: "Updated_4.4-4.5_upgrade_Bulk_Repeat.js"
      start_date: "2025-01-01"            # Should be updated according to data
      end_date: "2025-01-30"              # Should be updated according to data
      interval: "720"                     ## minute-wise interval (0.5 day = 720)
  ## For non-batch processing, use this template
  non_batch_processing:
    RE_adminPanel:
      js_file: "Updated_4.4-4.5_upgrade - RE and Admin.js"
  tls: true                               # Set to false if you don't want to use TLS
  tls_ca_file: "/transflux/certificates/mongo_certs/mongodb-ca-cert"
  tls_cert_key_file: "/transflux/certificates/mongo_certs/client-pem"  # Includes both client certificate and private key
```
The data migration is performed either in batches or in a single run (non-batch). A batch migration is sub-divided into timelines and intervals. Below is a description of each variable:
- `connection_string`: the connection string for the source MongoDB.
- `batch_processing`: configures a batched pipeline with the provided information. Within `batch_processing`:
  - `conversation-manager`: the name assigned to a migration; it is used when creating a pipeline (DAG).
  - `js_file`: name of the JS file to execute for the respective migration.
  - `start_date`: start date of the data to process (consult the source MongoDB).
  - `end_date`: end date up to which the data is processed (consult the source MongoDB).
  - `interval`: minute-wise interval that splits the data to be processed into windows within the given timeline.
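To make the interaction of `start_date`, `end_date`, and `interval` concrete, here is a minimal Python sketch of the windowing behavior described above. This is an illustration of the documented behavior, not the platform's actual code:

```python
from datetime import datetime, timedelta

def batch_windows(start_date, end_date, interval_minutes):
    """Split [start_date, end_date] into consecutive windows of
    `interval_minutes` minutes, as described for batch processing."""
    start = datetime.fromisoformat(start_date)
    end = datetime.fromisoformat(end_date)
    step = timedelta(minutes=int(interval_minutes))
    windows = []
    while start < end:
        window_end = min(start + step, end)
        windows.append((start, window_end))
        start = window_end
    return windows

# With the sample config (2025-01-01 to 2025-01-30, interval 720),
# 29 days are split into half-day windows: 58 batches in total.
print(len(batch_windows("2025-01-01", "2025-01-30", "720")))
```

Each window is then processed by one run of the migration script, so a smaller `interval` means more, smaller batches.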
- `non_batch_processing`: configures a non-batch pipeline with the provided information. Within `non_batch_processing`:
  - `RE_adminPanel`: the name assigned to a migration; it is used when creating a pipeline (DAG).
  - `js_file`: name of the JS file to execute for the respective migration.
- `tls`: TLS flag that determines whether the Mongo database accepts only TLS-verified connections.
- `tls_ca_file`: path to the mongodb-ca-cert file.
- `tls_cert_key_file`: path to the client-pem file.
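As a sketch of how these TLS settings map onto a client connection, the snippet below translates the YAML keys into PyMongo connection options. This assumes the platform connects via PyMongo; the option names (`tls`, `tlsCAFile`, `tlsCertificateKeyFile`) are PyMongo's, not taken from this guide:

```python
# Config values as they appear in data_migration_config.yaml (sample paths).
config = {
    "connection_string": "mongodb://root:Expertflow123@192.168.2.202:31545/?authSource=admin",
    "tls": True,
    "tls_ca_file": "/transflux/certificates/mongo_certs/mongodb-ca-cert",
    "tls_cert_key_file": "/transflux/certificates/mongo_certs/client-pem",
}

def client_kwargs(cfg):
    """Translate the YAML keys into PyMongo connection options.
    TLS options are only added when tls is true; otherwise a
    plain (non-TLS) connection is used."""
    kwargs = {}
    if cfg.get("tls"):
        kwargs["tls"] = True
        kwargs["tlsCAFile"] = cfg["tls_ca_file"]
        kwargs["tlsCertificateKeyFile"] = cfg["tls_cert_key_file"]
    return kwargs

# These kwargs would then be passed to the client, e.g.:
# MongoClient(config["connection_string"], **client_kwargs(config))
print(client_kwargs(config))
```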
Pre-requisites
Before you get started:

- Make sure the desired JavaScript file has been tested on a test MongoDB server to avoid conflicts when running the activity through the data platform.
- Make sure to create a feature branch from `develop` as per the respective ticket. The naming convention for the feature branch is `<current-release>_f-<jira_ticket_id>`, e.g. `4.8_f-BI-185`.
How to proceed
- Paste your `.js` file in the `mongo_migration_scripts` folder.
- Edit it and make sure that connections to databases are handled by the `db.getSiblingDB()` method, i.e.:

```javascript
conversationManagerDb = db.getSiblingDB("conversation-manager_db");

// When using multiple databases in a single JS file
reDb = db.getSiblingDB("routing-engine_db");
adminPanel = db.getSiblingDB("adminPanel");
```
- For batch-processing pipelines, edit the `.js` file and place the following date filter at the beginning (these values change dynamically while the pipeline is running, as per the start and end time declared in the config file):

```javascript
const input_start_dateTime = new Date("2025-01-01T00:00:00Z"); // Replace with your start date-time
const input_end_dateTime = new Date("2025-01-31T23:59:59Z"); // Replace with your end date-time
```
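One plausible way the platform performs this dynamic substitution is a simple text rewrite of the two constants before each batch run. This is a hypothetical sketch to illustrate why the filter must appear exactly in this form; the actual Transflux implementation is not shown in this guide:

```python
import re

# The date-filter header as it appears in the migration script.
JS_TEMPLATE = '''
const input_start_dateTime = new Date("2025-01-01T00:00:00Z"); // Replace with your start date-time
const input_end_dateTime = new Date("2025-01-31T23:59:59Z"); // Replace with your end date-time
'''

def inject_dates(js_source, start_iso, end_iso):
    """Rewrite the two date constants for the current batch window."""
    js_source = re.sub(r'input_start_dateTime = new Date\("[^"]*"\)',
                       f'input_start_dateTime = new Date("{start_iso}")', js_source)
    js_source = re.sub(r'input_end_dateTime = new Date\("[^"]*"\)',
                       f'input_end_dateTime = new Date("{end_iso}")', js_source)
    return js_source

print(inject_dates(JS_TEMPLATE, "2025-02-01T00:00:00Z", "2025-02-01T12:00:00Z"))
```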
- Edit the file `config/data_migration_config.yaml` and add your respective information under either `batch_processing` or `non_batch_processing`, depending on the execution pattern of the `.js` file.

For `batch_processing`:

```yaml
batch_processing:
  <migration_to_run>:          # migration to run
    js_file: <js_file_name>
    start_date: <start_date>   # Should be updated according to data
    end_date: <end_date>       # Should be updated according to data
    interval: 720              ## minute-wise interval (0.5 day = 720)
```

For `non_batch_processing`:

```yaml
non_batch_processing:
  <migration_to_run>:
    js_file: <js_file_name>
```
- Create a new DAG file in the `dags` folder. The naming convention for files in the `dags` folder is `<migration_to_run>_migration_<batch_or_non_batch>_pipeline_dag.py`.
- Paste and adjust the following contents as required.
For a batch-processing pipeline:

```python
import datetime
from airflow import DAG
from transflux.src.dag_factory.data_migration_batch_callable import create_dag_migration_batch

DAG_ID = <dag_id>  ## The name of the DAG pipeline shown in the Airflow UI
migration_to_run = <migration_to_run>  ## From the data_migration_config.yaml as set

dag = create_dag_migration_batch(
    dag_id=DAG_ID,
    migration_to_run=migration_to_run
)

# Register the DAG
globals()[DAG_ID] = dag
```

For a non-batch-processing pipeline:

```python
import datetime
from airflow import DAG
from transflux.src.dag_factory.data_migration_non_batch_callable import create_dag_migration_non_batch

DAG_ID = <dag_id>  ## The name of the DAG pipeline shown in the Airflow UI
migration_to_run = <migration_to_run>  ## From the data_migration_config.yaml as set

dag = create_dag_migration_non_batch(
    dag_id=DAG_ID,
    migration_to_run=migration_to_run
)

# Register the DAG
globals()[DAG_ID] = dag
```
- Deploy the solution on the machine. The pipeline will appear in the Airflow UI under the `<dag_id>` set in the `….migration_dag.py` file.
- For the batch-processing pipeline, turning the pipeline ON (toggle shown to the left of the pipeline name) will start the pipeline operation.
- For the non-batch-processing pipeline, in addition to turning the pipeline ON, the pipeline must also be triggered (play button shown to the right of the pipeline name).
For further details on operating migration pipelines from the Airflow UI, follow Migration Activity (CX-4.7 to CX-4.8).