To restore a lost data center (DC) from backup in a Premium High Availability deployment, follow these steps.
This procedure uses the following terms:
The procedure migrates and replicates Dynatrace Managed components individually to prepare them for data replication across two DCs. See Managed components.
Uninstall Dynatrace Managed on all running nodes
Restore data center from backup
Remove lost data center from configuration
Distribute the installer
Prepare cluster data for replication
Create the data center topology
Open firewall rules
Install second data center nodes
Migrate Cassandra
Migrate Elasticsearch
Migrate the server
Enable the new data center
Collect the following information before running the API calls:
<seed-node-ip> - The IP address of the seed node from Source-DC.
The seed node can be any node running in an existing DC, used for performing the installation tasks and distributing configuration.
<nodes-ips> - The list of IPv4 addresses of new nodes in Target-DC.
Example: "176.16.0.5", "176.16.0.6", "176.16.0.7"
<api-token> - A valid Cluster API token (ServiceProviderAPI scope is required).
You can generate it in the Dynatrace Managed Cluster Management Console (CMC). See Cluster API - Authentication.
<dynatrace-directory> - The directory where Dynatrace Managed is installed on the seed node.
The default Dynatrace Managed installation directory is /opt/dynatrace-managed.
<datacenter-1> - The Source-DC name must be the same as the Cassandra DC name.
The default Cassandra DC name is datacenter1.
To get the DC name, run this command on the seed node before starting migration:
sudo <dynatrace-directory>/utils/cassandra-nodetool.sh status
You will get a response that includes the Source-DC name. Example for a DC named datacenter1:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.176.42.20   65.54 GB  256     100.0%            f053dd8d-ecf3-7834-b099-68542439817b  rack1
UN  10.176.42.244  65.47 GB  256     100.0%            2aa7e790-a423-9273-88f9-45bcd158dd6e  rack1
UN  10.176.42.168  65.47 GB  256     100.0%            48543bca-41f5-26d3-b2fd-6cfdf5c0f3b2  rack1
<datacenter-2> - The Target-DC name must remain unchanged. Example: dc-us-east-2.
During recovery, Target-DC must use the same name as the lost DC.
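If you want to read the DC name from the nodetool output in a script rather than by eye, a minimal sketch follows. The function name list_cassandra_dcs is hypothetical, and the sketch assumes each DC section starts with a line of the form "Datacenter: <name>", as in the example output above.

```shell
# Print every data center name found in `nodetool status` output read from stdin.
# Assumption: each DC section begins with a line "Datacenter: <name>".
list_cassandra_dcs() {
    awk '$1 == "Datacenter:" { print $2 }'
}

# Possible usage on the seed node (not executed here):
#   sudo <dynatrace-directory>/utils/cassandra-nodetool.sh status | list_cassandra_dcs
```

On a single-DC cluster this should print one name, which is the value to use for <datacenter-1>.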
Set the following environment variables on the seed node in Source-DC and on every node in Target-DC:
SEED_IP=<seed-node-ip>
DT_DIR=<dynatrace-directory>
NODES_IPS=$(echo '[<nodes-ips>]')
API_TOKEN=<api-token>
SDC_NAME=<datacenter-1>
TDC_NAME=<datacenter-2>
For example:
SEED_IP=10.176.37.201
DT_DIR=/opt/dynatrace-managed
NODES_IPS=$(echo '["10.176.37.218", "10.176.37.227", "10.176.37.120"]')
API_TOKEN=R_SZOpV4RTOmjr9fFmK4x
SDC_NAME=datacenter1
TDC_NAME=dc-us-east-2
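Typing the JSON array for NODES_IPS by hand is easy to get wrong (a missing quote or bracket). The following sketch builds the array from plain arguments. The helper name ips_to_json_array is hypothetical, and the address check only verifies the dotted-quad shape, not the numeric range of each octet.

```shell
# Build a JSON array of quoted IPv4 addresses from the positional arguments.
# Returns non-zero if any argument does not look like a dotted quad.
ips_to_json_array() {
    json=""
    for ip in "$@"; do
        # Shape check only: four dot-separated groups of 1 to 3 digits.
        if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
            echo "not an IPv4 address: $ip" >&2
            return 1
        fi
        # Append with a ", " separator once the list is non-empty.
        json="${json:+$json, }\"$ip\""
    done
    printf '[%s]\n' "$json"
}

# Possible usage:
#   NODES_IPS=$(ips_to_json_array 10.176.37.218 10.176.37.227 10.176.37.120)
```

The result has the same format as the NODES_IPS value in the example above.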
If your Cassandra or Elasticsearch cluster is configured with custom.settings that enable rack-awareness, contact a Dynatrace product expert via live chat. Apply the custom settings before proceeding with Target-DC installation.
To check whether custom settings are applied, run the following command on the seed node:
ls $DT_DIR/installer/custom.settings
If the custom.settings file exists, you're using custom settings.
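In a script, the same check can be expressed as a file test instead of reading the ls output. This is a small sketch; the function name has_custom_settings is hypothetical, and the path is the one used in the command above.

```shell
# Succeed if <dynatrace-directory>/installer/custom.settings exists as a regular file.
# $1: the Dynatrace Managed installation directory (for example "$DT_DIR").
has_custom_settings() {
    [ -f "$1/installer/custom.settings" ]
}

# Possible usage:
#   if has_custom_settings "$DT_DIR"; then echo "custom settings in use"; fi
```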
Each REST API call returns an HTTP status code. Go to the next step only when the returned code is 200. Expect the following return codes:
200 - The step completed successfully. Go to the next step.
207 - The request is in progress. Retry after a few minutes.
40x - Revise your request path and arguments and repeat the request.
5xx - Contact support.
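The return-code rules above can be captured in a small helper if you script the procedure. This is only a sketch: the function name next_action and the action labels are hypothetical, and "40x" is interpreted here as codes 400 through 409.

```shell
# Map an HTTP status code to the action described in the list above.
next_action() {
    case "$1" in
        200) echo "proceed" ;;          # step completed, go to the next step
        207) echo "retry-later" ;;      # request still in progress
        40?) echo "fix-request" ;;      # revise the request path and arguments
        5??) echo "contact-support" ;;  # server-side problem
        *)   echo "unknown" ;;          # not covered by the list above
    esac
}
```

For example, next_action 207 prints retry-later.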
Follow the official procedure to remove a Managed Cluster node using either the command prompt or the CMC. See Remove a cluster node.
Follow the official procedure to restore the DC from the backup. See Back up and restore a cluster.
Run the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/lostDatacenterCleanUp?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
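Because the calls in this procedure use curl with -i, the status code appears in the status line at the top of the output, before the headers and the body. The sketch below extracts it from saved or piped output. The function name http_status is hypothetical. It assumes the usual layout of a status line, header lines, and a blank line, and it skips interim 1xx responses.

```shell
# Read `curl -i` output on stdin and print the status code of the final response.
http_status() {
    awk '
        # Remember the code from the most recent status line, e.g. "HTTP/1.1 200 OK".
        /^HTTP\// { code = $2; sub(/\r$/, "", code); next }
        # A blank line ends a header block; stop once the block was not an interim 1xx one.
        /^\r?$/ && code != "" && code !~ /^1/ { exit }
        END { print code }
    '
}

# Possible usage (not executed here):
#   curl -ikS -X POST "https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/lostDatacenterCleanUp?Api-Token=$API_TOKEN" \
#        -H "accept: application/json" -H "Content-Type: application/json" | http_status
```

Piping through the helper discards the response body, so keep a copy of the full output if you need to read the suggested next steps.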
Sign in to the CMC.
Go to Home for the Dynatrace Managed deployment status page.
Select Add new cluster node.
Copy the wget command line from the Run this command on the target host text field.
The Run this installer script with root rights text field contains a command for the installation script. Ignore this command. Don't run the provided script.
Paste and run only the wget command line on every node in the Target-DC terminal window.
Run the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/clusterReplicationPreparation?Api-Token=$API_TOKEN
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
Run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/clusterReplicationPreparation?Api-Token=$API_TOKEN -H "accept: application/json"
If the status code from this call isn't 200, try again after a few minutes.
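Several steps in this procedure ask you to repeat a status call until it returns 200. A retry loop along the following lines can automate the waiting. The function name wait_for_200 and its arguments are hypothetical; the command you pass in is assumed to print only the HTTP status code.

```shell
# Run a status-check command repeatedly until it prints 200.
# Usage: wait_for_200 <max-attempts> <delay-seconds> <command> [args...]
wait_for_200() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Treat a failing command the same as an empty status code.
        code=$("$@") || code=""
        if [ "$code" = "200" ]; then
            return 0
        fi
        echo "attempt $i: status ${code:-none}, retrying in ${delay}s" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Possible usage, with curl printing only the status code via -w (not executed here):
#   check_prep() {
#       curl -ks -o /dev/null -w '%{http_code}' \
#            "https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/clusterReplicationPreparation?Api-Token=$API_TOKEN"
#   }
#   wait_for_200 20 180 check_prep
```

The function returns 1 when the attempts run out, which is a reasonable point to stop and inspect the last response manually.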
Run the following cluster API call only on the seed node:
curl -ikS -X POST -d "{\"newDatacenterName\" : \"$TDC_NAME\", \"nodesIp\" :$NODES_IPS}" https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/datacenterTopology?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
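The request body in the call above relies on backslash-escaped quotes inside a double-quoted string, which is easy to mistype. As an alternative sketch, the body can be assembled with printf first and inspected before it is sent. The function name topology_payload is hypothetical; the field names are the ones used in the call above.

```shell
# Print the datacenterTopology request body.
# $1: Target-DC name; $2: JSON array of node IPs (same format as NODES_IPS).
topology_payload() {
    printf '{"newDatacenterName": "%s", "nodesIp": %s}\n' "$1" "$2"
}

# Possible usage (not executed here):
#   topology_payload "$TDC_NAME" "$NODES_IPS"      # inspect the body first
#   curl -ikS -X POST -d "$(topology_payload "$TDC_NAME" "$NODES_IPS")" \
#        "https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/datacenterTopology?Api-Token=$API_TOKEN" \
#        -H "accept: application/json" -H "Content-Type: application/json"
```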
To open ports to traffic from the new Target-DC nodes, run the following cluster API call only on the seed node:
curl --noproxy '*' -ikS -X POST -d "$NODES_IPS" https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/clusterNodes/currentDc?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If successful, the status code is 200 and the response body contains a request ID that you need in order to check the firewall rules status.
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
Set the request ID environment variable on the seed node only. Use the request ID from the response to the previous API call.
REQ_ID=<topology-configuration-request-id>
To check the firewall rules status, run the following cluster API call only on the seed node:
curl -ikS https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/clusterNodes/currentDc/$REQ_ID?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code from this call isn't 200, try again after a few minutes.
Run the following command on every node in Target-DC. Follow the installation prompts as this will be a typical node installation.
sudo /bin/sh ./managed-installer.sh --install-new-dc --premium-ha on --datacenter $TDC_NAME --seed-auth $API_TOKEN
The installation takes 3 to 5 minutes. The expected result is similar to this:
Installation in new data center completed successfully after 2 minutes 51 seconds.
Run the following cluster API call only on the seed node when all nodes in Target-DC finish installing:
curl -ikS https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/nodekeeper/healthCheck?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after a few minutes.
Cassandra migration may take minutes to hours depending on your metric storage size.
To start migration of Cassandra in Target-DC, run the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/cassandra/newDc?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If successful, the status code is 200 and the response body contains a request ID that you need in order to check the migration status. Set the request ID environment variable only on the seed node, using the request ID from the response to the previous API call.
REQ_ID=<migration-new-datacenter-request-id>
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
To check the migration status, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/cassandra/newDc/$REQ_ID?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after a few minutes.
Depending on the size of your Cassandra database, this process can take several hours.
To rebuild Cassandra, run the following command on each new Target-DC node successively. Use the nohup command to prevent interruption of script execution (such as session disconnect) during important operations.
sudo nohup $DT_DIR/utils/cassandra-nodetool.sh rebuild -- $SDC_NAME &
To verify the progress and status, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/cassandra/rebuildStatus?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after approximately 15 minutes. Remember that the rebuild process can be time-consuming.
To verify the Cassandra cluster state, run cassandra-nodetool.sh with the status parameter only on the seed node:
sudo $DT_DIR/utils/cassandra-nodetool.sh status
The result should look similar to this:
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.176.41.167  18.82 GB  256     100.0%            3af25127-4f99-4f43-afc3-216d7a2c10f8  rack1
UN  10.176.41.154  19.44 GB  256     100.0%            5a618559-3a73-42ec-83f0-32d28e08beec  rack1
UN  10.176.41.43   19.58 GB  256     100.0%            191f3b30-949a-4cf2-b620-68a40eebf31e  rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.176.42.54   19.18 GB  256     100.0%            852ce236-a430-400a-92a6-daeed99acf68  rack1
UN  10.176.42.104  19.12 GB  256     100.0%            84479219-b64d-442c-a807-a832db9aae18  rack1
UN  10.176.42.234  19.4 GB   256     100.0%            507b377c-5bfc-4667-b251-a9b7c453ed22  rack1
The Load value shouldn't differ significantly between the nodes and Status should be UN on all nodes.
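The UN check can also be scripted. The sketch below only covers the status column; it does not compare Load values, which still need a visual check. The function name all_nodes_un is hypothetical, and node rows are assumed to start with a two-letter state code (U or D followed by N, L, J, or M), as in the legend of the status output.

```shell
# Read `nodetool status` output on stdin. Succeed only if at least one node row
# is present and every node row reports UN (Up/Normal).
all_nodes_un() {
    awk '
        # Node rows start with the two-letter state code followed by whitespace.
        /^[UD][NLJM][ \t]/ { nodes++; if ($1 != "UN") bad++ }
        END { exit ((nodes > 0 && bad == 0) ? 0 : 1) }
    '
}

# Possible usage on the seed node (not executed here):
#   sudo $DT_DIR/utils/cassandra-nodetool.sh status | all_nodes_un && echo "all nodes UN"
```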
Depending on the size of your Cassandra database, this can take several hours.
To rebuild Cassandra data in Target-DC, run the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/cassandra/rebuild?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If successful, the status code is 200. If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat within your Dynatrace environment.
To check the rebuild data status, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/cassandra/rebuild?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after approximately 15 minutes. Remember that the data rebuild process can be time-consuming.
If the response has an error flag set to true, contact a Dynatrace product expert via live chat within your environment.
Elasticsearch migration may take minutes or hours depending on your Elasticsearch storage.
Start Elasticsearch. Run the following command successively on every node in Target-DC only:
sudo $DT_DIR/launcher/elasticsearch.sh start
To start migration of Elasticsearch to Target-DC, run the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/restore/elasticsearch/recover?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If successful, the status code is 200.
To check the migration status of Elasticsearch, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/restore/elasticsearch/recover?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after a few minutes.
To verify Elasticsearch data migration, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/migration/elasticsearch/indexMigrationStatus?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after a few minutes.
Launch the Managed Cluster in Target-DC by running the following cluster API call only on the seed node:
curl -ikS -X POST https://$SEED_IP/api/v1.0/onpremise/multiDc/restore/server/recovery?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If successful, the status code is 200 and the response body contains a request ID that you need in order to check cluster readiness. Set the request ID environment variable only on the seed node, using the request ID from the response to the previous API call.
REQ_ID=<migration-server-request-id>
If the status code isn't 200 and the response doesn't suggest next steps, contact a Dynatrace product expert via live chat.
To check if the Managed Cluster is ready, run the following cluster API call only on the seed node:
curl -ikS -X GET https://$SEED_IP/api/v1.0/onpremise/multiDc/restore/server/recovery/$REQ_ID?Api-Token=$API_TOKEN -H "accept: application/json" -H "Content-Type: application/json"
If the status code isn't 200, try again after a few minutes.