Premium HA - Disaster recovery of a data center from another data center

Premium High Availability

Dynatrace Premium High Availability (Premium HA) is a self-contained, out-of-the-box solution that provides near-zero downtime and allows monitoring to continue without data loss in failover scenarios. This solution requires additional licensing for your deployment.

Short outages (up to three hours) of one data center do not require any recovery actions. When the inaccessible data center becomes available again, the Dynatrace Managed cluster automatically synchronizes data and restores cluster operations.

For longer outages (up to three days), first make sure that the cluster nodes are operational, and then run the following command on every node in the recovering data center, one node at a time:

/opt/dynatrace-managed/utils/repair-cassandra-data.sh
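The sequential run can be scripted. The following is a minimal sketch, not an official procedure: it assumes SSH access to each node with sufficient privileges to run the script, and the function name `repair_nodes_sequentially`, the `REMOTE_RUNNER` variable, and the host names are hypothetical. It waits for each node to finish before starting the next and stops at the first failure.

```shell
#!/bin/sh
# Sketch: run the repair script on each node of the recovering data center,
# one node at a time. Assumes SSH access with sufficient privileges.

REPAIR_SCRIPT=/opt/dynatrace-managed/utils/repair-cassandra-data.sh

# Command used to reach a node (hypothetical knob; set to "echo" for a dry run).
REMOTE_RUNNER=${REMOTE_RUNNER:-ssh}

repair_nodes_sequentially() {
    for node in "$@"; do
        echo "Repairing $node"
        # The loop blocks here until the repair on this node has finished.
        if ! $REMOTE_RUNNER "$node" "$REPAIR_SCRIPT"; then
            echo "Repair failed on $node, stopping" >&2
            return 1
        fi
    done
}

# Example call (hypothetical host names):
# repair_nodes_sequentially dc2-node1.example.com dc2-node2.example.com
```

Setting `REMOTE_RUNNER=echo` before calling the function prints the commands that would be run without connecting to any node.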

When a data center is unavailable for more than three days, some data is lost and cannot be repaired. In that case, you must recover the data center either from an operational data center or from a backup.
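The three outage ranges described above can be summarized as a small decision helper. This is only an illustrative sketch: the function name `recovery_action` and the action labels are hypothetical, the duration is taken in whole hours, and three days is treated as 72 hours.

```shell
#!/bin/sh
# Sketch: map an outage duration (whole hours) to the recovery action
# described in the text. Names and labels are illustrative only.
recovery_action() {
    hours=$1
    if [ "$hours" -le 3 ]; then
        echo "none"      # cluster synchronizes automatically
    elif [ "$hours" -le 72 ]; then
        echo "repair"    # run repair-cassandra-data.sh on each node
    else
        echo "rebuild"   # recover from the other data center or a backup
    fi
}
```

For example, `recovery_action 48` selects the repair path, while any value above 72 selects a full recovery.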

Recover a data center from another data center

To recover a data center from another data center, you will:

  1. Remove unavailable nodes from the cluster.
  2. Update the configuration of the existing (surviving) data center.
  3. Reinstall nodes in the recovered data center.
  4. Replicate Cassandra to the recovered data center.
  5. Replicate Elasticsearch to the recovered data center.
  6. Recreate the server, start ActiveGate, and start NGINX in the recovered data center.
  7. Enable the recovered data center.

For the detailed procedure, see Rebuild data center.