In a standard high-availability Dynatrace Managed deployment, the number of node failures you can tolerate without data loss depends on the deployment size:
Small Dynatrace Managed deployments allow for one node failure.
Large Dynatrace Managed deployments allow for two node failures.
A Dynatrace Managed rack aware deployment allows you to group cluster nodes into three fault domains (that is, racks). Such a deployment is resilient to an outage of all nodes in one rack.
You should use rack awareness only if:
Otherwise, you may lose data and have issues with cluster availability.
A rack aware deployment ensures that no two replicas are stored inside the same rack, so the replicas are spread across the racks. If one rack goes down, the other two full replicas remain available, ensuring data consistency and availability. For example, in the deployment below, the Dynatrace Managed cluster can handle up to three node failures in one rack before data loss.
In a standard Dynatrace Managed high availability deployment, you need at least three cluster nodes to prevent data loss. Similarly, in rack aware deployments, you must have three racks (fault domains) to prevent data loss. If a rack fails, the two surviving racks maintain the data. Even when a rack contains three or more nodes, a rack aware deployment can lose the entire rack and still maintain data integrity.
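The guarantee described above can be sketched as a small shell check. This is only an illustration, not Dynatrace tooling: the function names and the rack labels are hypothetical, and the sketch encodes just the stated guarantee (failures confined to a single rack are covered). Failures spread over several racks are treated as outside that guarantee, which is a simplifying assumption.

```shell
# Hypothetical helper: count how many distinct racks contain failed nodes.
# Input: one space-separated string with the rack label of every failed node.
racks_affected() {
  printf '%s' "$1" | tr -s ' ' '\n' | sort -u | awk 'NF { n++ } END { print n + 0 }'
}

# Succeeds only if all failed nodes sit in at most one rack, which is the
# case the text describes as safe (the two other racks hold full replicas).
outage_is_tolerated() {
  [ "$(racks_affected "$1")" -le 1 ]
}

# Example: three failed nodes, all located in the same (hypothetical) rack.
if outage_is_tolerated "rack-1 rack-1 rack-1"; then
  echo "covered: failures are confined to one rack"
fi
```

The check looks only at which racks are affected, not at how many nodes failed, because a rack aware deployment keeps one full replica per rack.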
The same concept applies to Premium High Availability Managed deployments. Using rack aware Managed clusters in separate data centers increases your resilience to data loss.
Premium high availability Managed deployment.
Premium high availability rack aware Managed deployment.
For the ultimate high availability and redundancy, use a Premium High Availability deployment that is rack aware.
To create a rack aware deployment during the initial Managed deployment, use the installation parameters to indicate the data center and the rack to which the node is added. See Set up a cluster and Customize installation for Dynatrace Managed. For example:
dynatrace-managed.sh --rack-name az-1 --rack-dc datacenter1
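As a rough sketch of how the same two parameters might be repeated across a three-rack layout, the helper below prints (but does not run) one installer command per node. The helper name, the node names, and the racks az-2 and az-3 are hypothetical; only `--rack-name` and `--rack-dc` come from the command above. Any other parameters the installer needs are left out.

```shell
# Hypothetical helper: print one installer invocation per node.
# Arguments: the data center name, then any number of node:rack pairs.
print_install_commands() {
  datacenter=$1
  shift
  for entry in "$@"; do
    node=${entry%%:*}   # text before the first colon
    rack=${entry#*:}    # text after the first colon
    echo "# on ${node}:"
    echo "dynatrace-managed.sh --rack-name ${rack} --rack-dc ${datacenter}"
  done
}

# Example layout (assumed): one node in each of three racks.
print_install_commands datacenter1 node1:az-1 node2:az-2 node3:az-3
```

Printing the commands instead of running them makes it easy to review the rack assignment of every node before installing anything.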
Use either the Cluster expansion or the Cluster restore method to convert an existing Managed deployment to a rack aware deployment.
You can vertically scale the nodes in two locations so that they can handle the additional load while you terminate the third location and reinstall it with rack-aware settings. See Rack aware conversion using replication.
If your current metric storage (Cassandra database) per node is more than 1 TB, use the cluster restore method. The cluster expansion method will also work, but the Cassandra bootstrapping it requires may take an unreasonably long time.
You can back up the cluster and restore it with rack aware settings. See Rack aware conversion using restore.
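The size guidance for choosing between the two conversion methods can be written as a small decision helper. This is a minimal sketch: the function name is hypothetical, the per-node size has to be supplied by you in GB, and treating 1 TB as 1024 GB is an assumption of the sketch.

```shell
# Threshold from the guidance above: more than 1 TB of metric storage per node.
THRESHOLD_GB=1024   # assumption: 1 TB is taken as 1024 GB

# Hypothetical helper: suggest a conversion method from the per-node
# Cassandra storage size, given as a whole number of GB.
choose_conversion_method() {
  per_node_gb=$1
  if [ "$per_node_gb" -gt "$THRESHOLD_GB" ]; then
    # Bootstrapping this much data during expansion may take very long.
    echo "restore"
  else
    echo "expansion or restore"
  fi
}

# Example: a node holding 2048 GB of metric data.
choose_conversion_method 2048
```

Below the threshold the helper returns both options, because the text does not prefer one method over the other in that case.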