Managed hardware requirements

This topic explains the hardware requirements for installing Dynatrace Managed. For other Dynatrace Managed requirements, see Managed system requirements and Managed hardware recommendations for cloud deployments.

Sizing considerations

Sizing generally consists of these elements:

Be sure to consider each element before proceeding.

General planning

It's not always possible to provision nodes that are sized exactly right, particularly if your environment is subject to ever-increasing traffic levels. While it's useful to do upfront analysis of required size, it's more important to have the ability to add more capacity to your Dynatrace Managed cluster should your monitoring needs increase in the future. To leverage the full benefits of the Dynatrace Managed architecture, be prepared to scale along the following dimensions:

  • Horizontally by adding more nodes. We support installations of up to 30 cluster nodes.
  • Vertically by provisioning more RAM/CPU per node.
  • In terms of data storage, by being able to resize the disk volumes as required (for guidelines on the recommended disk setup, see Storage recommendations below).

For cloud deployments, use the recommended virtual machine equivalents listed in Managed hardware recommendations for cloud deployments.

Hardware requirements

The hardware requirements included in the following table are estimates based on typical environments and load patterns. Requirements for individual environments may vary. Estimates for specific columns take into account the following:

  • Minimum node specifications
    CPU and RAM must be exclusively available for Dynatrace. Power saving mode for CPUs must be disabled. CPUs must run with a clock speed of at least 2 GHz and the host should have at least 32 GB of RAM assigned to it.

  • Transaction Storage
    Transaction data is distributed across all nodes and isn't stored redundantly. In multi-node clusters, transaction data storage is divided by the number of nodes.

  • Long-term Metrics Store
    For multi-node installations, three copies of the metrics store are saved. For four or more nodes, the storage requirement per node is reduced.

    You should treat the 4 TB requirement for the XLarge node as the maximum acceptable size. If you need more capacity, consider adding another node. Plan for long-term metrics store data to occupy at most 50% of your available disk space. On that basis, 4 TB of disk space accommodates 2 TB of long-term metrics store data. While stores larger than 4 TB are possible, they can make database maintenance problematic.
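The 50% rule above reduces to simple arithmetic. The following is a minimal sketch, not an official sizing tool; the function and constant names are hypothetical and only encode the two figures stated in the text (a 50% fill limit and a 4 TB recommended maximum).

```python
# Figures taken from the text above; names are illustrative only.
MAX_RECOMMENDED_DISK_TB = 4.0  # XLarge value treated as the upper bound
MAX_FILL_RATIO = 0.5           # data should use at most 50% of the disk


def metrics_disk_for(data_tb: float) -> float:
    """Disk space (TB) needed so that data_tb fills at most 50% of it."""
    return data_tb / MAX_FILL_RATIO


def needs_another_node(data_tb: float) -> bool:
    """True if the required disk would exceed the recommended 4 TB maximum."""
    return metrics_disk_for(data_tb) > MAX_RECOMMENDED_DISK_TB
```

For instance, `metrics_disk_for(2.0)` gives the 4 TB figure quoted above, and any larger data volume per node suggests adding a node instead of growing the disk.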

Dynatrace Managed

Node type | Max host units¹ monitored (per node) | Peak user actions/min (per node) | Min node specifications | Disk IOPS (per node) | Transaction Storage (10 days code visibility) | Long-term Metrics Store (per node) | Elasticsearch (per node, 35 days retention)
--- | --- | --- | --- | --- | --- | --- | ---
Micro | 50 | 1,000 | 4 vCPUs, 32 GB RAM² | 500 | 50 GB | 100 GB | 50 GB
Small | 300 | 10,000 | 8 vCPUs, 64 GB RAM | 3,000 | 300 GB | 500 GB | 500 GB
Medium | 600 | 25,000 | 16 vCPUs, 128 GB RAM | 5,000 | 600 GB | 1 TB | 1.5 TB
Large | 1,250 | 50,000 | 32 vCPUs, 256 GB RAM | 7,500 | 1 TB | 2 TB | 1.5 TB
XLarge³ | 2,500 | 100,000 | 64 vCPUs, 512 GB RAM | 10,000 | 2 TB | 4 TB | 3 TB

1 A host unit is the size of a host for licensing purposes. The size of a host (in other words, the number of host units that a host comprises for consumption calculations) is based on the number of GBs of RAM available on the host server. The advantage of this approach is its simplicity; technology-specific factors are not taken into consideration (for example, the number of JVMs or the number of microservices that are hosted on a server). It doesn't matter if a host is .NET-based, Java-based, or something else. You can have 10 JVMs or 1,000 JVMs; such factors don't affect the amount of monitoring that an environment consumes. For full details, see Application and Infrastructure Monitoring (Host Units).

2
3 While Dynatrace Managed runs resiliently on instances with 1 TB+ RAM/128 cores (2XLarge) and allows you to monitor more entities, it's not the optimal way of utilizing the hardware. Instead, we recommend that you use smaller instances (Large or XLarge).

Examples

  • To monitor up to 7,500 host units with a peak load of 300,000 user actions per minute, you need 3 extra large (XLarge) nodes, each with 9 TB of storage divided among the storage types.

  • To monitor 500 host units with a peak load of 25,000 user actions per minute, you need 3 small nodes, each with 1.3 TB of storage divided among the storage types. Alternatively, you can use 1 medium node with 3.1 TB of storage.
    We recommend a failover setup of at least 3 nodes rather than a single node, which is less resilient.
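The arithmetic behind these examples can be sketched as follows. This is an illustration under stated assumptions, not an official sizing calculator: the `NODE_TYPES` dictionary and `nodes_needed` function are hypothetical names, the per-node capacities are copied from the table above, and the storage column is simply the sum of the three storage figures for each node type.

```python
import math

# Per-node figures from the Dynatrace Managed table above:
# (max host units, peak user actions/min, total storage in TB)
NODE_TYPES = {
    "Micro": (50, 1_000, 0.2),
    "Small": (300, 10_000, 1.3),
    "Medium": (600, 25_000, 3.1),
    "Large": (1_250, 50_000, 4.5),
    "XLarge": (2_500, 100_000, 9.0),
}


def nodes_needed(node_type, host_units, user_actions_per_min, min_nodes=3):
    """Nodes required to cover both load dimensions.

    min_nodes defaults to 3 to reflect the recommended failover setup;
    pass min_nodes=1 to evaluate a single-node alternative.
    """
    max_hu, max_ua, _storage_tb = NODE_TYPES[node_type]
    by_hosts = math.ceil(host_units / max_hu)
    by_actions = math.ceil(user_actions_per_min / max_ua)
    return max(by_hosts, by_actions, min_nodes)
```

With these inputs, `nodes_needed("XLarge", 7_500, 300_000)` and `nodes_needed("Small", 500, 25_000)` both evaluate to 3, matching the examples, while `nodes_needed("Medium", 500, 25_000, min_nodes=1)` evaluates to 1.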

Dynatrace Managed Premium High Availability

Node type | Max host units monitored (per node) | Peak user actions/min (per node) | Min node specifications | Disk IOPS (per node) | Transaction Storage (10 days code visibility) | Long-term Metrics Store (per node) | Elasticsearch (per node, 35 days retention)
--- | --- | --- | --- | --- | --- | --- | ---
Large | 600 | 25,000 | 32 vCPUs, 256 GB RAM | 7,500 | 1 TB | 2 TB | 1.5 TB
XLarge¹ | 1,250 | 50,000 | 64 vCPUs, 512 GB RAM | 10,000 | 2 TB | 4 TB | 3 TB

1 While Dynatrace Managed runs resiliently on instances with 1 TB+ RAM/128 cores (2XLarge) and allows you to monitor more entities, it's not the optimal way of utilizing the hardware. Instead, we recommend that you use smaller instances (Large or XLarge).

Example

To monitor 7,500 host units with a peak load of 300,000 user actions per minute in the Premium High Availability deployment, you need 6 extra large (XLarge) nodes: 3 nodes in one data center and 3 nodes in the second data center. Each node needs 9 TB of storage divided among the storage types.
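The same calculation for Premium High Availability can be sketched like this. The names are hypothetical, the capacities come from the Premium High Availability table above, and the even split across two data centers is an assumption taken from the 3 + 3 layout in the example rather than a documented rule.

```python
import math

# Per-node figures from the Premium High Availability table above:
# (max host units, peak user actions/min)
PREMIUM_HA_NODE_TYPES = {
    "Large": (600, 25_000),
    "XLarge": (1_250, 50_000),
}


def premium_ha_nodes(node_type, host_units, user_actions_per_min, data_centers=2):
    """Return (total nodes, nodes per data center).

    Assumes nodes are spread evenly across data centers, so the total
    is rounded up to a multiple of data_centers.
    """
    max_hu, max_ua = PREMIUM_HA_NODE_TYPES[node_type]
    total = max(
        math.ceil(host_units / max_hu),
        math.ceil(user_actions_per_min / max_ua),
    )
    per_dc = math.ceil(total / data_centers)
    return per_dc * data_centers, per_dc
```

For the example above, `premium_ha_nodes("XLarge", 7_500, 300_000)` yields 6 nodes in total, 3 per data center.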

Storage recommendations

Dynatrace Managed stores multiple types of monitoring data, depending on the use case.

We recommend:

  • Storing Dynatrace binaries and the data store on separate mount points to allow the data store to be resized independently.
  • Not keeping Dynatrace data storage on the root volume to avoid additional complexity when resizing the disk later, if required.
  • Excluding data storage paths from antivirus scanning to prevent a break in data consistency.
  • Mounting different types of data storage on separate disk volumes for maximum flexibility and performance.
  • Creating resizable disk partitions (for example, by leveraging Logical Volume Manager [LVM]).
  • Making the same partition size on all cluster nodes.

The required disk size can vary considerably depending on usage.

For example, consider a cluster of two nodes with 10 TB disks each, where transaction storage occupies only 1.5 TB on each node. A node added to this cluster needs a disk of at least 1 TB: the data on Node 1 and Node 2 divided by all 3 nodes, or (1.5 TB + 1.5 TB) / 3 = 1 TB. In a similar cluster where 9 TB of each disk is full, the additional node needs at least 6 TB: (9 TB + 9 TB) / 3 = 6 TB.

The minimum disk size for the additional node therefore ranges from 1 TB to 6 TB in these examples. Anything below the applicable minimum is considered a misconfiguration. Keep in mind that this deviation depends on the disk usage of all contributors, not only transaction storage. For example, Session Replay can also trigger a misconfiguration based on its disk usage.
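The rule used in this example (data on the existing nodes divided by the node count after the addition) can be written as a short helper. This is a minimal sketch; the function name is hypothetical, and it covers only the case of adding one node at a time, as in the text.

```python
def min_disk_for_new_node(used_tb_per_node):
    """Minimum disk size (TB) for one node added to the cluster.

    used_tb_per_node: disk usage in TB on each existing node.
    The total is divided by the node count after the addition.
    """
    return sum(used_tb_per_node) / (len(used_tb_per_node) + 1)
```

Calling it with `[1.5, 1.5]` reproduces the 1 TB minimum, and `[9, 9]` reproduces the 6 TB minimum.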

OneAgent opt-out

OneAgent self-monitoring is enabled by default. An opt-out installation parameter is available:

--install-agent <on|off>

Supported file systems

Dynatrace Managed operates on all common file systems. We recommend that you select fast local storage appropriate for database workloads. High-latency remote volumes such as NFS or CIFS aren't recommended. While NFS file systems are sufficient for backup purposes, we don't recommend them for primary storage.

Amazon Elastic File System

We don't support or recommend Amazon Elastic File System (EFS) as main storage for Elasticsearch. Such file systems don't offer the behavior that Elasticsearch requires, and this may lead to index corruption.

Log Monitoring Classic recommendations

Requirement for Log Monitoring Classic:

  • All cluster nodes should have at least 64 GB RAM total host memory.

Additional recommendations for installation:

  • For a more robust configuration, it's better to add more cluster nodes than to increase hardware on each node.
  • Distribute additional Elasticsearch storage equally across cluster nodes.
  • Add CPUs and RAM to existing cluster nodes such that nodes remain equally sized.
  • Update the cluster ingest limit based on available resources for the cluster via API call or Cluster Management Console whenever cluster hardware changes.
    To adjust the ingest limit, in Cluster Management Console go to Environments, select your environment and adjust the limit in the Cluster overload prevention settings section.
  • Contact a Dynatrace product expert via live chat with additional questions regarding hardware recommendations.
  • These recommendations are in addition to any requirements from other traffic sources.
  • Log events are stored in the Elasticsearch storage.
  • Log events are stored in 2 copies.
  • Keep in mind that the retention time for log events is 35 days.
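The last three points (Elasticsearch storage, 2 copies, 35 days of retention) allow a rough estimate of the additional Elasticsearch storage per node. This is a back-of-the-envelope sketch only: `daily_ingest_gb` is a hypothetical input you would measure yourself, and the estimate ignores index overhead and compression, which are not described here.

```python
LOG_COPIES = 2            # log events are stored in 2 copies
LOG_RETENTION_DAYS = 35   # retention time for log events


def log_storage_per_node_gb(daily_ingest_gb, node_count):
    """Rough additional Elasticsearch storage per node, in GB.

    Assumes storage is distributed equally across cluster nodes,
    as recommended above.
    """
    total_gb = daily_ingest_gb * LOG_RETENTION_DAYS * LOG_COPIES
    return total_gb / node_count
```

For example, 30 GB of log events per day on a 3-node cluster works out to roughly 700 GB of additional Elasticsearch storage per node.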

Multi-node installations

We recommend multi-node setups for failover and data redundancy. A sufficiently sized 3-node cluster is the recommended setup. For Dynatrace Managed installations with more than one node, all nodes must:

  • Have the same hardware configuration
  • Be synchronized with NTP
  • Be in the same time zone
  • Be able to communicate over a private network on multiple ports
  • Have a latency between nodes of around 10 ms or less
  • Use the same UID:GID identifiers for the system users created for Dynatrace Managed

Avoid split-brain sync problems

While two-node clusters are technically possible, we don't recommend them. Our storage systems are consensus-based and require a majority for data consistency. That's why a two-node cluster is vulnerable to the split-brain problem and should be treated as a temporary state when migrating to 3 or more nodes. Running two nodes may cause availability problems or data inconsistencies when the nodes stop communicating and synchronizing with each other and end up as two separate, overlapping data sets (in effect, two single-node clusters).
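The majority requirement explains why two nodes offer no more fault tolerance than one. The sketch below applies the generic majority-quorum rule; it is an illustration of the principle, not a description of how the Dynatrace storage systems are implemented, and the function names are hypothetical.

```python
def majority(node_count):
    """Smallest number of nodes that forms a strict majority."""
    return node_count // 2 + 1


def tolerated_failures(node_count):
    """Nodes that can be lost while a majority remains reachable."""
    return node_count - majority(node_count)
```

Under this rule a two-node cluster needs both nodes for a majority, so it tolerates no failures, whereas a three-node cluster tolerates the loss of one node.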