Adaptive Traffic Management manages the sampling rate dynamically and targets a specific peak trace data volume. This volume scales with the number of active host units in your Dynatrace Full-Stack Classic license.
A full-service call is a server-side call that starts a distributed trace, a service call at a deep monitored tier, or a custom service call. A single distributed trace can contain multiple full-service calls. Full-service calls include all requests for web request services and web services (except for external ones), RMI services, messaging services, and custom services. External requests (such as database calls, external web requests, or generally any opaque service call) are not full-service calls, and so aren't counted against your traffic limit. The minimum number of full-service calls per minute in a given environment is 5,000 (the equivalent of 20 host units). Each process can start between 50 and 50,000 full-service calls per minute.
Active host units are the host units currently in use and connected to the environment (not the host units assigned to the environment).
Host units currently assigned and connected to the environment, but not necessarily in use.
Dynatrace Full-Stack Monitoring packages a variety of features, including fully automatic distributed tracing. Each monitored application or microservice is constantly monitored and the Dynatrace code module collects distributed traces, containing code-level and business insights, that are sent to Dynatrace.
Full-Stack Monitoring includes a trace data volume. Depending on the number of application transactions, OneAgent captures end-to-end traces up to a peak trace volume, which is defined per environment by your license. When the volume of transactions is high, the number of traces that OneAgent could capture might exceed the peak trace volume available in your environment. In other words, there are not enough active host units connected to your environment to capture all traces.
When this happens, OneAgent starts sampling new incoming traces that have a trace root span. Adaptive Traffic Management samples incoming traces intelligently, which prevents overages and consequently saves network bandwidth.
The resulting capture rate is defined as the OneAgent capture rate. While not all possible traces might be captured, any trace that is captured represents a full end-to-end transaction.
Sampling rate distribution
Most sampling mechanisms manage traffic on the agent in an isolated and uncoordinated manner. In Dynatrace, the trace volume is dynamically shared between all monitored applications in the environment. In a sense, low-volume applications share their unused trace volume with high-volume applications that need it.
Sampling rate scenarios
In static sampling systems, you configure a fixed sampling rate for your deployment and distribute it across your deployment to apply different rates to different scenarios. Adaptive Traffic Management captures requests automatically and efficiently; you can adjust the default logic and configure the capture rate for specific requests via URL-based sampling.
Costs associated with captured data
In static sampling systems, the amount of captured data depends on the number of transactions executed in your system, which cannot be determined in advance. Therefore, the associated costs are often hard to predict. With Adaptive Traffic Management, the cost is determined by your license and capturing scales with it.
We recognize that the distribution of requests and their relevance to your observability goals is not even. It's rather a combination of: a large number of unique URLs, a medium number of important requests, and, finally, a few kinds of requests that make up the majority of the traffic (for example, image requests or status checks).
OneAgent first calculates a list of top requests starting each minute, from which it then captures:
The trace volume is dynamically shared on the environment level between all monitored applications. In a sense, low-volume applications share their unused trace volume with high-volume applications that need it. This ensures that you use your trace volume most effectively. Because the sampling is not random, important data is captured while maintaining a statistically valid sample set.
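The sharing of unused trace volume can be pictured with a small sketch. This is not Dynatrace's actual algorithm, which this page doesn't describe in detail; it is a generic max-min fair ("water-filling") split, and the function name, application names, and numbers are all hypothetical.

```python
def share_trace_volume(demand: dict[str, int], peak_volume: int) -> dict[str, int]:
    """Split peak_volume across applications so that low-volume
    applications are fully served and their unused share flows to
    high-volume applications (generic max-min fairness, an assumption)."""
    allocation = {app: 0 for app in demand}
    remaining = peak_volume
    unsatisfied = {app for app, calls in demand.items() if calls > 0}

    while unsatisfied and remaining > 0:
        fair_share = remaining // len(unsatisfied)
        if fair_share == 0:
            # Fewer calls left than applications; stop rather than split further.
            break
        for app in sorted(unsatisfied):
            grant = min(fair_share, demand[app] - allocation[app])
            allocation[app] += grant
            remaining -= grant
        # Applications whose demand is met drop out; their leftover is re-shared.
        unsatisfied = {app for app in unsatisfied if allocation[app] < demand[app]}

    return allocation


# Hypothetical demand in full-service calls per minute.
demand = {"checkout": 200, "search": 300, "frontend": 2000}
allocation = share_trace_volume(demand, peak_volume=1500)
print(allocation)
```

In this hypothetical run, the two low-volume applications are captured completely and the high-volume application receives the remaining volume, so only its traffic is sampled.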
The following table represents a top-request calculation example, along with the respective capture rates.
Request | Number of requests processed by the application (per minute) | Capture factor | Captured distributed traces (per minute) |
---|---|---|---|
URI A | 900 | 1/2 | 450 |
URI B | 440 | 1/2 | 220 |
URI C | 250 | 1 | 250 |
URI D | 60 | 1 | 60 |
…50 other URIs | 100 | 1 | 100 |
Total: | 1750 | | 1080 |
In this example, OneAgent can capture a bit more than 1,000 requests per minute, according to the number of active host units connected to the license. Adaptive Traffic Management adjusts the capture rate for each URI to meet the target. Depending on the capture factor, URIs are captured each time the application processes them (URIs C, D, and 50 other URIs) or half of the time (URIs A and B). In both cases, the requests are captured end-to-end.
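The arithmetic behind the example can be checked with a few lines of Python. The sketch below only applies the capture factors listed in the table to the per-minute request counts; it doesn't show how OneAgent chooses those factors, and the variable names are illustrative.

```python
from fractions import Fraction

# (requests per minute, capture factor), copied from the example table.
requests = {
    "URI A": (900, Fraction(1, 2)),
    "URI B": (440, Fraction(1, 2)),
    "URI C": (250, Fraction(1)),
    "URI D": (60, Fraction(1)),
    "50 other URIs": (100, Fraction(1)),
}

# Captured traces per URI = processed requests x capture factor.
captured = {uri: int(count * factor) for uri, (count, factor) in requests.items()}

total_requests = sum(count for count, _ in requests.values())
total_captured = sum(captured.values())
overall_capture_rate = total_captured / total_requests

for uri, traces in captured.items():
    print(f"{uri}: {traces} traces/min")
print(f"Captured in total: {total_captured} traces/min")
print(f"Overall capture rate: {overall_capture_rate:.1%}")
```

Only the two highest-volume URIs are sampled, yet that is enough to bring the captured total down to 1,080 traces per minute while every lower-volume URI is still captured in full.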
The peak trace volume is measured in full-service calls per minute and is calculated based on the number of active host units in your environment.
Each environment processes a minimum trace volume. For each active host unit, the environment peak trace volume is increased by a fixed value.
Every 15 minutes, the peak trace volume is calculated and automatically adjusted based on the current number of active host units.
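As a rough illustration of this calculation, the sketch below makes two assumptions that are not stated outright on this page: the minimum volume acts as a floor, and the fixed per-host-unit value is 250 full-service calls per minute (inferred from the minimum of 5,000 calls being described as the equivalent of 20 host units). The actual values are defined by your license, and the names used here are hypothetical.

```python
# Illustrative constants inferred from this page, not read from any Dynatrace API.
MIN_PEAK_VOLUME = 5_000          # minimum full-service calls per minute per environment
ASSUMED_FSC_PER_HOST_UNIT = 250  # assumption: 5,000 calls / 20 host units


def peak_trace_volume(active_host_units: float) -> int:
    """Estimate the peak trace volume (full-service calls per minute)
    for the given number of active host units."""
    if active_host_units < 0:
        raise ValueError("active_host_units must be non-negative")
    scaled = int(active_host_units * ASSUMED_FSC_PER_HOST_UNIT)
    # Assumption: the environment minimum is a floor, not an additive base.
    return max(MIN_PEAK_VOLUME, scaled)


if __name__ == "__main__":
    for units in (4, 20, 100):
        print(f"{units} active host units -> {peak_trace_volume(units)} calls/min")
```

Under these assumptions, an environment with 20 or fewer active host units stays at the minimum, and the volume grows linearly beyond that.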
To learn more about the trace included volume of Full-Stack, see Application and Infrastructure Monitoring.
If OneAgent is sampling and not all requests are captured, captured traces indicate that similar requests have not been captured with the message [number of traces] similar trace. You can see it by expanding the trace in the Distributed Traces Classic list.
To monitor your environment trace capture rate and volume ingress, go to Dashboards or Dashboards Classic (latest Dynatrace) and select the OneAgent Traces - Adaptive traffic management (Classic License) dashboard.
Tile | Description |
---|---|
Dynatrace process rate | Percent of full-service calls processed by OneAgent over all full-service calls received by the environment. It represents the environment's health. If the value is continuously below 90%, contact a Dynatrace product expert via live chat within your environment. |
OneAgent capture rate | Percent of traces captured end-to-end by OneAgent over all traces received by the environment. Values below 100% indicate sampling was applied to reduce trace volume down to within licensed limits. |
Captured full service calls | Indicates the number of received full-service calls (blue), the peak trace volume (red), and the overall potentially traceable service calls (green) processed by OneAgent over time [1]. |
Size of full-service calls | Amount of data per service call [2] over time. Typical values are around 2-3 KiB per environment. |
FSC/HU | Number of full-service calls per active host unit. Typical values are around 250. |

[1] Values above the licensed limit indicate overages, for which sampling is triggered. After sampling is applied, values return to within licensed limits and the OneAgent capture rate will be below 100%.
[2] Service call data includes request attributes, span attributes, HTTP headers, and bind variables.
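To make the tile definitions concrete, here is a minimal sketch of how the percentages and the FSC/HU ratio relate to raw counts. It doesn't query Dynatrace; the function names and input values are hypothetical, and "continuously below 90%" is interpreted here as every sample in a window being below the threshold.

```python
def capture_rate_percent(captured_traces: int, received_traces: int) -> float:
    """Percent of traces captured end-to-end over all traces received."""
    if received_traces == 0:
        return 100.0  # nothing to capture, so nothing was sampled away
    return 100.0 * captured_traces / received_traces


def fsc_per_host_unit(full_service_calls_per_min: int, active_host_units: float) -> float:
    """Full-service calls per active host unit (the FSC/HU tile)."""
    if active_host_units <= 0:
        raise ValueError("active_host_units must be positive")
    return full_service_calls_per_min / active_host_units


def process_rate_needs_attention(samples: list[float], threshold: float = 90.0) -> bool:
    """True if every process-rate sample in the window is below the threshold."""
    return bool(samples) and all(sample < threshold for sample in samples)


if __name__ == "__main__":
    print(f"Capture rate: {capture_rate_percent(1080, 1750):.1f}%")
    print(f"FSC/HU: {fsc_per_host_unit(25_000, 100):.0f}")
    print(f"Needs attention: {process_rate_needs_attention([85.0, 88.0, 89.5])}")
```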
Usually not at all.
Traffic shaping is accounted for transparently and done in a way that ensures statistical validity while capturing rare requests with high probability. All charts show the total number of requests that your application processes, and these numbers are either accurate or have very high statistical validity. The same is true for all ad-hoc analyses. You will not see a difference in charts or service call analysis data unless you're looking at a single distributed trace.
No, Adaptive Traffic Management focuses only on the number of traces. Neither service settings nor (global) request settings are modified by Adaptive Traffic Management. Depending on the capture rate and sampling, a low-volume or unique request might not be captured. Service settings such as request naming rules and key request settings will apply only to captured traces.
Yes, in a few cases, as service monitoring metrics are based on captured traces. The following are some known effects.
In Adaptive Traffic Management with Classic license, the calculation of peak trace volume is based on the currently active host units. To use all the host units that are assigned to the environment, contact a Dynatrace product expert via live chat and provide a rationale.
This is an environment-wide change, so you need Dynatrace administrator permissions to turn this feature on.
If the OneAgent capture rate is below 100%, sampling has been applied because the number of traces that OneAgent could capture has exceeded the trace volume included with Full-Stack. There are several things you can do to increase the capture rate:
Verify what is currently being captured and reduce the rate for traces and requests of lower relevance. Start by looking at the following:
Excessive custom services
Poorly configured custom services can lead to a high number of full-service calls and increase the trace ingress volume. If custom services are consuming a considerable amount of the trace volume, revisit the configuration to reduce the number of captured custom-service calls.
Background activity services
In certain environments, background activity produces a lot of service calls but adds little value. To disable this feature on an environment-wide level, go to Settings > OneAgent features and turn off Background Requests for Services (HTTP/GRPC) and Background Requests for Services (Messaging).
High number of low-value traces
In all environments, there are transactions for which traces are of lower value. You can exclude the following transactions from capture:
Specific transactions such as ping and health check traces. To disable tracing for specific traces, go to Settings > Server-side service monitoring > Deep monitoring.