Dynatrace Full-Stack Monitoring offers a variety of features, including distributed tracing for applications via the patented PurePath® technology. Each monitored application or microservice continuously produces distributed traces, containing code-level and business insights, that are sent to Dynatrace.
Depending on the number of application transactions, OneAgent captures end-to-end traces every minute up to a peak trace volume, which is defined per environment by your license. When the volume of transactions is high, the number of traces that OneAgent could capture might exceed the peak trace volume available in your environment. When this happens, OneAgent starts sampling new incoming traces that have a trace root span, using the intelligent mechanism of Adaptive traffic management. This prevents overages and consequently saves network bandwidth.
The resulting capture rate is defined as the OneAgent capture rate. While not all possible traces might be captured, any trace that is captured represents a full end-to-end transaction.
In typical applications, the distribution of requests is not even. It's rather a combination of: a large number of unique URLs, a medium number of important requests, and, finally, a few kinds of requests that make up the majority of the traffic (for example, image requests or status checks).
With Adaptive traffic management, OneAgent first calculates a list of top requests starting each minute, from which it then captures:
Because the sampling is not random, all important data is captured while maintaining a statistically valid sample set.
The following table represents a top-request calculation example, along with the respective capture rates.
Request | Number of requests processed by the application | Capture factor | Captured distributed traces |
---|---|---|---|
URI A | 900 | 1/2 | 450 |
URI B | 440 | 1/2 | 220 |
URI C | 250 | 1 | 250 |
URI D | 60 | 1 | 60 |
…50 other URIs | 100 | 1 | 100 |
Total: | 1750 | | 1080 |
In this example, OneAgent captures a bit more than 1,000 requests/min, in line with the configured target number of requests. Depending on the capture factor, URIs are captured each time (URIs C, D, and 50 other URIs) or only 50% of the time (URIs A and B). Even in this last case, requests are traced end-to-end by OneAgent over 600 times/minute.
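The arithmetic behind the example table can be reproduced with a short sketch. The data structure and function names below are illustrative only and are not part of any Dynatrace API; the capture factors are taken directly from the table rather than computed the way OneAgent would compute them.

```python
from fractions import Fraction

# Rows from the example table: (request, requests per minute, capture factor).
TOP_REQUESTS = [
    ("URI A", 900, Fraction(1, 2)),
    ("URI B", 440, Fraction(1, 2)),
    ("URI C", 250, Fraction(1)),
    ("URI D", 60, Fraction(1)),
    ("50 other URIs", 100, Fraction(1)),
]


def captured_traces(requests: int, capture_factor: Fraction) -> int:
    """Expected number of traces captured per minute for one request type."""
    return int(requests * capture_factor)


captured = {name: captured_traces(n, f) for name, n, f in TOP_REQUESTS}

# Total captured traces across all request types.
total_captured = sum(captured.values())

# Traces captured for the request types that are sampled (factor below 1).
sampled_captured = sum(captured[name] for name, _, f in TOP_REQUESTS if f < 1)

print(captured)
print(total_captured)    # 1080
print(sampled_captured)  # 670
```

The sampled request types (URIs A and B) still contribute 670 traces per minute, which is why the text can say they are traced over 600 times per minute.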
You can see the effect of Adaptive traffic management in the distributed trace list. If OneAgent is sampling and not all requests are captured, captured traces indicate that similar requests have not been captured with the message "[amount] more like this" in the distributed trace list.
In this way, OneAgent reduces the data sent to your environment, ensuring that the amount of captured traces stays within the limits of your Dynatrace agreement.
Most sampling mechanisms manage traffic on the agent in an isolated and uncoordinated manner. Dynatrace, on the other hand, manages the peak trace volume at the environment level, so the available volume is dynamically shared between all monitored applications. In a sense, low-volume applications share their unused trace volume with high-volume applications that need it. Note that the peak trace volume is available for all traces sent by OneAgent code modules or via OneAgent Trace API.
Dynatrace automatically manages the peak trace volume based on your license. To learn more about Adaptive traffic management, see either Adaptive traffic management with Dynatrace Platform Subscription (DPS) or Adaptive traffic management with classic licensing below.
If your environment uses an earlier version of DPS, refer to Adaptive traffic management with classic licensing.
In Adaptive traffic management with the latest version of Dynatrace Platform Subscription (DPS), the peak trace volume is measured in Byte/minute and is calculated based on the number of Gibibytes that contribute to your environment's GiB-hour.
Each environment can process a minimum trace volume of 14 Mebibyte/min. For each contributing Gibibyte, the environment peak trace volume is increased by 45 Kibibyte/min. You can calculate the peak trace volume using the metric expression below.
dsfm:billing.fullstack.maximum_included_trace_volume_per_minute=(builtin:billing.full_stack_monitoring.usage:last:splitBy():last*4)*(45*1024)
Every 15 minutes, the peak trace volume is calculated and automatically adjusted based on the average contributing Gibibyte over the prior 15 minutes.
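As a rough illustration of the numbers above, the following sketch computes a peak trace volume from a count of contributing Gibibytes. It assumes that the 14 Mebibyte/min minimum acts as a floor under the per-Gibibyte value; the metric expression itself contains no such floor, so treat that as an interpretation of the text. The function and constant names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

MIN_PEAK_BYTES_PER_MIN = 14 * MIB      # minimum trace volume stated in the text
BYTES_PER_CONTRIBUTING_GIB = 45 * KIB  # increase per contributing Gibibyte


def dps_peak_trace_volume(contributing_gib: float) -> float:
    """Peak trace volume in bytes per minute.

    Assumption: the minimum is applied as a floor, not added on top.
    """
    return max(MIN_PEAK_BYTES_PER_MIN, contributing_gib * BYTES_PER_CONTRIBUTING_GIB)


# A small environment stays at the minimum.
print(dps_peak_trace_volume(100) / MIB)   # 14.0

# A larger environment scales with its contributing Gibibytes.
print(dps_peak_trace_volume(1600) / MIB)  # 70.3125
```

Under this reading, an environment needs roughly 320 contributing Gibibytes before the per-Gibibyte value exceeds the minimum.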
With Dynatrace Platform Subscription (DPS), all features are available, including data-heavy ones such as bind variables capture.
To monitor your environment trace capture rate and volume ingress, use the preset dashboard for Adaptive traffic management with Dynatrace Platform Subscription (DPS).
To open the dashboard, go to Dashboards or Dashboards Classic (latest Dynatrace) and select the OneAgent Traces - Adaptive traffic management dashboard.
DPS License
Tile | Description |
---|---|
Dynatrace process rate | Percent of full-service calls processed by OneAgent over all full-service calls received by the environment. It represents the environment's health. If the value is continuously below 90%, please contact a Dynatrace product expert via live chat within your environment. |
OneAgent capture rate | Percent of traces captured end-to-end by OneAgent over all traces received by the environment. Values below 100% indicate sampling was applied to reduce trace volume down to within licensed limits. |
Processed and received full-service calls | Indicates the amount of received full-service calls (blue), the peak trace volume (red), and the overall potentially traceable service calls (green) processed by the OneAgent over time. |
Size of full-service calls | Amount of data per service call¹. Typical values are around 2–3 Kibibytes per environment. Excessive usage of data-heavy features like bind variables can increase it and lead to a larger overall trace volume². |
Trace Ingress/contributing GiB | Amount of trace volume per contributing Gibibyte. Typical values are around 45 Kibibytes. |
Trace volume ingress | Indicates the captured trace volume² (green) and the licensed volume (red) over time³. |
Trace volume used |
¹ Service call data includes request attributes, span attributes, HTTP headers, and bind variables.
² The trace volume is roughly equal to [amount] full-service calls * Size of full-service calls.
³ Values above the licensed limit indicate overages, for which sampling is triggered. After sampling is applied, values return to within licensed limits and the OneAgent capture rate will be below 100%.
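The relationship between call count, call size, and the licensed limit can be sketched as follows. This is a simplified illustration with hypothetical function names: it estimates the trace volume as calls multiplied by average size, and then computes the uniform fraction of traffic that would fit under a given limit. The actual mechanism samples per request type rather than uniformly, so the fraction is only a rough indication of the capture rate to expect.

```python
KIB = 1024
MIB = 1024 * KIB


def estimated_trace_volume(full_service_calls: int, avg_call_size_bytes: float) -> float:
    """Approximate trace volume in bytes: number of calls times average size."""
    return full_service_calls * avg_call_size_bytes


def required_capture_fraction(volume_bytes: float, limit_bytes: float) -> float:
    """Uniform fraction of traffic that fits within the limit (1.0 = no sampling).

    Assumption: uniform sampling, used here only as a back-of-the-envelope check.
    """
    if volume_bytes <= limit_bytes:
        return 1.0
    return limit_bytes / volume_bytes


# Example: 40,000 calls per minute at 2.5 KiB each, against a 72,000 KiB/min limit.
volume = estimated_trace_volume(40_000, 2.5 * KIB)
limit = 72_000 * KIB

print(volume / MIB)                              # 97.65625
print(required_capture_fraction(volume, limit))  # 0.72
```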
In Adaptive traffic management with classic licensing there are two versions, Version 2 and Version 3. Depending on the version, the peak trace volume is measured either in Full-service call/minute or Byte/minute.
For both versions, the environment peak trace volume is calculated based on the active host units in your environment. Each environment can process a minimum peak trace volume of approximately 20 host units. For each active host unit, the environment peak trace volume is increased by a version-specific fixed value. Every 15 minutes, the peak trace volume is calculated and automatically adjusted based on the current number of active host units.
Specifications | Version 2 | Version 3 |
---|---|---|
Unit of measurement | Full-service call/minute | Byte/minute |
Min. trace volume | 5000 Full-service call/min | 14 Mebibyte/min |
Peak trace volume | | |
Metric expression | | |
Data-heavy features | 🟡 Partial availability | 🟢 All available |
The trace volume that your environment can process is similar in both versions of Adaptive traffic management with classic licensing. A full-service call typically needs 2–3 Kibibytes in trace volume. For example, a moderate environment of 50 hosts with 32 GB each (= 100 host units) can process up to 25,000 full-service calls per minute in Version 2 (around 49–73 Mebibytes). The same environment can process up to 70.3 Mebibytes of traces per minute in Version 3.
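The example figures can be checked with a small calculation. The per-host-unit values below (250 full-service calls and 720 Kibibytes per minute) are not stated as official constants here; they are derived from the example itself and from the typical values listed for the dashboard tiles, and the 20-host-unit minimum is treated as a floor. All names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumptions derived from the example: 25,000 calls and about 70.3 MiB per 100 host units.
V2_CALLS_PER_HOST_UNIT = 250
V3_BYTES_PER_HOST_UNIT = 720 * KIB
MIN_HOST_UNITS = 20  # minimum peak trace volume corresponds to about 20 host units


def v2_peak_calls(host_units: int) -> int:
    """Version 2 peak trace volume in full-service calls per minute."""
    return max(host_units, MIN_HOST_UNITS) * V2_CALLS_PER_HOST_UNIT


def v3_peak_bytes(host_units: int) -> int:
    """Version 3 peak trace volume in bytes per minute."""
    return max(host_units, MIN_HOST_UNITS) * V3_BYTES_PER_HOST_UNIT


host_units = 100  # 50 hosts with 32 GB each
calls = v2_peak_calls(host_units)

print(calls)                            # 25000
print(calls * 2 * KIB / MIB)            # 48.828125 (at 2 KiB per call)
print(calls * 3 * KIB / MIB)            # 73.2421875 (at 3 KiB per call)
print(v3_peak_bytes(host_units) / MIB)  # 70.3125
```

With these values, the minimums also line up: 20 host units give 5,000 full-service calls per minute in Version 2 and about 14 Mebibytes per minute in Version 3.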
To monitor your environment trace capture rate and volume ingress, use the preset dashboard that matches your version of Adaptive traffic management with classic licensing.
To open the dashboard, go to Dashboards or Dashboards Classic (latest Dynatrace) and select the OneAgent Traces - Adaptive traffic management dashboard.
Classic licensing
Tile | Description |
---|---|
Dynatrace process rate | Percent of full-service calls processed by OneAgent over all full-service calls received by the environment. It represents the environment's health. If the value is continuously below 90%, please contact a Dynatrace product expert via live chat within your environment. |
OneAgent capture rate | Percent of traces captured end-to-end by OneAgent over all traces received by the environment. Values below 100% indicate sampling was applied to reduce trace volume down to within licensed limits. |
Processed and received full-service calls | Indicates the amount of received full-service calls (blue), the peak trace volume (red), and the overall potentially traceable service calls (green) processed by the OneAgent over time. |
Size of full-service calls | Amount of data per service call¹. Typical values are around 2–3 Kibibytes per environment. Excessive usage of data-heavy features like bind variables can increase it and lead to a larger overall trace volume². |
Trace Ingress/in use HU | Amount of trace volume per active host unit. Typical values are around 720 Kibibytes. |
Trace volume ingress | Indicates the captured trace volume² (green) and the licensed volume (red) over time³. |
Trace volume used |
¹ Service call data includes request attributes, span attributes, HTTP headers, and bind variables.
² The trace volume is roughly equal to [amount] full-service calls * Size of full-service calls.
³ Values above the licensed limit indicate overages, for which sampling is triggered. After sampling is applied, values return to within licensed limits and the OneAgent capture rate will be below 100%.
Tile | Description |
---|---|
Dynatrace process rate | Percent of full-service calls processed by OneAgent over all full-service calls received by the environment. It represents the environment's health. If the value is continuously below 90%, please contact a Dynatrace product expert via live chat within your environment. |
OneAgent capture rate | Percent of traces captured end-to-end by OneAgent over all traces received by the environment. Values below 100% indicate sampling was applied to reduce trace volume down to within licensed limits. |
Processed and received full-service calls | Indicates the amount of received full-service calls (blue), the peak trace volume (red), and the overall potentially traceable service calls (green) processed by the OneAgent over time¹. |
Size of full-service calls | Amount of data per service call² over time. Typical values are around 2–3 Kibibytes per environment. |
FSC/HU | Number of full-service calls per active host unit. Typical values are around 250. |
¹ Values above the licensed limit indicate overages, for which sampling is triggered. After sampling is applied, values return to within licensed limits and the OneAgent capture rate will be below 100%.
² Service call data includes request attributes, span attributes, HTTP headers, and bind variables.
The short answer is, not at all.
The shaping of traffic is accounted for transparently and done in a way that ensures statistical validity while capturing rare requests with high probability. All charts show the total real number of requests that your application processes, as does all ad-hoc analysis you might perform. You will not see a difference in charts or service call analysis data unless you're looking at a single distributed trace. Indeed, the only place where this traffic shaping is visible is in the distributed traces list, which displays a message like "[number of traces] more like this".
No, adaptive traffic management focuses only on the number of traces. Neither service settings nor (global) request settings are modified by adaptive traffic management. Depending on the capture rate and sampling, a low-volume or unique request might not be captured. Service settings such as request naming rules and key request settings will apply only to captured traces.
Yes, in a few cases, as service monitoring metrics are based on captured traces. The following are some known effects.
If the OneAgent capture rate is below 100%, sampling has been applied because the number of traces that OneAgent could capture has exceeded the licensed limit. You can increase the capture rate by not capturing traces and service calls of lower relevance. Start by looking at the following:
Excessive custom services
Custom services with poor configurations can lead to a high number of full-service calls and increase the trace ingress volume. If custom services are consuming a considerable amount of the trace volume, revisit the configuration to reduce the number of captured custom-service calls.
Background activity services
In certain environments, background activity produces a lot of service calls but adds little value. Currently, you can only disable this feature completely and not case by case. For more information, contact a Dynatrace product expert via live chat within your environment.
Many very small service calls in Version 2 (classic licensing)
If there are many small (below 2.5 KiB) service calls in your environment, we recommend that you switch from Version 2 to Version 3. Version 3 focuses on the trace volume in terms of bytes instead of the number of service calls, leading to a higher capture rate.
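To see why small service calls benefit from a byte-based limit, compare how many calls fit into each version's per-host-unit budget. The two budget values below (250 calls and 720 Kibibytes per host unit per minute) are assumptions taken from the typical dashboard values and the worked example for classic licensing, not official constants, and the function name is hypothetical.

```python
KIB = 1024

V2_CALLS_PER_HOST_UNIT = 250        # assumed Version 2 budget: calls per minute
V3_BYTES_PER_HOST_UNIT = 720 * KIB  # assumed Version 3 budget: bytes per minute


def v3_calls_per_host_unit(avg_call_size_bytes: float) -> float:
    """Number of calls of a given average size that fit into the Version 3 byte budget."""
    return V3_BYTES_PER_HOST_UNIT / avg_call_size_bytes


for size_kib in (1.0, 2.5, 3.0):
    v3_calls = v3_calls_per_host_unit(size_kib * KIB)
    print(f"{size_kib} KiB per call: Version 2 = {V2_CALLS_PER_HOST_UNIT}, Version 3 = {v3_calls:.0f}")
```

Under these assumptions, calls averaging 1 KiB allow 720 captured calls per host unit in Version 3 compared to 250 in Version 2, while the advantage shrinks as the average call size grows.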
High number of low-value traces
In all environments, there are transactions for which traces are of lower value. You can exclude from capture:
Unused assigned host units
See Dynatrace is not using all available host units in my classic licensing. What can I do?
You can learn which version of Adaptive traffic management is active in your environment by looking at the preset dashboard. If your environment is using
Classic licensing
In both versions of Adaptive traffic management with classic licensing, the calculation of peak trace volume is based on the currently active host units by default. To use all the host units assigned to the environment, contact a Dynatrace product expert via live chat and provide a rationale.
This is an environment-wide change, so you need Dynatrace administrator permissions to turn this feature on.
Classic licensing Dynatrace version 1.264+
Please contact a Dynatrace product expert via live chat within your environment.
This is an environment-wide change, so you need Dynatrace administrator permissions to turn on this feature.
A server-side call that starts one of the following: a distributed trace, a service call at a deep monitored tier, or a custom service call. A single distributed trace can contain multiple full-service calls.
Full-service call | |
---|---|
All requests for web request services and web services (except for external ones), RMI services, messaging services and custom services are full-service calls. | |
External calls (such as database calls, external web requests, or generally any opaque service call) are not full-service calls, and so aren't counted against your traffic limit. |
The minimum number of full-service calls per minute in a given environment is 5,000 (the equivalent of 20 host units). Each process can start between 50 and 50,000 full-service calls per minute.
Host units currently in use and connected to the environment (not the host units assigned to the environment).