Adaptive Traffic Management manages the sampling rate dynamically and targets a specific trace data volume. This volume scales according to the amount of Full-Stack GiB-hours of memory connected to your environment.
Dynatrace Full-Stack Monitoring packages a variety of features, including fully automatic distributed tracing. Each application or microservice is monitored continuously, and the Dynatrace code module collects distributed traces, containing code-level and business insights, and sends them to Dynatrace.
Full-Stack Monitoring includes a defined amount of trace data volume. This volume depends on the amount of Full-Stack GiB-hours of memory connected to Dynatrace. Every contributing gibibyte of host or application memory adds a certain amount of trace volume ingest rate to your environment.
In a dynamic cloud environment, this amount can change all the time. Adaptive Traffic Management automatically adjusts the sampling rate of trace data collection so that the collected trace data doesn't exceed the included trace volume in 15-minute intervals. This way, Dynatrace guarantees that no trace data ingest overage is produced without your explicit consent.
In many cases this results in a trace data capture rate of 100% of all possible traces. However, depending on the monitored applications and the configuration of certain features in your environment, your capture rate might be lower.
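The relationship between included volume and capture rate can be illustrated with a minimal Python sketch. It is not Dynatrace's actual algorithm: the function name `capture_rate` and the byte figures are hypothetical, and the sketch assumes that trace size is roughly uniform so that volume scales linearly with the number of captured traces.

```python
def capture_rate(offered_bytes: float, included_bytes: float) -> float:
    """Fraction of traces to capture in one interval (illustrative only).

    offered_bytes:  volume that capturing 100% of traces would produce (assumed known)
    included_bytes: trace volume included for the same interval
    """
    if offered_bytes <= 0:
        # Nothing to capture, so nothing needs to be sampled away.
        return 1.0
    # Never capture more than 100%; otherwise scale down to fit the included volume.
    return min(1.0, included_bytes / offered_bytes)


# Hypothetical numbers: the environment could produce twice its included volume.
print(capture_rate(offered_bytes=2_000_000, included_bytes=1_000_000))  # 0.5
print(capture_rate(offered_bytes=400_000, included_bytes=1_000_000))    # 1.0
```

When the offered volume fits within the included volume, the sketch returns 1.0, which corresponds to the 100% capture rate described above.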
Sampling rate distribution
Most sampling mechanisms manage traffic on the agent in an isolated and uncoordinated manner. In Dynatrace, the trace volume is dynamically shared between all monitored applications in the environment. In a sense, low-volume applications share their unused trace volume with high-volume applications that need it.
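One way to picture this sharing is a max-min fair ("water-filling") split of a common budget. The sketch below is an assumption chosen for illustration, not the allocation logic Dynatrace uses; the function name `share_volume`, the application names, and the numbers are all hypothetical.

```python
from typing import Dict


def share_volume(demands: Dict[str, float], budget: float) -> Dict[str, float]:
    """Split `budget` across applications with a max-min fair (water-filling) rule.

    Applications that need less than an equal share keep only what they need;
    the unused remainder is redistributed to the applications that need more.
    """
    allocation: Dict[str, float] = {}
    remaining = budget
    # Serve the smallest demands first so their leftovers can be passed on.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair_share = remaining / len(pending)
        name, demand = pending[0]
        if demand <= fair_share:
            allocation[name] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining application wants more than the fair share.
            for other_name, _ in pending:
                allocation[other_name] = fair_share
            break
    return allocation


# Hypothetical demands in KiB per minute against a shared budget of 1000.
print(share_volume({"checkout": 900, "search": 400, "batch": 100}, budget=1000))
```

In this run, `batch` and `search` are fully served (100 and 400), and `checkout` receives the remaining 500 rather than a fixed third of the budget.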
Sampling rate scenarios
In static sampling systems, you configure a fixed sampling rate for your deployment and distribute it across your deployment to apply different rates to different scenarios. Adaptive Traffic Management captures requests automatically and efficiently; you can adjust the default logic and configure the capture rate for specific requests via URL-based sampling.
Costs associated with captured data
In static sampling systems, the amount of captured data depends on the number of transactions executed in your system, which is not known in advance, so the associated costs are often hard to predict. With Adaptive Traffic Management, the cost is determined by your license, and capturing scales with it.
We recognize that the distribution of requests and their relevance to your observability goals is not even. It's rather a combination of: a large number of unique URLs, a medium number of important requests, and, finally, a few kinds of requests that make up the majority of the traffic (for example, image requests or status checks).
OneAgent first calculates a list of top requests starting each minute, from which it then captures:
The trace volume is dynamically shared on the environment level between all monitored applications. In a sense, low-volume applications share their unused trace volume with high-volume applications that need it. This ensures that you use your trace volume most effectively. Because the sampling is not random, important data is captured while maintaining a statistically valid sample set.
The following table represents a top-request calculation example, along with the respective capture rates.
Request | Number of requests processed by the application (per minute) | Capture factor | Captured end-to-end distributed traces (per minute) |
---|---|---|---|
URI A | 900 | 1/2 | 450 |
URI B | 440 | 1/2 | 220 |
URI C | 250 | 1 | 250 |
URI D | 60 | 1 | 60 |
…50 other URIs | 100 | 1 | 100 |
Total: | 1750 | | 1080 |
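The capture arithmetic in the table can be reproduced with a short Python sketch. The row values are copied from the table; the function name `captured_per_minute` is illustrative and not part of any Dynatrace API.

```python
from fractions import Fraction

# (request, requests processed per minute, capture factor), as listed in the table
rows = [
    ("URI A", 900, Fraction(1, 2)),
    ("URI B", 440, Fraction(1, 2)),
    ("URI C", 250, Fraction(1)),
    ("URI D", 60, Fraction(1)),
    ("50 other URIs", 100, Fraction(1)),
]


def captured_per_minute(requests: int, factor: Fraction) -> int:
    """Number of end-to-end traces captured for one request type."""
    return int(requests * factor)


captured = {name: captured_per_minute(n, f) for name, n, f in rows}
total_captured = sum(captured.values())

for name, count in captured.items():
    print(f"{name}: {count}")
print(f"Total captured: {total_captured}")  # Total captured: 1080
```

Only the two highest-volume URIs are sampled at 1/2; every lower-volume request is captured in full.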
In this example, OneAgent can capture a bit more than 1,000 requests per minute, according to the amount of Full-Stack GiB-hours of memory connected to the environment. Adaptive Traffic Management adjusts and communicates to OneAgent the capture rate for each URI depending on:
OneAgent continues to capture end-to-end transactions every minute, however, every 15 minutes,
The Full-Stack included trace volume is measured in bytes per minute and is calculated based on the number of gibibytes that contribute to your environment's GiB-hour.
Each environment can process a minimum trace volume. For each contributing gibibyte, the environment peak trace volume is increased by a number of kibibytes per minute.
Every 15 minutes, the peak trace volume is calculated and automatically adjusted based on the average of contributing gibibytes in the previous 15-minute interval.
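The calculation described above can be sketched in Python as follows. This page does not state the actual minimum volume or the per-gibibyte increment, so `MIN_KIB_PER_MIN` and `KIB_PER_GIB_PER_MIN` are hypothetical placeholders. The sketch also assumes that the per-gibibyte contribution is added on top of the minimum.

```python
from typing import Sequence

# Hypothetical placeholder values; the real figures are not given on this page.
MIN_KIB_PER_MIN = 1000.0      # assumed minimum trace volume per environment
KIB_PER_GIB_PER_MIN = 10.0    # assumed increment per contributing gibibyte


def peak_trace_volume(gib_samples: Sequence[float]) -> float:
    """Peak trace volume (KiB per minute) for the next 15-minute interval.

    gib_samples: contributing gibibytes observed during the previous
                 15-minute interval (for example, one sample per minute).
    """
    average_gib = sum(gib_samples) / len(gib_samples)
    return MIN_KIB_PER_MIN + KIB_PER_GIB_PER_MIN * average_gib


# Hypothetical interval in which hosts were added: average is 80 GiB.
print(peak_trace_volume([60, 80, 100]))  # 1800.0
```

Because the input is an average over the previous interval, a short spike in connected memory raises the peak trace volume only in proportion to how long it lasted.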
With Dynatrace Platform Subscription (DPS), all features (especially data-heavy ones like bind variables capture) are available.
To learn more about the trace included volume, see Full-Stack Monitoring.
If OneAgent is sampling and not all requests are captured, then captured traces point out that similar requests have not been captured with the message [number of traces] x
. You can see it by expanding the trace in the Distributed Tracing list.
To monitor your environment trace capture rate and volume ingress, go to Dashboards and select the ready-made dashboard Full-Stack Adaptive Traffic Management and trace capture.
Tile | Description |
---|---|
Request capture rate | Captured requests, as a percentage of the total number of transactions processed by a OneAgent-monitored application or host. |
Trace capture rate | Captured traces, as a percentage of the total number of observed end-to-end transactions processed by a OneAgent-monitored application or host. Note that the trace capture rate might be lower than the request capture rate because a single trace might consist of multiple requests. |
Full stack trace data volume | Amount of trace data ingested from Full-stack monitored applications or hosts. The chart includes |
Full-Stack trace volume used | Ingested trace volume, as a percentage of your licensed Full-Stack included trace volume. Adaptive Traffic Management keeps it around the Full-Stack included limit. Dynatrace's algorithm accounts for a degree of fluctuation, so the used trace volume can be above 100% without extra charges, unless you opted for Extended trace ingest on top of Full-Stack Monitoring. |
Average size of Full-Stack spans | Average size of spans ingested from Full-stack monitored applications or hosts. Typical values are in the 1.5-2 KiB range; if the span size is larger and the used trace volume is high (or the trace capture rate is low), you might be capturing a lot of data per span. |
Adaptive trace volume per contributing memory-gibibytes per minute | Average trace volume every 15 minutes ( |
Full-Stack trace ingest and billable extended ingest | The relationship between the amount of ingested trace data ( If you opted for Extended trace ingest on top of Full-Stack Monitoring, |
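Two of the tiles lend themselves to a quick back-of-the-envelope check, sketched below in Python. The 2 KiB ceiling comes from the typical span size range given in the table. The thresholds for "high" used volume and "low" capture rate are not defined on this page, so `high_usage_percent` and `low_capture_percent` are hypothetical defaults, as are the function names.

```python
TYPICAL_SPAN_MAX_BYTES = 2 * 1024  # upper end of the typical 1.5-2 KiB range


def volume_used_percent(ingested_bytes: float, included_bytes: float) -> float:
    """Ingested trace volume as a percentage of the included trace volume."""
    return 100.0 * ingested_bytes / included_bytes


def spans_look_heavy(
    avg_span_bytes: float,
    used_percent: float,
    capture_rate_percent: float,
    high_usage_percent: float = 95.0,   # hypothetical threshold
    low_capture_percent: float = 50.0,  # hypothetical threshold
) -> bool:
    """True if spans are larger than typical while volume is under pressure."""
    larger_than_typical = avg_span_bytes > TYPICAL_SPAN_MAX_BYTES
    under_pressure = (
        used_percent >= high_usage_percent
        or capture_rate_percent <= low_capture_percent
    )
    return larger_than_typical and under_pressure


# Hypothetical readings taken from the dashboard tiles.
used = volume_used_percent(ingested_bytes=1200, included_bytes=1000)
print(used)                                                        # 120.0
print(spans_look_heavy(4096, used, capture_rate_percent=40.0))     # True
print(spans_look_heavy(1800, used, capture_rate_percent=40.0))     # False
```

A `True` result only suggests that span size is worth investigating, for example by reviewing data-heavy features. It is not a diagnostic that Dynatrace itself reports.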
Usually not at all.
Traffic shaping is accounted for transparently and is done in a way that ensures statistical validity while capturing rare requests with high probability. All charts show the total number of requests that your application processes, and these numbers are either accurate or have very high statistical validity. The same is true for all ad-hoc analyses. You will not see a difference in charts or service call analysis data unless you're looking at a single distributed trace.
No, Adaptive Traffic Management focuses only on the number of traces. Neither service settings nor (global) request settings are modified by Adaptive Traffic Management. Depending on the capture rate and sampling, a low-volume or unique request might not be captured. Service settings such as request naming rules and key request settings will apply only to captured traces.
Yes, in a few cases, as service monitoring metrics are based on captured traces. The following are some known effects.
If the OneAgent capture rate is below 100%, sampling has been applied because the number of traces that OneAgent can capture has exceeded the Full-Stack included trace volume. There are several things you can do to increase the capture rate:
Verify what is currently being captured and reduce the rate for traces and requests of lower relevance. Start by looking at the following:
Excessive custom services
Custom services with poor configurations can lead to a high number of full-service calls and increase the trace ingress volume. If custom services are consuming a considerable amount of the trace volume, revisit the configuration to reduce the number of captured custom-service calls.
Background activity services
In certain environments, background activity produces a lot of service calls but adds little value. To disable this feature on an environment-wide level, go to Settings > OneAgent features and turn off Background Requests for Services (HTTP/GRPC) and Background Requests for Services (Messaging).
High number of low-value traces
In all environments, there are transactions for which traces are of lower value. You can exclude from capture:
Data-heavy features
Data-heavy features can reduce the capture rate. You can
Extend the trace ingest as a billed option on top of Full-Stack availability. To learn how, see Extended trace ingest on top of Full-Stack Monitoring.