Distributed Tracing

Latest Dynatrace
App
Published Jul 15, 2024

Prerequisites

Dynatrace Platform Subscription (DPS)

Permissions

The following table describes the required permissions.

Permission                      Description
storage:buckets:read            Read buckets data
storage:spans:read              Read span data
storage:entities:read           Read entities data
storage:logs:read               Read logs
state:user-app-states:read      Read user app state
state:user-app-states:write     Write user app state
state:user-app-states:delete    Delete user app state
storage:fieldsets:read          Read masked/sensitive fields
storage:filter-segments:read    Read filter-segments
storage:smartscape:read         Read smartscape nodes and edges

Installation

Make sure the app is installed in your environment.

Get started

Distributed Tracing powered by Grail helps you get the most out of your trace data in Dynatrace. It enables the ingestion and processing of petabytes of trace data, allowing you to monitor and troubleshoot errors and performance issues in complex distributed software systems at scale. Trace data follows the Dynatrace trace-span data model, so you can analyze all related information and attributes intuitively via Distributed Tracing and Dynatrace Query Language (DQL). The trace data is stored in Grail, so you can leverage the power of Grail to analyze even unknown unknowns.

The user-friendly interface of Distributed Tracing is designed with engineers, SREs, and performance architects in mind, making it easy to visually analyze your trace data right away.

Quickly understand response times and errors using dynamic visualization tools like histograms

Dive into the details of a trace and explore related logs

Easily track all exceptions within a span with clear relationships and exception chains showing root cause

Surface and analyze exceptions across traces with readable stack traces, aggregated insights, and visual markers highlighting problematic spans


Learning modules

Work through the following modules to learn how to use Distributed Tracing:

01 Exception analysis

Tutorial
Exception analysis helps you detect, investigate, and resolve exceptions more effectively in Dynatrace.

02 Ingest traces

How-to guide
Instrument your applications with OneAgent or OpenTelemetry to start ingesting trace data into Dynatrace.

03 Set up Grail permissions for Distributed Tracing

How-to guide
Manage permissions for Distributed Tracing powered by Grail.

04 Configure data storage and retention for Distributed Tracing

How-to guide
Manage data storage and retention for Distributed Tracing powered by Grail.

05 Span and trace context propagation

Reference
Understand span and trace context propagation in Dynatrace and how to set them up.

06 Use traces, DQL, and logs to spot patterns

Tutorial
Utilize traces, logs, and DQL to visualize raw data and identify abnormal patterns.

07 Distributed Tracing app

Explanation
Discover the functionalities of the new Distributed Tracing app.

08 Advanced Tracing Analytics powered by Grail

Tutorial
Explore advanced tracing analysis capabilities on Grail.

Concepts

Distributed trace

A distributed trace is a sequence of spans, identified by a unique trace ID, that follows the path of a single request as it traverses through various services and components in a distributed system. In a modern microservice environment, it typically spans multiple services, providing a detailed view of the request's journey and performance. The trace contains semantically different attributes that make it possible to interpret and understand the collected data, helping identify bottlenecks, errors, and latency issues for efficient troubleshooting and optimization.
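As a rough illustration of "a sequence of spans, identified by a unique trace ID", the following Python sketch groups a flat list of spans by trace ID and derives an end-to-end duration per trace. The `Span` class, its field names, and the sample values are hypothetical and are not the Dynatrace data model; they only make the grouping idea concrete.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical minimal span record; real spans carry many more fields.
    trace_id: str
    name: str
    start_ms: int
    end_ms: int


def group_by_trace(spans):
    """Collect spans that share a trace ID into one list per trace."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    return dict(traces)


def trace_duration_ms(spans):
    """End-to-end duration: earliest start to latest end across the trace's spans."""
    return max(s.end_ms for s in spans) - min(s.start_ms for s in spans)


# Made-up sample data: two traces, arriving interleaved.
SPANS = [
    Span("t1", "GET /checkout", 0, 120),
    Span("t1", "charge-card", 10, 90),
    Span("t2", "GET /health", 5, 8),
    Span("t1", "SELECT orders", 20, 40),
]

TRACES = group_by_trace(SPANS)
DURATIONS = {trace_id: trace_duration_ms(s) for trace_id, s in TRACES.items()}
```

With this sample data, `DURATIONS` maps `t1` to 120 ms and `t2` to 3 ms, which is the kind of per-request view that makes slow traces stand out.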

Traces and services

Use cases

Understand how requests propagate across distributed systems and microservices.
Use high-quality data generated by distributed systems and microservices for request analysis.
Quickly understand how each microservice is performing.
Follow Dynatrace Intelligence root cause analysis drill-downs to identify cause-effect relationships between events.

Span

A span represents a single operation within a distributed trace, capturing the details of the request's journey through multiple services. Each span includes attributes such as the name, the start timestamp, a list of span events (such as exceptions), the parent's span identifier, and the span kind. This information, the span context, helps to put all spans and events in context with each other, so that you can trace and understand the performance and behavior of individual operations within the distributed system.

Within a trace, when an activity (the parent span) hands off work, the next activity is recorded as its child span. A span without a parent span is called a trace root span and indicates the start of a trace.

The image below shows a trace traversing three services and producing a request for each service. Each request has a root span, one of which is also the root of the trace.

A is both the first span of the trace and the first span of the request within the first service; additionally, A is a span without a parent. A is both the root of the trace and of the request.
C is the first span of the request within the second service; additionally, C has a parent (B). C is the root of the second request.
E is not the first span of the request within the third service. E's parent (D) is the root of the third request.
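The relationships between spans A to E can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the service names are made up, D's parent is assumed to be C, and a "request root" is approximated as a span that either has no parent or whose parent belongs to a different service. It is not how Dynatrace detects requests internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical minimal span: an ID, the owning service, and the parent's ID.
    span_id: str
    service: str
    parent_id: Optional[str]


# Spans A-E across three services (service names are illustrative).
SPANS = {
    s.span_id: s
    for s in [
        Span("A", "service-1", None),
        Span("B", "service-1", "A"),
        Span("C", "service-2", "B"),
        Span("D", "service-3", "C"),  # assumption: D is called from C
        Span("E", "service-3", "D"),
    ]
}


def is_trace_root(span):
    """A trace root span has no parent."""
    return span.parent_id is None


def is_request_root(span, spans):
    """Approximation: first span of a request = no parent, or parent in another service."""
    if span.parent_id is None:
        return True
    return spans[span.parent_id].service != span.service


TRACE_ROOTS = [s.span_id for s in SPANS.values() if is_trace_root(s)]
REQUEST_ROOTS = [s.span_id for s in SPANS.values() if is_request_root(s, SPANS)]
```

Here `TRACE_ROOTS` contains only A, while `REQUEST_ROOTS` contains A, C, and D, matching the description above: E has a parent in the same service, so it is neither kind of root.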

Trace anatomy

Learn more about span semantic fields.

The span context allows a child span to relate to the trace and its parent span. Therefore, the context needs to be propagated within a service (across different threads) but also across services and process boundaries. This typically happens via HTTP headers (like the W3C trace context) or via unique IDs in messaging systems. To learn more about context propagation, see Span and trace context propagation.
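To make header-based propagation more concrete, here is a minimal Python sketch of the W3C Trace Context `traceparent` header, which has the shape `version-traceid-parentid-flags` (2, 32, 16, and 2 lowercase hex characters). The helper names are hypothetical, only version `00` is handled, and the `tracestate` header is ignored; in practice, OneAgent or an OpenTelemetry SDK does this for you.

```python
import re
import secrets

# traceparent: version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Start a new trace: random trace ID and span ID."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(header):
    """Return the propagated context as a dict, or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid according to the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(incoming):
    """Header for an outgoing call: same trace ID, fresh span ID as the new parent."""
    ctx = parse_traceparent(incoming)
    if ctx is None:
        return new_traceparent()  # no usable context: start a new trace
    flags = "01" if ctx["sampled"] else "00"
    return f"00-{ctx['trace_id']}-{secrets.token_hex(8)}-{flags}"
```

A service that receives a request would parse the incoming header, create its own span with the parsed `parent_id` as parent, and send `child_traceparent(...)` on any outgoing call. The trace ID stays constant across every hop, which is what lets the spans be stitched into one trace.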

Attribute

Attributes are key-value pairs that provide details about a span, request, or resource such as response codes, HTTP methods, and URLs. Via attributes, you can group, query, find, and analyze your traces and spans.
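The following Python sketch shows what grouping and filtering by attributes can look like on plain key-value data. The span records and attribute keys (`http.request.method`, `http.response.status_code`) are illustrative examples rather than a statement of the exact field names Dynatrace stores, and the helper functions are hypothetical.

```python
from collections import Counter

# Made-up spans, each with a name and a dictionary of key-value attributes.
SPANS = [
    {"name": "GET /cart", "attributes": {"http.request.method": "GET", "http.response.status_code": 200}},
    {"name": "POST /order", "attributes": {"http.request.method": "POST", "http.response.status_code": 500}},
    {"name": "GET /cart", "attributes": {"http.request.method": "GET", "http.response.status_code": 500}},
    {"name": "GET /health", "attributes": {"http.request.method": "GET", "http.response.status_code": 200}},
]


def filter_spans(spans, key, value):
    """Keep spans whose attribute `key` equals `value`."""
    return [s for s in spans if s["attributes"].get(key) == value]


def count_by(spans, key):
    """Count spans per distinct value of attribute `key`."""
    return Counter(s["attributes"].get(key) for s in spans)


FAILED = filter_spans(SPANS, "http.response.status_code", 500)
BY_METHOD = count_by(SPANS, "http.request.method")
FAILED_BY_NAME = Counter(s["name"] for s in FAILED)
```

The same pattern (filter on one attribute, then group by another) is what attribute-based analysis boils down to, whether it is done in a user interface or in a query language.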

Use cases

Dynatrace uses attribute metadata to:

Detect and name services.
Gather data on the trace context and relationships with other entities for Smartscape topology.
Connect log data to traces for Logs or Logs Classic.
Understand how the duration of a span is affected by service timings (for example, CPU time, network time, or just waiting for other threads) and analyze which code was executed in the context of the span.

Best practices

If you collect trace data via:

OpenTelemetry, define captured attribute settings.
OneAgent, define request attribute settings.

Learn more about request attributes and captured attributes semantic fields.

Service

Services are traversed by distributed traces. On horizontally scaled services, specific Service Instances process each span. Services are determined and named based on available attributes or properties that are collected along with the spans.

Use cases

Segment requests to pinpoint and address response time degradation.

Data collection and context propagation

You can integrate OpenTelemetry and OneAgent to collect trace data, such as request status, response time, versions, infrastructure information, and other relevant metadata, as attributes. The trace context, including the unique trace ID, is then propagated across your apps and microservices.

Best practices

Before getting started with distributed tracing, understand how setup and trace data collection differ between OpenTelemetry and OneAgent. The following is an overview of the key differences.

To get started see

Use OpenTelemetry in combination with OneAgent to enhance your observability coverage, using the best of both.

Use cases

Use Distributed Tracing for:

Troubleshooting: Find out why requests fail and prevent future issues.
Performance optimization: Understand system performance and identify bottlenecks to improve reliability and user experience.
Detailed analysis: Look into individual trace details for deeper insights.
Exploratory analysis: Use free-form analysis to discover and explore unknown unknowns on the fly.
Discovering unknown unknowns: Be prepared for the unknown by using free-form analysis to explore and dissect data on the fly.
Synthesizing traces and monitoring signals: View trace data in context with other signals like logs, business events, or metrics.

FAQ

Why do I see the message "The new tracing experience is coming to your environment soon!" when starting the Distributed Tracing app?

The new tracing experience is rolled out in stages to extend access to Dynatrace SaaS DPS customers until March 2025. The availability timing depends on the geographic region and the overall trace volume of your account. For more information, reach out to your Customer Success Manager.

Are traces in Distributed Traces Classic views still available?

Yes. Once you start analyzing traces in the Distributed Tracing app, you can continue to use Distributed Traces Classic side by side.

Which licensing package covers Distributed Tracing?

DPS FullStack and/or Custom Traces Classic. No additional costs apply when you're using the new Distributed Tracing app.

Yes. Go to Distributed Tracing and select Show facets > Reset to default.

How can I remove span.kind links with null duration?

Update to the latest version of OneAgent.

Not all traces available in Distributed Traces Classic are visible in the Distributed Tracing app.

Make sure you have the latest version of OneAgent.
Make sure the OneAgent feature Forward Tag 4 trace context extension is on; this ensures OneAgent-captured traces are compatible with the W3C trace context standard.

I see incomplete end-to-end traces in the Distributed Tracing app that are shown complete in Distributed Traces Classic.

Make sure you have the latest version of OneAgent.
Make sure the OneAgent feature Forward Tag 4 trace context extension is on; this ensures OneAgent-captured traces are compatible with the W3C trace context standard.

How can I filter for traces that are collected by OneAgent or ingested via OpenTelemetry?

Go to Distributed Tracing.
In the facet list, enter span source and select the source you're interested in.

Does the new distributed tracing experience support cross-environment tracing?

Not yet.

Explore in Dynatrace Hub

Analyze and slice distributed traces by any attribute and from any source.

