Distributed Tracing

  • Latest Dynatrace
  • App
  • 3-min read
  • Published Jul 15, 2024

Prerequisites

Permissions

The following table describes the required permissions.

Permission                      Description
storage:buckets:read            Read buckets data
storage:spans:read              Read span data
storage:entities:read           Read entities data
storage:logs:read               Read logs
state:user-app-states:read      Read user app state
state:user-app-states:write     Write user app state
state:user-app-states:delete    Delete user app state
storage:fieldsets:read          Read masked/sensitive fields
storage:filter-segments:read    Read filter-segments
storage:smartscape:read         Read smartscape nodes and edges

Installation

Make sure the app is installed in your environment.

Get started

Distributed Tracing powered by Grail helps you get the most out of your trace data in Dynatrace. It enables the ingestion and processing of petabytes of trace data, allowing you to monitor and troubleshoot errors and performance issues in complex distributed software systems at scale. Trace data follows the Dynatrace trace-span data model, so the analysis of all related information and attributes is intuitive and can be done via Distributed Tracing and Dynatrace Query Language (DQL). The trace data is stored in Grail, so you can leverage the power of Grail to analyze even unknown unknowns.

The user-friendly Distributed Tracing interface is designed with engineers, SREs, and performance architects in mind, making it easy to visually analyze your trace data right away.

  • Quickly understand response times and errors using dynamic visualization tools like histograms.
  • Dive into the details of a trace and explore related logs.
  • Easily track all exceptions within a span, with clear relationships and exception chains showing the root cause.
  • Surface and analyze exceptions across traces with readable stack traces, aggregated insights, and visual markers highlighting problematic spans.

Learning modules

Go through the following process to learn how to use Distributed Tracing:

Concepts

Distributed trace

A distributed trace is a sequence of spans, identified by a unique trace ID, that follows the path of a single request as it traverses through various services and components in a distributed system. In a modern microservice environment, it typically spans multiple services, providing a detailed view of the request's journey and performance. The trace contains semantically different attributes that make it possible to interpret and understand the collected data, helping identify bottlenecks, errors, and latency issues for efficient troubleshooting and optimization.

Traces and services

Use cases

  • Understand how requests propagate across distributed systems and microservices.
  • Use high-quality data generated by distributed systems and microservices for request analysis.
  • Quickly understand how each microservice is performing.
  • Follow Dynatrace Intelligence root cause analysis drill-downs to identify cause-effect relationships between events.

Span

A span represents a single operation within a distributed trace, capturing the details of the request's journey through multiple services. Each span includes attributes such as the name, the start timestamp, a list of span events (such as exceptions), the parent's span identifier, and the span kind. This information, known as the span context, helps to put all spans and events in context with each other, so that you can trace and understand the performance and behavior of individual operations within the distributed system.

Within a trace, when one activity (the parent span) is completed, the next activity passes to its child span. A span without a parent span is called a trace root span and indicates the start of a trace.

The image below shows a trace traversing three services and producing a request for each service. Each request has a root span, one of which is also the root of the trace.

  • A is both the first span of the trace and the first span of the request within the first service; additionally, A is a span without a parent. A is both the root of the trace and of the request.
  • C is the first span of the request within the second service; additionally, C has a parent (B). C is the root of the second request.
  • E is not the first span of the request within the third service. E's parent (D) is the root of the third request.

Trace anatomy

Learn more about span semantic fields.
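The root relationships described above can be sketched in a few lines of Python. This is an illustrative model only, not the Dynatrace data model: the `Span` class, its field names, and the service names are hypothetical, and D's parent is assumed to be C because the description does not state it. A span counts as a request root here if it has no parent or if its parent ran in a different service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical minimal span: an ID, an optional parent ID, and a service."""
    span_id: str
    parent_id: Optional[str]  # None means the span has no parent
    service: str


def trace_root(spans: list[Span]) -> Span:
    """Return the single span without a parent (the trace root span)."""
    roots = [s for s in spans if s.parent_id is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one trace root, found {len(roots)}")
    return roots[0]


def request_roots(spans: list[Span]) -> list[Span]:
    """Return the first span of each request.

    A span is treated as a request root when it has no parent, when its
    parent is unknown, or when its parent belongs to another service.
    """
    by_id = {s.span_id: s for s in spans}
    result = []
    for s in spans:
        parent = by_id.get(s.parent_id) if s.parent_id is not None else None
        if parent is None or parent.service != s.service:
            result.append(s)
    return result


# Spans A to E from the description; D's parent is assumed to be C.
SPANS = [
    Span("A", None, "service-1"),
    Span("B", "A", "service-1"),
    Span("C", "B", "service-2"),
    Span("D", "C", "service-3"),
    Span("E", "D", "service-3"),
]

if __name__ == "__main__":
    print("trace root:", trace_root(SPANS).span_id)
    print("request roots:", [s.span_id for s in request_roots(SPANS)])
```

With these inputs, A is the only trace root, and A, C, and D are the request roots, which matches the three bullet points.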

The span context allows a child span to relate to the trace and its parent span. Therefore, the context needs to be propagated within a service (across different threads) but also across services and process boundaries. This typically happens via HTTP headers (like the W3C trace context) or via unique IDs in messaging systems. To learn more about context propagation, see Span and trace context propagation.
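To make header-based propagation more concrete, here is a minimal Python sketch of reading and forwarding a W3C `traceparent` header. It assumes the version `00` layout (`version-traceid-parentid-flags`, all lowercase hex) and is not how OneAgent or any OpenTelemetry SDK implements propagation; the function and class names are hypothetical.

```python
import re
import secrets
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str    # 2 hex characters, "00" in this sketch
    trace_id: str   # 32 hex characters, shared by every span in the trace
    parent_id: str  # 16 hex characters, the ID of the calling span
    flags: str      # 2 hex characters, for example "01" for sampled


_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header: str) -> TraceParent:
    """Parse a version-00 traceparent header value or raise ValueError."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    tp = TraceParent(*match.groups())
    if tp.version != "00":
        raise ValueError("this sketch only handles version 00")
    # All-zero IDs are invalid according to the W3C specification.
    if tp.trace_id == "0" * 32 or tp.parent_id == "0" * 16:
        raise ValueError("trace ID and parent ID must not be all zeros")
    return tp


def child_traceparent(incoming: TraceParent) -> str:
    """Build the header for an outgoing call: same trace ID, new span ID."""
    new_span_id = secrets.token_hex(8)  # 8 random bytes give 16 hex characters
    return f"{incoming.version}-{incoming.trace_id}-{new_span_id}-{incoming.flags}"


if __name__ == "__main__":
    incoming = parse_traceparent(
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    )
    print(child_traceparent(incoming))
```

The trace ID stays the same across the service boundary, while the parent ID is replaced by the ID of the span making the outgoing call. That is what lets the receiving service attach its spans to the correct parent.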

Attribute

Attributes are key-value pairs that provide details about a span, request, or resource such as response codes, HTTP methods, and URLs. Via attributes, you can group, query, find, and analyze your traces and spans.
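As a rough illustration of grouping and finding spans by attribute, the following Python sketch treats each span as a dictionary with an `attributes` mapping. The sample spans and helper functions are hypothetical, and the attribute keys follow OpenTelemetry naming conventions only for illustration; in Dynatrace you would typically do this kind of analysis in the app or with DQL rather than in application code.

```python
from collections import Counter
from typing import Any

# Hypothetical sample spans; real spans carry many more attributes.
SAMPLE_SPANS: list[dict[str, Any]] = [
    {"name": "GET /cart", "attributes": {
        "http.request.method": "GET", "http.response.status_code": 200}},
    {"name": "POST /checkout", "attributes": {
        "http.request.method": "POST", "http.response.status_code": 500}},
    {"name": "GET /cart", "attributes": {
        "http.request.method": "GET", "http.response.status_code": 500}},
]


def count_by_attribute(spans: list[dict[str, Any]], key: str) -> Counter:
    """Group spans by the value of one attribute and count each group."""
    return Counter(span["attributes"].get(key) for span in spans)


def find_spans(spans: list[dict[str, Any]], where: dict[str, Any]) -> list[dict[str, Any]]:
    """Return spans whose attributes match every key-value pair in `where`."""
    return [
        span for span in spans
        if all(span["attributes"].get(k) == v for k, v in where.items())
    ]


if __name__ == "__main__":
    print(count_by_attribute(SAMPLE_SPANS, "http.response.status_code"))
    failed_gets = find_spans(
        SAMPLE_SPANS,
        {"http.request.method": "GET", "http.response.status_code": 500},
    )
    print([span["name"] for span in failed_gets])
```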

Use cases

Dynatrace uses attribute metadata to:

  • Detect and name services.
  • Gather data on the trace context and relationships with other entities for Smartscape topology.
  • Connect log data to traces for Logs or Logs Classic.
  • Understand how the duration of a span is affected by service timings (for example, CPU time, network time, or just waiting for other threads) and analyze which code was executed in the context of the span.

Best practices

If you collect trace data via

Learn more about request attributes and captured attributes semantic fields.

Service

Services are traversed by distributed traces. On horizontally scaled services, specific Service Instances process each span. Services are determined and named based on available attributes or properties that are collected along with the spans.

Use cases

Data collection and context propagation

You can integrate OpenTelemetry and OneAgent to collect trace data, such as request status, response time, versions, infrastructure information, and other relevant metadata, as attributes. The trace context, including the unique trace ID, is then propagated across your apps and microservices.

Best practices

Before getting started with distributed tracing, understand how setup and trace data collection differ between OpenTelemetry and OneAgent. The following is an overview of the key differences.

Set up

  • OpenTelemetry: Automatic or manual.
  • OneAgent: Automatic.

Capturing

  • OpenTelemetry: Automatic collection of allowed span attributes.
  • OneAgent: Automatic collection of several request attributes, including HTTP method, URL, response codes, topology data, and details about the underlying technologies.

Context

  • OpenTelemetry: Automatically or manually contextualized log entries, depending on the instrumentation library.
  • OneAgent: Automatically contextualized log entries produced by prominent log frameworks, and automatically contextualized traces in Smartscape and Dynatrace Intelligence.

To get started see

Use OpenTelemetry in combination with OneAgent to enhance your observability coverage, using the best of both.

Use cases

Use Distributed Tracing for:

  • Troubleshooting: Find out why requests fail and prevent future issues.

  • Performance optimization: Understand system performance and identify bottlenecks to improve reliability and user experience.

  • Detailed analysis: Look into individual trace details for deeper insights.

  • Exploratory analysis: Use free-form analysis to discover and explore unknown unknowns on the fly.

  • Discovering unknown unknowns: Be prepared for the unknown by using free-form analysis to explore and dissect data on the fly.

  • Synthesizing traces and monitoring signals: View trace data in context with other signals, such as logs, business events, or metrics.

FAQ

Why do I see the message "The new tracing experience is coming to your environment soon!" when starting the Distributed Tracing app?

The new tracing experience is being rolled out in stages, with access extended to Dynatrace SaaS DPS customers until March 2025. The timing of availability depends on the geographic region and the overall trace volume of your account. For more information, reach out to your Customer Success Manager.

Are traces in Distributed Traces Classic views still available?

Yes. Once you start analyzing traces in the Distributed Tracing app, you can continue to use Distributed Traces Classic side by side.

Which licensing package covers Distributed Tracing?

DPS FullStack and/or Custom Traces Classic. No additional costs apply when you're using the new Distributed Tracing app.

I modified facets. Can I reset them to the recommended defaults?

Yes. Go to Distributed Tracing and select Show facets > Reset to default.

How can I remove span.kind links with null duration?

Update to the latest version of OneAgent.

Not all traces available in Distributed Traces Classic are visible in the Distributed Tracing app.

  • Make sure you have the latest version of OneAgent.
  • Make sure the OneAgent feature Forward Tag 4 trace context extension is on; this ensures OneAgent-captured traces are compatible with the W3C trace context standard.

I see incomplete end-to-end traces in the Distributed Tracing app that are shown complete in Distributed Traces Classic.

  • Make sure you have the latest version of OneAgent.
  • Make sure the OneAgent feature Forward Tag 4 trace context extension is on; this ensures OneAgent-captured traces are compatible with the W3C trace context standard.

How can I filter for traces that are collected by OneAgent or ingested via OpenTelemetry?

  1. Go to Distributed Tracing.
  2. In the facet list, enter span source and select the source you're interested in.

Does the new distributed tracing experience support cross-environment tracing?

Not yet.

Hub

Explore in Dynatrace Hub

Analyze and slice distributed traces by any attribute and from any source.
