Reduce span-based and metric-based cardinality

Latest Dynatrace
Tutorial
10-min read

OpenPipeline processing allows you to normalize span and metric data to prevent high-cardinality issues that can make aggregations and analysis unusable.

Who is this for?

This article is intended for administrators and Services users.

What will you learn?

In this tutorial, you'll learn how to reduce cardinality in three different views in Services.

Limit cardinality of external calls on the Outbound calls tab.
Normalize Redis commands in the Database queries view.
Reduce cardinality of temporary queue names in the Message processing view.

Limit cardinality of external calls

The Outbound calls tab in Services displays aggregated metrics for external calls made by your service. High cardinality occurs when URLs contain unique identifiers in the path, such as /users/12345 or /orders/abc-def-123, resulting in many distinct values.

OpenPipeline processing rules help you transform these into normalized patterns, such as /users/* or /orders/*, optimizing your outbound call data.

1. Create a pipeline

First, let's create a pipeline that contains a processing rule for reducing the cardinality of your outbound calls.

To create a pipeline

Go to Settings > Process and contextualize > OpenPipeline > Spans.
Go to the Pipelines tab, and select Pipeline.
Enter Outbound call normalization as a pipeline name.

On the Processing tab, select Processor > DQL, and configure a new DQL processor:

Name: URL
Matching condition:
```
span.kind == "client" and isNotNull(url.full)
```
This condition matches all outbound HTTP calls.

DQL processor definition

fieldsAdd url.full.orig = url.full
 | fieldsAdd path_normalized = replacePattern(url.path, "UUIDSTRING", "[UUID]")
 | fieldsAdd path_normalized = replacePattern(path_normalized, "[/]LONG", "/[Number]")
 | fieldsAdd port = if(isNotNull(server.port), concat(":", server.port),   else:null)
 | fieldsAdd url.full = concat(url.scheme, "://", server.address, port, path_normalized)

Select Save in the upper-right corner of the page.

You should see the Outbound call normalization pipeline in the list of pipelines.

2. Create a dynamic route

Now that you have created a new pipeline and defined a cardinality-reducing DQL processor, activate this processor by connecting it to OpenPipeline via a dynamic route. This way, your pipeline can receive the span data.

To create a dynamic route

On the Spans page, go to the Dynamic routing tab.
Select Dynamic route.
Define your dynamic route:
- Name: Outbound call normalization
- Matching condition:
```
span.kind =="client" and isNotNull(url.full)
```
  This condition matches all outbound HTTP calls.
- Pipeline: Select the previously created Outbound call normalization pipeline from the list of custom pipelines.
Select Add, and then select Save.

You should see the Outbound call normalization dynamic route in the list of dynamic routes.

After you've created the Outbound call normalization pipeline and activated it by creating a dynamic route, you should see that the cardinality of the external calls made by your services has significantly reduced when you check the Outbound calls tab in Services.

Normalize Redis commands

Redis statements often include unique identifiers or values, for example, GET user:12345, GET user:12346, and GET user:12347 or SET order:123, SET order:124, and SET order:125. This high cardinality results in thousands of distinct entries shown in the Database queries view in Services.

Unlike parameterized SQL databases, where OneAgent or OpenTelemetry automatically handles normalization, Redis commands require manual cardinality handling via an OpenPipeline pipeline. In this section, we'll utilize a processing rule to transform these commands into normalized patterns such as GET or SET, making your Redis query data more actionable.

1. Create a pipeline

First, let's create a pipeline that contains a processing rule for reducing the cardinality of your Redis statements. As Services Services utilizes the db.query.text attribute, you'll tweak this particular attribute to normalize the Redis statements displayed in the Database queries view.

To create a pipeline

Go to Settings > Process and contextualize > OpenPipeline > Spans.
Go to the Pipelines tab, and select Pipeline.
Enter DB statement normalization as a pipeline name.
On the Processing tab, select Processor > DQL, and configure a new DQL processor:
- Name: Redis
- Matching condition:
```
db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text))
```
  Explain this matching condition
  db.system == "redis": Match all redis database systems.
  
  (isNotNull(db.statement) or isNotNull(db.query.text)): Require that either the old db.statement or the new db.query.text attribute is used in your Redis database spans.
  
  db.statement is still common for some OpenTelemetry instrumentations.
- DQL processor definition:
```
fieldsAdd db.query.text = coalesce(db.query.text, db.statement)
 | fieldsAdd db.query.text.orig = db.query.text
 | fieldsAdd blankPos = indexOf(db.query.text.orig, " ")
 | fieldsAdd db.query.text = if (blankPos > 0, substring(db.query.text, from: 0, to: blankPos), else: "*")
```
  Explain this processor definition
  fieldsAdd db.query.text = coalesce(db.query.text, db.statement): Make sure that the matching condition works for both the old db.statement and the new db.query.text attribute.
  
  Many OpenTelemetry instrumentations still use db.statement instead of db.query.text. This line ensures that both fields are respected, as coalesce() returns the first non-null argument.
  
  fieldsAdd db.query.text.orig = db.query.text: Store the original value of the db.query.text attribute.
  
  While the low cardinality of db.query.text is important in Services, preserving the original value of the db.query.text attribute is quite beneficial and can be leveraged for further drill-downs.
  
  fieldsAdd blankPos = indexOf(db.query.text.orig, " ") and fieldsAdd db.query.text = if (blankPos > 0, substring(db.query.text, from: 0, to: blankPos), else: "*"): Simplify the db.query.text attribute by extracting the new value from the start of the text up to the first blank space.
  
  A Redis database query is the value until the first blank space of the statement, so this blank space is found, and then only the part until that blank space is kept as db.query.text.
Select Save in the upper-right corner of the page.

You should see the DB statement normalization pipeline in the list of pipelines.

If you have other databases with high-cardinality database statements, add additional DQL processors to your DB statement normalization pipeline. Configure the processors according to your situation. For additional examples, see Explore other DQL processors.

2. Create a dynamic route

To create a dynamic route

On the Spans page, go to the Dynamic routing tab.
Select Dynamic route.
Define your dynamic route:
- Name: DB statement normalization
- Matching condition:
```
isNotNull(db.statement) and (isNotNull(db.statement) or isNotNull(db.query.text))
```
- Pipeline: Select the previously created DB statement normalization pipeline from the list of custom pipelines.
Select Add, and then select Save.

You should see the DB statement normalization dynamic route in the list of dynamic routes.

3. Explore other DQL processorsOptional

As mentioned before, you can add additional DQL processors when you encounter high-cardinality database queries.

Check the sections below for three additional examples of the DQL processor. When you add a new processor, set Matching condition to db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text)), and set DQL processor definition to the value provided in one of the examples below.

Before introducing a new DQL processor, you can use Notebooks Notebooks to check how a DQL processor transforms your data.

Go to Notebooks.
Select Notebook in the app header to create a new notebook.
Open the (Add) menu, and select DQL.
In the query section, enter the required DQL query.
Select Run to execute the DQL query.

Original: Full cardinality

Use the following DQL query to count the original number of Redis statements (before cardinality reduction).

DQL query in Notebooks:

fetch spans
| filter db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text))
| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| summarize count(), by: { db.query.text }
| sort `count()` desc

Option 1: Abbreviated

This DQL processor replaces db.query.text with the first 15 characters + *. This option is good for quick cardinality reduction regardless of the content format.

DQL processor definition:

| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd db.query.text = concat(substring(db.query.text, from: 0, to: 15), "*")

DQL query in Notebooks:

fetch spans
| filter db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text))
| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd db.query.text = concat(substring(db.query.text, from: 0, to: 15), "*")
| summarize count(), by: { db.query.text }
| sort `count()` desc

Option 2: Redis command only

This DQL processor extracts the first word in db.query.text by cutting at the first space. This option is great to get only the Redis commands, for example, GET or SET.

DQL processor definition:

| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd blankPos = indexOf(db.query.text.orig, " ")
| fieldsAdd db.query.text = if(blankPos > 0, substring(db.query.text, from: 0, to: blankPos), else: "*")

DQL query in Notebooks:

fetch spans
| filter db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text))
| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd blankPos = indexOf(db.query.text.orig, " ")
| fieldsAdd db.query.text = if(blankPos > 0, substring(db.query.text, from: 0, to: blankPos), else: "*")
| summarize count(), by: { db.query.text }
| sort `count()` desc

Option 3: DB operation name

This DQL processor sets db.query.text to the value of db.operation.name. Use this option when you want to see a Redis operation instead of a database query text.

DQL processor definition:

| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd db.query.text = if(isNotNull(db.operation.name), db.operation.name, else: "*")

DQL query in Notebooks:

fetch spans
| filter db.system == "redis" and (isNotNull(db.statement) or isNotNull(db.query.text))
| fieldsAdd db.query.text = coalesce(db.query.text, db.operation.name)
| fieldsAdd db.query.text.orig = db.query.text
| fieldsAdd db.query.text = if(isNotNull(db.operation.name), db.operation.name, else: "*")
| summarize count(), by: { db.query.text }
| sort `count()` desc

After you've created the DB statement normalization pipeline and activated it by creating a dynamic route, you should see that the cardinality of your Redis statements has significantly reduced when you check the Database queries view in Services. For example, statements like GET as:1:swuq:abc-677d3b, SET as:1:rl:wf:d1d42f, or DECRBY as:1:paec:wis70158 now appear as GET, SET, and DECRBY. Such normalized Redis queries allow for easier and more effective analysis.

Reduce cardinality of temporary queue names

Services includes the Message processing view that aggregates metrics for messaging operations. High cardinality occurs when temporary queues are created with unique identifiers in their names (such as amq.gen-6dggtCpu, async-job-2jrmsi5y, or orders-reply-2n68vy4g), generating thousands of distinct queue names that make aggregations unusable.

Most instrumentations keep the cardinality of messaging.destination.name low by using non-standard fields like messaging.temp.queue.hash for high-cardinality data or by setting messaging.destination.temporary. However, when instrumentation doesn't follow these practices, OpenPipeline processing rules can normalize temporary queue names into patterns or flag them as temporary.

1. Spot high cardinality

Before implementing OpenPipeline processing rules, query your spans to identify messaging systems with high percentages of unique destination names.

Go to Notebooks and select Notebook in the app header to create a new notebook.
Select New section > DQL.

Copy and paste the following query into the edit box and select Run.

fetch spans
| filter isNotNull(messaging.system) and isNotNull(messaging.destination.name)
| summarize count=count(), distinctCount=countDistinct(messaging.destination.name), by:{messaging.system, messaging.destination.temporary}
| fieldsAdd cardinality_ratio = toDouble(distinctCount) / toDouble(count)

Examine the result for high cardinality ratios.

Messaging systems that show high cardinality ratios (above 0.5) without messaging.destination.temporary set indicate queues that would:
- Result in an excessive number of metrics with minimal analytical value.
- Benefit from normalization as described below.

2. Create a pipeline

Now, let's create a pipeline that contains a processing rule for normalizing temporary queue names into patterns or flagging them as temporary.

To create a pipeline

Go to Settings > Process and contextualize > OpenPipeline > Spans.
Go to the Pipelines tab, and select Pipeline.
Enter Queue handling as a pipeline name.
Choose whether you want to normalize temporary queue names into patterns or flag them as temporary.
To add or override the temporary queue flag (messaging.destination.temporary), select Processor > DQL on the Processing tab, and configure a new DQL processor.
Name: Temporary queue selector

Matching condition:

messaging.destination.temporary == false and
matchesPhrase(messaging.destination.name, "odaRequestQueue*")

This condition matches all messaging spans that were detected as not temporary (where messaging.destination.temporary is set to false) and matches the specific odaRequestQueue* destination pattern that should be overridden to be considered temporary.

DQL processor definition:

fieldsAdd messaging.destination.temporary = true

This definition sets messaging.destination.temporary to true.
To rename messaging destinations, select Processor > Add fields on the Processing tab, and configure a new DQL processor.
Name: Destination renamer

Matching condition:

matchesPhrase(messaging.destination.name, "odaRequestQueue*")

This condition matches all spans related to messaging destinations that start with odaRequestQueue.

Add fields: Enter the following values, and then select Add:

Name: messaging.destination.name
Value: odaRequestQueue

By adding this field, you replace the original content of messaging.destination.name with the odaRequestQueue value. In this case, the original value of messaging.destination.name that ends with consecutive numbers is replaced with a static odaRequestQueue that only contains the first, constant part of the destination name.
Select Save in the upper-right corner of the page.

You should see the Queue handling pipeline in the list of pipelines.

3. Create a dynamic route

Now that you have created a new pipeline and defined the required DQL processor, activate this processor by connecting it to OpenPipeline via a dynamic route. This way, your pipeline can receive the span data.

To create a dynamic route

On the Spans page, go to the Dynamic routing tab.
Select Dynamic route.
Define your dynamic route:
- Name: Queue handling
- Matching condition:
```
matchesPhrase(messaging.destination.name, "odaRequestQueue*")
```
  This condition matches all spans related to messaging destinations that start with odaRequestQueue.
- Pipeline: Select the previously created Queue handling pipeline from the list of custom pipelines.
Select Add, and then select Save.

You should see the Queue handling dynamic route in the list of dynamic routes.

After you've created the Queue handling pipeline and activated it by creating a dynamic route, you should see that queues with high cardinality either have messaging.destination.temporary set to true or feature normalized names, which significantly reduces metric cardinality in the Message processing view in Services. To verify this, follow the instructions in Spot high cardinality.

Congratulations!

You've completed this tutorial. You've learned how to normalize outbound calls, Redis commands, and temporary queue names in different views in Services. Reducing cardinality of these entities makes your data more actionable and allows for easier and more effective analysis.