Optimize log retention and reduce scanned data volume

Latest Dynatrace
Tutorial
10-min read
Published Mar 08, 2026

Grail is capable of scanning petabytes of log data with high performance. However, the more you scan, the higher your log query consumption is. This is even true if the query doesn't return any data, as each scanned log record contributes to the total scanned bytes volume.

Follow this tutorial to reduce the size of retained and scanned log data, while still getting the expected results. This tutorial showcases how to apply some of the best practices for Log Management and Analytics in a real-life scenario.

Who is this for?

This tutorial is intended for Dynatrace administrators setting up log monitoring.

What will you learn?

Apply some of the best practices for Log Management and Analytics to optimize how you retain and scan logs.

In this tutorial, you'll learn how to:

Set up log ingestion.
Use dedicated log buckets.
Filter logs on ingest.
Set up user permissions with boundaries.

Prerequisites

Here are some things to think about before you start so that you can make an effective plan to optimize logs in Dynatrace.

Understand your log ingest sources and volumes

Different sources use different ways to send log data. By estimating your daily ingest volume, you can better decide on data partition and segmentation.

For more about collecting and ingesting data, see Log ingestion.

Identify usage patterns, log types, and retention needs

Classify your log data and think about compliance and privacy requirements.

Classification: Which log types will you use?

Common log types are:
- Application logs: Frequently used, typically for troubleshooting or alerting.
- Audit logs: These must be stored for a longer period of time to fulfill compliance requirements.
  
  They are not regularly used, and are usually accessed by only a few people.
- Network logs: Can be your webserver logs, CDN, network devices.
  
  These have a very high volume, and are therefore important to aggregate.
  
  They are mostly consumed via Dashboards, and are potentially good candidates for Log-based metrics extraction for cost-efficient monitoring and alerting.
- Other log types: Any other logs your system generates, such as those used for troubleshooting, investigations, dashboarding, business analytics, or automations.
Compliance: How long do you need to store logs?

Retention time is defined at the bucket level.
Privacy: Are there specific requirements that demand data masking?

You can redact data on ingest with OneAgent, or during ingest processing with OpenPipeline.

Plan your buckets and permissions models

Grail organizes data in buckets. Buckets behave like folders in a file system and are designed for records that should be handled together. These could be, for example, data that:

Has the same retention period.
Needs to be queried/analyzed together.
Is accessed by the same user group
Serves the same use case.
Needs to be deleted at the same time.

For more about buckets, see Configure data storage and retention for logs.

Background

Let's assume that your organization has already prepared a plan for data segmentation, but hasn't yet configured anything in Dynatrace.

Your organization has two main user groups:

Developers: Responsible for e-commerce applications running on Kubernetes and AWS Lambda.
The CloudsOps team: Responsible for all AWS resources, including AWS Lambda and CloudFront.
The Platform team: Responsible for Kubernetes and On-Prem.
The Security team: Responsible for all audit and security logs.

Dynatrace ingests and retains the following types of log data, described in the table below.

Log type	Source	Daily ingest size	Bucket name	Retention	Relevant user group
Infrastructure logs	Kubernetes system logs monitored with OneAgent (journald)	2 TB	`infra_logs`	90 days	Platform
Application logs	Kubernetes monitored with OneAgent	2 TB	`app_logs`	60 days	Developers and Platform
Application logs	Lambda monitored with Lambda Layer	1 TB	`app_logs`	60 days	Developers and CloudOps
Access logs	CloudFront logs sent via Kinesis	3 TB	`access_logs`	365 days	CloudOps
Audit logs	AWS Resource Audit Logs	2 GB	`audit_logs`	3650 days (10 years)	Security

Set up log ingestion and apply best practices

First, you'll need to set up log ingestion. After that, you'll apply some of the best practices for Log Management and Analytics.

1. Set up log ingestion

To set up log ingestion, follow the steps described in Log ingestion.

By default, all log data is ingested into the default_logs bucket. Ideally, after you have implemented all the best practices, only admins should have access to this bucket. Bucket permissions should follow the principle of least privilege, in which individual users have access to just the buckets that they're required to query or visualize.

2. Verify that you are ingesting and retaining log data

There are two ways that you can verify data is ingested and retained.

The Log ingest overview ready-made dashboard, available in Dashboards, lets you check ingested log volumes.
Use Logs or Notebooks to fetch logs from any bucket and validate if the ingested data arrives correctly and looks as expected. Run the DQL query shown below.
```
fetch logs
| filter dt.system.bucket == "default_logs"
```

If you don't see any log data, see Troubleshooting Log Management and Analytics for troubleshooting tips.

3. Use dedicated buckets

This step creates a dedicated bucket for certain data.

To create a bucket, go to Settings > Storage management and select + Bucket.
Set the bucket name and display name. For this example, set both to access_logs.
Set the retention period, in days. For this example, set the period to 365.
Set the bucket table type. For this example, set the type to logs.
Optional Select Retain with Included Queries and define the included query retention period.

For more info about Retain with Included Queries, see Take control of log query costs using Retain with Included Queries.
Select Create to save the bucket.

4. Filter logs on ingest

OpenPipeline handles log ingestion from all sources and allows processing, transformation and bucket assignment before logs are stored in Grail.

For this example, let's use OpenPipeline to filter logs on ingest. We'll configure a pipeline that processes CloudFront logs and stores them in the access_logs bucket.

1. Set up the pipeline

Go to OpenPipeline and select Logs.
In the Pipelines tab, select + Pipeline to create a new pipeline.
Name the pipeline AWS CloudFront logs.
Add technology bundle processors.
- In the Processing tab, select + Processor > Technology bundle.
- Open the AWS group and select AWS Common.
- Select Choose.
- Select + Processor > Technology bundle again.
- Open the AWS group and this time select Amazon CloudFront.
- Select Choose.
Select Save to save the configuration.
Get the pipeline ID, which you'll need to filter logs later.
- In the Pipelines tab, select the AWS CloudFront logs pipeline.
Depending on how many pipelines are configured, you may need to select > to get to the right page.
- The pipeline ID is visible immediately underneath the pipeline's title.
In our example, the ID might be pipeline_AWS_cloudfront_logs_5498.

2. Set up dynamic routing

While still in OpenPipeline, open the Dynamic routing tab and then select + Dynamic route.
Enter a name.
Set the matching condition to matchesValue(aws.log_stream, "CloudFront_*").
Use the drop-down to select the AWS CloudFront logs pipeline that you just created.
Select Add to create the dynamic route.
Select Save to save your changes to the table.

3. Verify ingestion

To verify the configuration, go to Logs or Notebooks Notebooks and run the following query. This checks if the pipeline is processing the most recently ingested logs.

fetch logs
| filter dt.openpipeline == {pipeline_AWS_cloudfront_logs_5498}

4. Assign logs to the bucket

While still in OpenPipeline, open the Storage tab and select + Processor > Bucket assignment.
Set the bucket's name.

For this example, set the name to AWS access logs.
Leave the matching condition set to true.
In the Storage drop-down menu, select the access_logs bucket you already created.
Select Save.

5. Set up user permissions with boundaries

This step grants users access to only specific buckets.

1. Create a new boundary

Open Account Management > Identity & access management > Policy management and then select the Boundaries tab.
Select + Boundary to create a new boundary.
- Enter a boundary name. For this example, use access_log read.
- Set the boundary query. For this example, use storage:bucket-name = "access_logs";.
Select Save.

2. Create a new user group

Open Account Management > Identity & access management > Group management.
Select + Group to create a new group.

For this example, name the new group CloudOps.
Select Create to create the group.

The View group page appears.
Select + Permission to add a new permission.
- Use the drop-down menu to select the Read Logs permission.
- Under Scope, set the appropriate scope at the account or environment level.
- Under Boundaries, use the drop-down menu to select the access_logs read boundary that you previously created.
Select Save.

3. Assign users to the group

Open Account Management > Identity & access Management > User Management.
For each user that you want to assign to the CloudOps group, select > Edit.
Select the checkbox next to the CloudOps group.

You may need to search for the group using the Filter groups text field.
Select Save to save that user assignment.
Continue with all other users, as appropriate.
When you have assigned all relevant users, you can close the window or continue to use Dynatrace.

Congratulations!

You've completed this tutorial. You now have an efficient log setup:

You've reduced the amount of data that you need to scan by using dedicated buckets.
Non-relevant logs are either sent to a different bucket or deleted.
Your users now have access only for the relevant buckets.