This page provides best practices for Log Management and Log Analytics powered by Grail. Once you've read this page, you'll have the knowledge to optimize how you retain and scan logs—and therefore reduce your costs—while still getting the results you expect.
Here are some best practices that you can use to reduce the size of retained and scanned data.
- Use dedicated buckets
- Use the default_logs bucket as your playground
- Optimize bucket size
- Configure bucket retention periods
- Filter logs on ingest
- Use bucket filters
- Set access permissions
- Use log-based events and metrics
- Use apps with logs in context
- Use DQL best practices
- Track adoption and usage
By following these best practices, you can optimize how you retain and scan logs and reduce your costs while still getting the results you expect.
To learn how to apply some of these best practices in a real-life scenario, see Optimize log retention and reduce scanned data volume. Also, we recommend checking some of the prerequisites before trying to apply the best practices described on this page.
You can also watch a recorded webinar where we discuss these best practices.
By using dedicated buckets to separate your data, you can reduce the amount of data that you need to scan to get the relevant results.
By default, a single query can scan up to 500 GB of data. But how many log records does this represent?
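A back-of-the-envelope calculation can make this concrete. Assuming an average record size of about 1 KB (an assumption; actual sizes vary by workload), the default 500 GB scan limit covers roughly half a billion records:

```python
# Rough estimate: how many log records fit in a 500 GB scan?
avg_record_bytes = 1024          # assumption: ~1 KB per log record
scan_limit_bytes = 500 * 10**9   # 500 GB default per-query scan limit
records = scan_limit_bytes // avg_record_bytes
print(f"{records:,} records")    # 488,281,250 records at 1 KB each
```

At 2 KB per record, the same limit covers about half as many records, so repeat the estimate with your own average record size.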
Creating buckets can help to separate data, but too many buckets can make it cumbersome to access log data.
Use the default_logs bucket as your playground
By default, all log records are sent to the default_logs bucket. Once you start creating other buckets, you can direct certain log records to those buckets.
Then, the only log records that end up in the default_logs bucket are those that you haven't specifically routed to another bucket. This usually, but not always, means that the default_logs bucket contains log records that you don't need to preserve.
At this point you can treat the default_logs bucket as your playground:
If you intentionally use the default bucket for onboarding new data, a good practice is to otherwise keep the bucket empty. That way, if you see new logs in that bucket, you know that you're ingesting logs that aren't assigned to a specific bucket.
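To spot unrouted logs, you can group the default_logs bucket's contents by source. This DQL sketch assumes the standard dt.system.bucket and log.source attributes:

```
fetch logs
| filter dt.system.bucket == "default_logs"
| summarize records = count(), by: { log.source }
| sort records desc
```

Any source that shows up here is a candidate for routing to a dedicated bucket.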
For most use cases, try to keep the volume of daily retained data in a single bucket to around 2–3 TB. This is especially true for frequently queried buckets. (However, it is usually not possible for buckets used to address compliance use cases, where you'll likely retain petabytes worth of log records in a single bucket.)
This will help to ensure the best user experience and performance, especially if users don't follow DQL best practices (such as applying specific filters for time spans or buckets, or increasing the query volume limit with the scanLimitGBytes parameter).
You can set different retention periods for each bucket. This allows you to optimize buckets for individual retention periods, compliance, and cost.
For example:
Log records can be stored from one day up to 10 years. The retention period is defined when you create a bucket and can be reconfigured at any time. For more information about retention periods, see Data retention periods: Log Management and Analytics.
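As an illustration, a bucket definition with a one-year retention period might look like the following payload for the Grail storage management API (field names are assumptions; check the API reference for the exact schema):

```
{
  "bucketName": "audit_logs",
  "table": "logs",
  "displayName": "Audit logs",
  "retentionDays": 365
}
```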
You can filter logs so that non-relevant logs are either sent to a different bucket or deleted outright. To filter logs on ingest, use either OneAgent (see Log ingest rules) or OpenPipeline (see OpenPipeline processing examples).
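For example, an OpenPipeline matching condition that captures debug logs from development namespaces might look like this sketch (the attribute names k8s.namespace.name and loglevel are assumptions based on common log attributes):

```
matchesValue(k8s.namespace.name, "dev-*") and loglevel == "DEBUG"
```

Records matched by such a condition can then be routed to a short-retention bucket or dropped entirely.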
Bucket filters act like permissions at the query level. By adding a bucket filter to a query, you restrict the DQL query to scanning a single bucket, regardless of which buckets the user can access.
This reduces the amount of scanned data and the associated costs, especially with queries used in auto-refreshing dashboards.
Additionally, you can use segments to provide easy filtering by bucket, see Segment logs by bucket.
For more information about bucket filters, see Query and filter logs.
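For example, this DQL sketch restricts a search to a single bucket before matching on content (the bucket name is illustrative):

```
fetch logs
| filter dt.system.bucket == "frontend_logs"
| filter matchesPhrase(content, "error")
| limit 100
```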
By default, a DQL query scans all buckets that the user has access to. To limit the number and kind of buckets a user can access, you can use IAM policies to set access permissions at the individual bucket level.
This way, you don't have to define bucket filters manually with every query.
Policy boundaries in Dynatrace are a modular and reusable way to define access conditions for resource- and record-level permissions. They act as an additional layer of control, refining the scope of permissions granted by IAM policies without the need to create additional specific policies.
By externalizing access conditions, policy boundaries simplify management, ensure consistent enforcement, and improve scalability across large environments. This way, you can assign individual IAM policies to multiple buckets at the same time.
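As a sketch, an IAM policy statement that grants read access to a single bucket could look like the following (the bucket name is illustrative; verify the exact permission and condition names in the IAM policy reference):

```
ALLOW storage:buckets:read WHERE storage:bucket-name = "audit_logs";
ALLOW storage:logs:read;
```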
For more information about access permissions, see the following page:
You can create events and metrics from log records.
To convert log queries to log-based metrics, see Optimize performance and costs of dashboards running log queries. After you've extracted metrics, you can delete the log records. This is especially useful for aggregated information where access to the raw record isn't important.
You can use log-based events and metrics for alerting, instead of log queries. For more information, see Set up alerts based on events extracted from logs and Set up custom alerts based on metrics extracted from logs.
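For instance, this DQL sketch aggregates error counts into a timeseries; once the same aggregation is extracted as a metric, the raw records behind it may no longer need long retention (the loglevel value is illustrative):

```
fetch logs
| filter loglevel == "ERROR"
| makeTimeseries errors = count(), interval: 5m
```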
Some apps, such as Kubernetes, let you see logs in context. This lets you scan only the logs that are relevant to a specific use case. For more information, see Use logs in context to troubleshoot Kubernetes (K8s) issues.
Viewing logs in context using the Dynatrace apps is zero-rated and therefore free of charge. This includes features like surrounding logs (viewing related log entries) and drill-down views, for example, changing from a trace view to a topology view.
For your log records, you can also use the following Dynatrace apps:
Since you use DQL to access log records, follow DQL best practices to create optimized queries.
For more information, see DQL best practices.
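Putting several of those practices together, an optimized query limits the timeframe, restricts the bucket, and selects only the fields it needs (names are illustrative):

```
fetch logs, from: now() - 2h
| filter dt.system.bucket == "frontend_logs"
| filter matchesPhrase(content, "timeout")
| fields timestamp, content, log.source
| limit 20
```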
The best way to learn about usage and adoption is with Dynatrace ready-made dashboards. You can find these in Dashboards > Ready-made.
Use the dashboards to learn more about consumption, ingested and retained volumes, query patterns, bucket utilization, and more.