Logs are stored in Grail buckets with a retention period from 10 days to 10 years. By default, log data is stored for 35 days in the default_logs bucket.
Learning outcome
After completing this tutorial, you'll be able to:
Store log data in a custom bucket for a specific user group or with a longer retention period of up to 10 years.
Skip storage of log data from a specific ingest source or based on matching conditions.
Manage how queries are billed for a log bucket.
Target audience
This tutorial is intended for Site Reliability Engineers (SREs) and architects who want to configure storage and retention settings for access control, optimization, or compliance purposes.
Prerequisites
You need the openpipeline:configurations:write and openpipeline:configurations:read permissions. To learn how to set up the permissions, see Permissions in Grail.
Example 1: Retain logs for three years
Using buckets can improve query performance by reducing both query execution time and the amount of data read. With this procedure, you create a new bucket with a custom retention period for your log data. Log records that match the route and the pipeline conditions are stored for the retention period of the chosen bucket and are readable by users according to their permissions.
Using a custom log bucket, you can:
Store log data with the same retention period.
Store log data that needs to be queried and analyzed together.
Store log data that needs to be deleted at the same time.
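As a rough illustration of what a three-year bucket definition could look like, the sketch below builds a definition and checks the retention period against the 10-day to 10-year range stated above. The field names (bucketName, table, displayName, retentionDays), the bucket name logs_3y, and the 365-day year are assumptions made for this example; check them against the bucket management reference for your environment before relying on them.

```python
# Sketch only: field names, bucket name, and the upper bound are assumptions.
MIN_RETENTION_DAYS = 10          # lower bound stated in this tutorial
MAX_RETENTION_DAYS = 10 * 365    # "10 years", approximated with 365-day years


def build_bucket_definition(bucket_name: str, display_name: str, retention_days: int) -> dict:
    """Return a bucket definition after checking the retention bounds."""
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention_days must be between {MIN_RETENTION_DAYS} "
            f"and {MAX_RETENTION_DAYS}, got {retention_days}"
        )
    return {
        "bucketName": bucket_name,       # hypothetical field name
        "table": "logs",                 # hypothetical field name
        "displayName": display_name,     # hypothetical field name
        "retentionDays": retention_days, # hypothetical field name
    }


# Three years of retention, expressed in days.
three_year_bucket = build_bucket_definition("logs_3y", "Logs (3 years)", 3 * 365)
```

The function only prepares and validates the definition; it does not send it anywhere, so it can be run safely without access to an environment.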
Go to Settings > Process and contextualize > OpenPipeline > Logs > Pipelines, then choose an existing pipeline or create a new one.
In the Storage stage, select Processor > No storage assignment.
Enter the processor name and matching condition.
Select Save.
Make sure your pipeline is receiving records via a dynamic route.
Go to Dynamic routing.
Choose an existing dynamic route or create a new one.
Define the route by entering a route name, a matching condition (for example, true), and the target pipeline name.
Select Save.
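The way the route and the storage processor interact can be pictured with a small model. This is a conceptual sketch, not the product's implementation: the Route and StorageProcessor classes, the first-match-wins ordering, and the fallback to default_logs when no processor matches are assumptions made for illustration, and matching conditions are written as Python functions rather than in the product's own condition syntax.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Record = dict  # a log record, modeled as a plain dictionary


@dataclass
class Route:
    name: str
    matches: Callable[[Record], bool]  # the route's matching condition
    pipeline: str                      # target pipeline name


@dataclass
class StorageProcessor:
    name: str
    matches: Callable[[Record], bool]  # the processor's matching condition
    bucket: Optional[str]              # None models "No storage assignment"


def resolve_pipeline(record: Record, routes: list) -> Optional[str]:
    """Return the pipeline of the first route whose condition matches (assumed ordering)."""
    for route in routes:
        if route.matches(record):
            return route.pipeline
    return None


def resolve_bucket(record: Record, processors: list,
                   default_bucket: str = "default_logs") -> Optional[str]:
    """Return the target bucket, or None if the record is not stored."""
    for processor in processors:
        if processor.matches(record):
            return processor.bucket
    return default_bucket


# A route whose matching condition is simply "true", and a processor that
# skips storage for debug records (both names are hypothetical).
routes = [Route("all-logs", lambda r: True, "my-pipeline")]
processors = [StorageProcessor("skip-debug", lambda r: r.get("loglevel") == "DEBUG", None)]
```

With these definitions, every record is routed to my-pipeline; a record with loglevel DEBUG resolves to no bucket, and any other record falls back to default_logs.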
Example 3: Manage query billing per bucket
There are two retention models that you can configure on a per-bucket basis:
Usage-based: Each query execution is charged separately.
Retain with Included Queries: Log data for the defined timeframe is included in the retention cost. Querying this data does not incur additional costs.
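To compare the two models for a given bucket, it can help to estimate the number of queries at which they cost the same. The sketch below assumes a flat charge per query execution and a fixed retention cost for each model; the function names and all figures are hypothetical and are not actual prices.

```python
def usage_based_cost(retention_cost: float, queries: int, cost_per_query: float) -> float:
    """Usage-based: retention cost plus a separate charge for each query execution."""
    return retention_cost + queries * cost_per_query


def included_queries_cost(retention_cost_with_queries: float) -> float:
    """Retain with Included Queries: queries add nothing to the retention cost."""
    return retention_cost_with_queries


def break_even_queries(retention_cost: float,
                       retention_cost_with_queries: float,
                       cost_per_query: float) -> float:
    """Number of queries at which both models cost the same."""
    return (retention_cost_with_queries - retention_cost) / cost_per_query


# Hypothetical figures in arbitrary currency units, for illustration only.
threshold = break_even_queries(
    retention_cost=100.0,
    retention_cost_with_queries=160.0,
    cost_per_query=0.5,
)
```

Under these made-up figures the threshold is 120 queries: a bucket queried more often than that would be cheaper with included queries, and one queried less often would be cheaper with usage-based billing.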
Buckets are the foundation of log management. Set them up correctly to avoid data silos and optimize retention; a few best practices can make a big difference in performance and cost.