The timeseries command is a starting command of DQL. It combines loading, filtering and aggregating metrics data into a time series output.
timeseries [column =] aggregation(metricKey [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:]) [, [column =] aggregation(metricKey, ...), ...] [, by:] [, filter:] [, union:] [, nonempty:] [, interval: | bins:] [, from:] [, to:] [, timeframe:] [, shift:] [, bucket: bucket, …]
timeseries usage=avg(dt.host.cpu.usage)
timeseries min_cpu=min(dt.host.cpu.usage), max(dt.host.cpu.usage, default:99.9), by:dt.entity.host, filter:in(dt.entity.host, "HOST-1", "HOST-2"), interval:1h, from:-7d
The timeseries command produces homogeneous time series of aggregated data: all series have identical start and end timestamps, time interval, and number of elements. The timeframe column holds the start and end timestamps. The interval column holds the time interval expressed as a duration. Each aggregation (sum, avg, min, max, count, percentile and countDistinct) produces a column with the specified column name or a name derived from the aggregation expression. Each aggregation cell contains the entire array of aggregated values, one for each time slot defined by timeframe and interval.
Here is an example of the result of the timeseries command. Note that:
- The min_cpu column (min(dt.host.cpu.usage)) has no default parameter, hence it can contain null for empty time slots.
- The max(dt.host.cpu.usage) column has a default parameter, hence the empty time slots are replaced with the default value (99.9 in this example).

| timeframe | interval | dt.entity.host | min_cpu | max(dt.host.cpu.usage) |
|---|---|---|---|---|
{"start":"2022-10-24T07:00:00","end":"2022-10-31T07:00:00"} | "1h" | HOST-1 | [35.1,35.9,35.5,36.7,...,37.9,39.4] | [36.9,37.8,38.8,38.8,...,38.6,39.5] |
{"start":"2022-10-24T07:00:00","end":"2022-10-31T07:00:00"} | "1h" | HOST-2 | [24.9,25.1,null,25.0,...,23.8,24.5] | [30.9,31.3,99.9,32.7,...,33.1,37.1] |
Seven aggregation functions are available to use with the timeseries command. These functions are:
- sum - Calculates the sum of the metric in each time slot.
- avg - Calculates the average of the metric in each time slot.
- min - Calculates the minimum of the metric in each time slot.
- max - Calculates the maximum of the metric in each time slot.
- count - Calculates the number of distinct metric series in each time slot (cardinality).
- percentile - Calculates the percentile of the metric in each time slot. It applies to metrics of type histogram. If the metric is a gauge or a count, the rollup parameter needs to be specified.
- countDistinct - Calculates an approximate count of distinct values in each time slot. It applies to metrics of type cardinality.

It is also possible to use the following functions. However, they need to be used with one of the above aggregation functions:
- start - Generates the start timestamp of the bin.
- end - Generates the end timestamp of the bin.

These functions have the following syntax:
- sum(metric.key [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:])
- avg(metric.key [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:])
- min(metric.key [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:])
- max(metric.key [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:])
- count(metric.key [, filter:] [, default:] [, scalar:])
- percentile(metric.key, percentile [, filter:] [, default:] [, rollup:] [, rate:] [, scalar:])
- countDistinct(metric.key [, filter:] [, default:] [, scalar:])
- start()
- end()

| Parameter | Type | Description | Required |
|---|---|---|---|
metric key | metric key identifier | The metric key the series should be created for. | required |
rollup | enumeration | The time rollup that should be used for the aggregation. Can be min, max, sum, avg, or total. | optional |
default | number | The default value that should be used to fill empty time slots. The default value is null. | optional |
rate | duration | The duration that should be used to adjust the values: each aggregated value is divided by the query interval and multiplied by this duration (value / interval × rate). | optional |
percentile | double, long | The nth-percentile, such that approximately n percent of observed measurements fall below this value. Must be between 0 and 100. | required |
scalar | boolean | Flag to indicate that a single value spanning the whole timeframe shall be calculated. Can be used on any aggregation function except | optional |
filter | boolean | An additional filter condition that shall be applied to the source records before time/space aggregation. If all aggregations filter on the same condition, the global filter parameter can be used instead. | optional |
timeseries sum(dt.host.availability)
timeseries avg(dt.host.cpu.idle, rollup: avg, default: 0, rate: 10m)
timeseries min(dt.host.cpu.iowait, default: -1), by: dt.entity.host
timeseries max(dt.host.cpu.iowait, default: -1), by: dt.entity.host
timeseries count(dt.host.cpu.load), by: dt.entity.host
timeseries time=percentile(dt.service.request.response_time, 99.9), by:{dt.entity.service}
| limit 5
timeseries countDistinct(dt.frontend.user.active.estimated_count)
timeseries min(dt.host.cpu.load), start(), end()
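The per-aggregation filter and scalar parameters from the table above are not shown in these examples. As a minimal sketch, assuming the hypothetical http_requests metric with a code dimension that also appears in the nonempty examples below, they could be used like this:
timeseries errors=sum(http_requests, filter: {code == 503})
timeseries cpu_overall=avg(dt.host.cpu.usage, scalar: true)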
Multiple timeseries can be aggregated using the summarize command with an iterative expression in the aggregation function. The min, max, sum and avg aggregation functions support iterative expressions.
This example shows how to use the aggregation functions with iterative expressions.
timeseries usage=avg(dt.host.cpu.usage), by: dt.entity.host
| fieldsAdd entityName(dt.entity.host)
| filter dt.entity.host.name == "EasyTrade"
| summarize usage=avg(usage[]), by:{timeframe, interval}
When used with the timeseries command, the percentile aggregation returns an estimated percentile. The estimates are not guaranteed to be exact but are accurate to ~2.2%. For example, if the estimated 90th percentile is 679 ms, then the exact, actual 90th percentile is between 664 - 693 ms.
In exceptional circumstances, such as exceptionally skewed distributions, a higher error margin is possible.
Two metrics support timeseries percentile without a rollup parameter:
- dt.service.request.response_time
- dt.service.request.service_mesh.response_time

All other metrics must use the rollup parameter, for example, to calculate the 90th percentile of average CPU usage:
timeseries percentile(dt.host.cpu.usage, 90, rollup:avg)
When you use the percentile function for buckets from histograms ingested via OTLP or from Prometheus, the estimated percentile cannot be smaller than float.MIN_VALUE or larger than float.MAX_VALUE.

Because the timeseries command produces homogeneous time series, data that is missing for a particular time slot is filled with null. Specifying a default parameter fills empty time slots with the default parameter value instead of null.
The timeseries command might not return any data. This can be a challenge when combining two metrics, for example, when calculating the percentage of all HTTP responses that are HTTP 503 responses. If the timeseries command returns no records (no HTTP 503 responses were found), the expected result is 0%, but the actual result is empty, because the default value of the nonempty parameter is false.
timeseries http_503=sum(http_requests), filter:{code==503}
| join on:interval, [timeseries http_total=sum(http_requests)], fields:{http_total}
| fieldsAdd ratio=http_503[]/http_total[]*100
Query result:
| http_503 | http_total | ratio |
|---|---|---|
No records | ||
To achieve the desired result, you can combine the nonempty and default parameters.
timeseries http_503=sum(http_requests, default:0), filter:{code==503}, nonempty:true
| join on:interval, [timeseries http_total=sum(http_requests)], fields:{http_total}
| fieldsAdd ratio=http_503[]/http_total[]*100
Query result:
| … | http_503 | http_total | ratio |
|---|---|---|---|
| … | [0,0,0,0] | [1,5,2,4] | [0,0,0,0] |
The rate parameter divides the aggregated values by the query interval and scales them to the specified duration. For instance, if timeseries sum(dt.requests.failed) returns [300,300,600,900] with a 5m interval, then timeseries sum(dt.requests.failed, rate:1s) returns [1,1,2,3].
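As a worked check of this normalization: each slot value is divided by the interval and multiplied by the rate duration, so with a 5m (300 s) interval the slot value 300 becomes 300 / 300 s × 1 s = 1 for rate:1s, and would become 300 / 300 s × 60 s = 60 for rate:1m.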
The union parameter controls the way multiple series are combined when series are absent from one or more columns. The default is union:false, meaning that only series present in all columns are returned (equivalent to an intersection of the results, or SQL's INNER JOIN). Specifying union:true returns all matching series, with possibly empty columns (equivalent to a union of the results, or SQL's OUTER JOIN).
For example, assume two metrics, dt.requests.failed and dt.requests.success, reported by three hosts:
Results of timeseries failed=sum(dt.requests.failed), by:host
| host | failed |
|---|---|
HOST-1 | [1,1,1,1] |
HOST-2 | [2,2,2,2] |
Results of timeseries success=sum(dt.requests.success), by:host
| host | success |
|---|---|
HOST-2 | [20,20,20,20] |
HOST-3 | [30,30,30,30] |
Results of timeseries failed=sum(dt.requests.failed), success=sum(dt.requests.success), by:host
| host | failed | success |
|---|---|---|
HOST-2 | [2,2,2,2] | [20,20,20,20] |
Results of timeseries failed=sum(dt.requests.failed), success=sum(dt.requests.success), by:host, union:true
| host | failed | success |
|---|---|---|
HOST-1 | [1,1,1,1] | [null,null,null,null] |
HOST-2 | [2,2,2,2] | [20,20,20,20] |
HOST-3 | [null,null,null,null] | [30,30,30,30] |
The timeseries command automatically calculates an appropriate time interval derived from the query timeframe. The timeframe is divided into time slots of identical time intervals, and data is then rolled up into each of these time slots so that the number of points per time series is suitable for graphing. For instance, to graph a metric over a 1-day timeframe, it is more manageable to use 10-minute interval data (144 points) than 1-minute interval data (1,440 points).
You can influence the calculated time interval by specifying either a custom interval parameter or, via the bins parameter, the desired number of time slots.
The interval and bins parameters are exclusive. Both parameters are optional.
If specified, the bins parameter (range: 12–1,500) is used to calculate an equivalent nominal time interval; a minimum bins count of 12 is applied to this calculation.
The nominal time interval, whether it comes from the bins or the interval parameter, is then adjusted to the nearest supported interval.
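For example, either of the following sketches (reusing the CPU usage metric from the earlier examples) pins the slot size explicitly or requests an approximate number of time slots:
timeseries avg(dt.host.cpu.usage), interval: 30m
timeseries avg(dt.host.cpu.usage), bins: 100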
Data points are aggregated across time into time slots to deal with potentially large amounts of data. For instance, a day's data is combined into 10-minute time slots. This aggregation is called a rollup and happens in every timeseries query.
The function used for this rollup defaults to the main aggregation function. For instance, assuming a metric with a host dimension, timeseries min(dt.host.cpu.usage) combines data into time slots using the min function for each time slot and each host, and then aggregates using the min function again across all hosts within each time slot, effectively performing a min of the min, as expected.
The rollup parameter can be used if it's necessary to specify a time aggregation function independently from the main aggregation, such as the average of the sums in the following example.
timeseries failed = avg(dt.requests.failed, rollup:sum)
The rollup parameter supports the following functions: min, max, sum, avg, and total.
The shift parameter of the timeseries command allows you to compare two different timeframes for a metric series.
The shift parameter shifts the timeframe specified in the query parameters and maps the resulting data points to timestamps from the original timeframe.
A positive argument shifts the timeframe into the future; a negative argument shifts the timeframe into the past.
For example, a timeframe from March 12, 2021 13:00 CET to 15:00 CET and a time shift of -1d (one day into the past) will result in the data points being queried for the timeframe from March 11, 2021 13:00 CET to 15:00 CET. Timestamps in the response will be aligned to the original timeframe. For example, the data point with a timestamp of March 11, 2021 13:30 CET will be returned as March 12, 2021 13:30 CET.
In this example, we compare two timeframes.
timeseries avail=avg(dt.host.disk.avail), by:{dt.entity.host}, from:-24h
| append [timeseries avail.7d=avg(dt.host.disk.avail), by:{dt.entity.host}, shift:-7d]
You need the corresponding storage:metrics:read and storage:buckets:read permissions to read metrics data using the timeseries command. For details, see Permissions in Grail.