DQL Metric commands

timeseries

The timeseries command is a starting command of DQL. It combines loading, filtering and aggregating metrics data into a time series output.

Syntax

timeseries [column =] aggregation(metricKey [, default:] [, rollup:] [, rate:]) [, [column =] aggregation(metricKey, ...), ...] [, by:] [, filter:] [, union:] [, nonempty:] [, interval: | bins:] [, from:] [, to:] [,timeframe:] [,shift:] [, scalar:]

Basic examples

Example 1
timeseries usage=avg(dt.host.cpu.usage)
Example 2
timeseries min_cpu=min(dt.host.cpu.usage), max(dt.host.cpu.usage, default:99.9), by:dt.entity.host, filter:in(dt.entity.host, "HOST-1", "HOST-2"), interval:1h, from:-7d

Timeseries response

The timeseries command produces homogenous time series of aggregated data: all series have identical start and end timestamps, time interval and number of elements. The timeframe column holds the start and end timestamps. The interval column holds the time interval expressed as a duration. Each aggregation (min, max, sum, avg) produces a column with the specified column name or a name derived from the aggregation expression. Each aggregation cell consists of the entire array of aggregated values for each timeslot defined by timeframe and interval.

Here is an example of the result of the timeseries command. Note that:

  • the first aggregation column name has been specified in the query (min_cpu)
  • the second aggregation column name has not been specified in the query, hence the name is derived from the expression (max(dt.host.cpu.usage))
  • the first aggregation does not specify a default parameter, hence it can contain null for empty time slots
  • the second aggregation does specify a default parameter, hence the empty time slots are replaced with the default value (99.9 in this example)
timeframe
interval
dt.entity.host
min_cpu
max(dt.host.cpu.usage)
{"start":"2022-10-24T07:00:00","end":"2022-10-31T07:00:00"}
"1h"
HOST-1
[35.1,35.9,35.5,36.7,...,37.9,39.4]
[36.9,37.8,38.8,38.8,...,38.6,39.5]
{"start":"2022-10-24T07:00:00","end":"2022-10-31T07:00:00"}
"1h"
HOST-2
[24.9,25.1,null,25.0,...,23.8,24.5]
[30.9,31.3,99.9,32.7,...,33.1,37.1]

Aggregation functions

Six aggregation functions are available to use with the timeseries command. These functions are:

  • sum - Calculates the sum of the expression values in each bucket.
  • avg - Calculates the average of the expression values in each bucket.
  • min - Calculates the minimum of the expression values in each bucket.
  • max - Calculates the maximum of the expression values in each bucket.
  • count - Counts the number of records in each bucket.
  • percentile - Calculates the requested percentile of the expression value in each bucket.

It is also possible to use the following functions. However, they need to be used with one of the above aggregation functions:

  • start - Generates the start timestamp of the bin.
  • end - Generates the end timestamp of the bin.
Syntax

These functions have the following syntax:

  • sum(metric.key [, rollup] [, default] [, rate])
  • avg(metric.key [, rollup] [, default] [, rate])
  • min(metric.key [, rollup] [, default] [, rate])
  • max(metric.key [, rollup] [, default] [, rate])
  • count(metric.key [, default])
  • percentile(metric.key, percentile [, rollup] [, default] [, rate])
  • start()
  • end()
Parameters
ParameterTypeDescriptionRequired

metric key

metric key identifier

The metric key the series should be created for.

required

rollup

enumeration

The time rollup that should be used for the aggregation. Can be min, max, sum, avg, total.

optional

default

number

The default value that should be used to fill gaps/empty bins. The default value is null.

optional

rate

duration

The duration that shall be used to adjust the bin values using the following formula: (binValue / interval) * rate.

optional

percentile

double, long

The nth-percentile, such that approximately n percent of observed measurements fall below this value. Must be between 0 and 100.

required

scalar

boolean

Flag to indicate that a single scalar value spanning the whole timeframe shall be calculated. Can be used on any aggregation function except start and end.

optional

filter

boolean

An additional filter condition that shall be applied on the source records before time-/space-aggregation. If all aggregations are filtering on the same condition, the global filter parameter can be used over specifying the same condition in each aggregation individually.

optional

Example 1
timeseries sum(dt.host.availability)
Example 2
timeseries avg(dt.host.cpu.idle, rollup: avg, default: 0, rate: 10m)
Example 3
timeseries min(dt.host.cpu.iowait, default: -1), by: dt.entity.host
Example 4
timeseries max(dt.host.cpu.iowait, default: -1), by: dt.entity.host
Example 5
timeseries count(dt.host.cpu.load), by: dt.entity.host
Example 6
timeseries time=percentile(dt.service.request.response_time, 99.9),
by:{dt.entity.service}
| limit 5
Example 7
timeseries min(dt.host.cpu.load), start(), end()
Example 8
timeseries requests=sum(dt.service.request.count),
requests_overall=sum(dt.service.request.count, scalar:true),
by:{dt.entity.service}
Example 9
timeseries failed=sum(dt.service.request.count, filter:failed==true),
all=sum(dt.service.request.count),
by:{dt.entity.service}
| fieldsAdd failed_ratio=failed[]/all[]*100

Aggregating multiple timeseries

Multiple timeseries can be aggregated using the summarize command with an iterative expression in the aggregation function. The min, max, sum and avg aggregation functions support iterative expressions.

Example

This example shows how to use the aggregation functions with iterative expressions.

timeseries min=min(dt.host.cpu.usage), by: dt.entity.host
| summarize sum(min[]) / 2

Percentile function

When used with the timeseries command, the percentile function returns an estimated percentile. The estimates are not guaranteed to be exact but are accurate to ~2.2%. For example, if the estimated 90th percentile is 679 ms, then the exact, actual 90th percentile is between 664 - 693 ms.

In exceptional circumstances, a higher error guarantee may be necessary to optimize for exceptionally skewed distributions. Two metrics support timeseries percentile without a rollup parameter:

  • dt.service.request.response_time
  • dt.service.request.service_mesh.response_time

All other metrics must use the rollup parameter, for example, to calculate the 90th percentile average CPU usage:

timeseries percentile(dt.host.cpu.usage, 90, rollup:avg)

When you use the percentile function for buckets from histograms ingested via OTLP or from Prometheus:

  • It isn't guaranteed to align with the PromQL implementation.
  • Percentile estimates cannot be smaller than float.MIN_VALUE or larger than float.MAX_VALUE.

Default value for empty time slots

The timeseries command produces homogenous time series of aggregated data: all series have identical start and end timestamps, time interval and number of elements. If data is missing for a particular time slot, it is filled with null. Specifying a default parameter fills empty time slots with the default parameter value instead of null.

The timeseries command might not return any data. This can be a challenge when combining two metrics—for example, when calculating the percentage of all HTTP responses that are HTTP 503 responses. If the timeseries command returns no records (no HTTP 503 responses were found), the expected result is 0%, but the actual result is empty, because, the default value of a nonempty parameter is false.

Example 1
timeseries http_503=sum(http_requests),
filter:{code==503}
| join on:interval, [timeseries http_total=sum(http_requests)], fields:{http_total}
| fieldsAdd ratio=http_503[]/http_total[]*100

Query result:

http_503http_totalratio

No records

Example 2

To achieve the desired result, you can combine the nonempty and default parameters.

timeseries http_503=sum(http_requests, default:0), filter:{code==503}, nonempty:true
| join on:interval, [timeseries http_total=sum(http_requests)], fields:{http_total}
| fieldsadd ratio=http_503[]/http_total[]*100

Query result:

http_503
http_total
ratio
[0,0,0,0]
[1,5,2,4]
[0,0,0,0]

Rate normalization

The rate parameter divides the aggregated timeseries by the interval to normalize the timeseries to the selected specified duration. For instance, if timeseries sum(dt.requests.failed) returns [300,300,600,900] with a 5m interval, then timeseries sum(dt.requests.failed, rate:1s) would return [1,1,2,3].

Join behavior (union parameter)

The union parameter controls the way multiple series are combined when series are absent from one or more columns. The default is union:false meaning only series that are present on all columns are returned (equivalent to an intersection of the results, or SQL's INNER JOIN). Specifying union:true results in all matching series with possibly empty columns (equivalent to an union of the results, or SQL's OUTER JOIN).

For example, assuming two metrics dt.requests.failed and dt.requests.success and 3 hosts,

Results of timeseries failed=sum(dt.requests.failed), by:host

host
failed
HOST-1
[1,1,1,1]
HOST-2
[2,2,2,2]

Results of timeseries success=sum(dt.requests.success), by:host

host
success
HOST-2
[20,20,20,20]
HOST-3
[30,30,30,30]

Results of timeseries failed=sum(dt.requests.failed), success=sum(dt.requests.success), by:host

host
failed
success
HOST-2
[2,2,2,2]
[20,20,20,20]

Results of timeseries failed=sum(dt.requests.failed), success=sum(dt.requests.success), by:host, union:true

host
failed
success
HOST-1
[1,1,1,1]
[null,null,null,null]
HOST-2
[2,2,2,2]
[20,20,20,20]
HOST-3
[null,null,null,null]
[30,30,30,30]

Time interval

The timeseries command automatically calculates an appropriate time interval derived from the query timeframe. The timeframe is divided into time slots of identical time intervals, and data is then rolled up into each of these time slots so that the number of points per time series is suitable for graphing. For instance, to graph a metric over a 1-day timeframe, it is more manageable to use 10-minute interval data (144 points) than 1-minute interval data (1,440 points).

You can influence the calculated time interval by specifying either a custom interval parameter or, via the bins parameter, the desired number of time slots.

Interval and bins parameters

The interval and bins parameters are exclusive. Both parameters are optional.

If specified, the bins parameter (range: 12–1,500) is used to calculate an equivalent nominal time interval. A minimum bins count of 12 is applied to calculate the nominal time interval.

The nominal time interval, whether coming from the bins or interval parameter, is adjusted so that:

  • The time interval matches a well-known interval (1 minute, 5 minutes, 10 minutes, 15 minutes, 30 minutes, or 1 hour)
  • The time interval does not exceed the maximum number of elements per series (1,500).

Data points are aggregated across time into time slots to deal with potentially large amounts of data. For instance, a day's data is combined into 10-minute time slots. This aggregation is called a rollup and happens in every timeseries query.

The aggregation function used to combine the data is dependent on the function used in the aggregation. For instance, assuming a metric with a host dimension, timeseries min(dt.host.cpu.usage) combines data into time slots using the min function for each time slot and each host, and then aggregates using the min function again across all hosts within each time slot, effectively performing a min of the min as expected.

Example

The rollup parameter can be used if it's necessary to specify a time aggregation function independently from the main aggregation, such as the average of the sums in the following example.

timeseries failed = avg(dt.requests.failed, rollup:sum)

The rollup parameter supports the following functions: min, max, sum, avg, and total.

Timeseries shift parameter allows to compare two different timeframes for a metric series.

The shift parameter shifts the timeframe specified in the query parameters and maps the resulting data points to timestamps from the original timeframe.

A positive argument shifts the timeframe into the future; a negative argument shifts the timeframe into the past.

For example, a timeframe from March 12, 2021 13:00 CET to 15:00 CET and a time shift of -1d (one day into the past) will result in the data points being queried for the timeframe from March 11, 2021 13:00 CET to 15:00 CET. Timestamps in the response will be aligned to the original timeframe. For example, the data point with a timestamp of March 11, 2021 13:30 CET will be returned as March 12, 2021 13:30 CET.

Example

In this example, we compare two timeframes.

timeseries avail=avg(dt.host.disk.avail), by:{dt.entity.host}, from:-24h
| append [
timeseries avail.7d=avg(dt.host.disk.avail), by:{dt.entity.host}, shift:-7d
]