fetch

Loads data from the specified resource.

Syntax

fetch dataObject [, from:<from>] [, to:<to>] [, timeframe:<timeframe>] [, samplingRatio:<samplingRatio>] [, scanLimitGBytes:<scanLimitGBytes>]

Parameters

| Parameter | Type | Description | Required |
|---|---|---|---|
| dataObject | data object | The data object to fetch data for. | required |
| from | timestamp, long, duration, string | The start of the timeframe (if no explicit timeframe is specified). A duration is interpreted as an offset from now(). | optional |
| to | timestamp, long, duration, string | The end of the timeframe (if no explicit timeframe is specified). A duration is interpreted as an offset from now(). | optional |
| timeframe | timeframe, string | The desired timeframe. If not specified, the global timeframe is used. | optional |
| samplingRatio | double, long | The desired sampling ratio. | optional |
| scanLimitGBytes | long | The maximum number of gigabytes of uncompressed data to scan while loading. | optional |
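
These parameters can be combined in a single fetch statement. For instance, the following sketch (with illustrative values) loads logs from the last 24 hours, sampled at a ratio of 100, with the scan limit lowered to 50 GB:

fetch logs, from:-24h, samplingRatio:100, scanLimitGBytes:50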

Basic examples

Example 1: Fetch data from logs

Here is an example of the fetch command in its simplest form.

fetch logs

Query timeframes

All duration literals valid for the duration data type are applicable for the from and to parameters.

Example 2: Relative timeframe with defined start time

This example uses DQL's time literals to define a relative time range that queries logs from the last 25 minutes. The duration is automatically interpreted as an offset from now().

fetch logs, from:-25m

Example 3: Relative timeframe with defined start and end time

Using both the from and to parameters lets you adjust the start and end timestamps of the query. Each duration is automatically interpreted as an offset from now().

fetch logs, from:-24h, to:-2h

Example 4: Absolute time range

You can also use absolute time ranges with the timeframe parameter.

fetch logs, timeframe:"2021-10-20T00:00:00Z/2021-10-28T12:00:00Z"

Sampling

Currently, to improve query performance, sampling is applicable to log data within the initial fetch pipeline stage. Sampling happens vertically across the data, selecting a subset of log records according to the optional samplingRatio parameter.

The applicable values for the samplingRatio parameter are:

  • 1: Default value, resulting in no sampling being applied.
  • 10
  • 100
  • 1000
  • 10000

Depending on the specified value, 1/<samplingRatio> of the available raw log records are returned. For example, samplingRatio:100 returns roughly one in every 100 records.

The selected samplingRatio is reported in the query result for each record through the hidden field dt.system.sampling_ratio. To see a hidden field, you need to select it explicitly, for example via the fields command.
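
For instance, a query along the following lines surfaces the sampling ratio alongside each record (a minimal sketch; timestamp and content are assumed here as typical log fields):

fetch logs, samplingRatio:100
| fields timestamp, content, dt.system.sampling_ratio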

Sampling in practice

Sampling is non-deterministic and returns a different result set with each query run. Also, all subsequent commands operate on the sampled set of input data, yielding imprecise aggregates.

Furthermore, result sets may vary greatly with different samplingRatio values. This is the nature of sampling, as a high sampling ratio is more likely to leave out low-frequency logs. For example, if you had one ERROR log among millions of INFO logs, filter loglevel == "ERROR" would very likely return an empty result set for any sampled data.
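
To see this in practice, the following sketch (reusing the loglevel field from the scenario above) would very likely return an empty result on sampled data even though a matching record exists:

fetch logs, samplingRatio:10000
| filter loglevel == "ERROR"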

Example 5: Estimate number of occurrences in logs

In this example, the number of ERROR logs over the last 7 days is estimated. As a final step, the countIf() aggregation result is extrapolated by multiplying it by the selected sampling ratio and overriding the original value.

fetch logs, from:now()-7d, samplingRatio:100
| summarize c = countIf(loglevel == "ERROR")
| fieldsAdd c = c*100

Read data limit

The optional scanLimitGBytes parameter controls the amount of uncompressed data read by the fetch stage. The default value is 500 GB. If set to -1, all data available in the query time range is analyzed.
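
For example, the following sketch removes the read limit so that all data in the (illustrative) 30-day timeframe is analyzed:

fetch logs, from:-30d, scanLimitGBytes:-1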