Data source commands
data
The data command generates sample data during query runtime. It is intended for testing and documenting query scenarios based on a small, exemplary dataset.
- Based on input that follows the DQL record datatype, or on a valid JSON string, the command returns a tabular list of records.
- The data command is a starting command and can be used without a pipeline input.
Syntax
data [ records ] [, json: json_string ]
Parameters

| Parameter | Type | Description | Required |
|---|---|---|---|
| records | record expressions | A list of record expressions. Either records or JSON has to be specified. | optional |
| json | string | A string that defines either a single JSON object or a JSON array. Either records or JSON has to be specified. | optional |
Basic examples
Example 1: Create records
In this example, the data command generates three heterogeneous records.
data record(a = "DQL", b = 1, c = 0), record(a = "Dynatrace Query Language", b = 2.9, e = "1"), record()
Query result:

| a | b | c | e |
|---|---|---|---|
| DQL | 1 | 0 | |
| Dynatrace Query Language | 2.9 | | 1 |
| | | | |
Example 2: Create records from JSON
The following example generates records based on a JSON input.
The use of triple double quotes (""") is intentional: a multiline string surrounded by triple double quotes respects new lines, and you don't need to escape double or single quotes inside the string.
data json:"""[{"amount": 1152,"accountId": 12},{"amount": 709,"accountId": 96}]"""
Query result:

| amount | accountId |
|---|---|
| 1,152 | 12 |
| 709 | 96 |
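Because triple-quoted strings respect new lines, the same JSON input can also be written across multiple lines; a variant of the query above:
data json:"""[
  {"amount": 1152, "accountId": 12},
  {"amount": 709, "accountId": 96}
]"""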
describe
Describes the on-read schema extraction definition for a given data object. It returns the specified fields and their respective datatypes. The on-read schema extraction in Grail ensures that every record returned by querying the data of a data object via the fetch command will contain at least those fields.
- Known fields: Fields specified for a data object and returned by the describe command or by a DQL statement.
- Unknown/dynamic fields: Any ingested field that is not part of the on-read schema extraction definition for a given data object. The field name and datatype are derived at runtime when the field is used within a DQL statement.
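For instance, an ingested field that is not part of the schema can still be referenced directly in a query; its name and datatype are resolved at runtime. A minimal sketch, assuming a hypothetical ingested attribute shop.cart_size on business events:
fetch bizevents
| fields timestamp, shop.cart_size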
Syntax
describe dataObject
Basic example
Example: Describe business events
The following example uses the describe command to retrieve information about all known fields for the bizevents data object.
describe bizevents
Query result:

| Field | Type |
|---|---|
| dt.system.table | [string] |
| dt.system.environment | [string] |
| dt.system.bucket | [string] |
| dt.system.segment_id | [string] |
| timestamp | [timestamp] |
| dt.system.sampling_ratio | [long] |
fetch
Loads data from the specified resource.
Syntax
fetch dataObject [,timeframe:] [,from:] [,to:] [,samplingRatio:] [,scanLimitGBytes:]
Basic examples
Example 1: Query logs
Here is an example of the fetch command in its simplest form.
fetch logs
Relative query timeframes
All duration literals valid for the duration data type are applicable to the from: and to: parameters.
The following example uses a DQL time literal to query logs from the last 25 minutes:
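fetch logs, from: -25m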
You can also control the query timeframe at the UI level, in the timeframe selector in the upper-right corner:
- To choose one of the existing values (for example, last 72 hours or last 365 days), select Presets.
- To create your own timeframe value, select Custom.
- To select the last 2 hours, select Recent.
Example 2: Query relative timeframe
This example with relative time ranges uses DQL time literals to set the timeframe for querying logs via the optional from and to parameters.
fetch logs, from: -24h, to: -2h
Example 3: Query with absolute timeframe
You can also use absolute time ranges with the timeframe parameter.
fetch logs, timeframe: "2021-10-20T00:00:00Z/2021-10-28T12:00:00Z"
Sampling
Currently, to improve query performance, sampling is applicable to log data within the initial fetch pipeline stage. Sampling happens vertically across the data, resulting in the selection of a subset of log records according to the specified, optional samplingRatio parameter.
The applicable value ranges for sampling are:
- 1: Default value, resulting in no applied sampling.
- 10
- 100
- 1000
- 10000
Depending on the specified value, 1/<samplingRatio> of the available raw log records are returned.
The selected samplingRatio is reported in the query result for each record through dt.system.sampling_ratio, which is a hidden field. To see the hidden field, you need to select it via the fields command.
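For example, the following sketch selects the hidden field so that the applied sampling ratio appears alongside each sampled record:
fetch logs, samplingRatio: 100
| fields timestamp, content, dt.system.sampling_ratio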
Sampling is non-deterministic and will return a different result set with each query run. Also, all subsequent commands will work on the sampled set of input data, yielding imprecise aggregates.
Furthermore, result sets may vary greatly with different samplingRatio values. This is the nature of sampling, as a high sampling ratio is more likely to leave out low-frequency logs. For example, if you had one ERROR log among millions of INFO logs, the filter loglevel == "ERROR" would very likely return an empty result set for any sampled data.
Example 4: Sampling ratio
The following example estimates the occurrences of ERROR logs across the last 7 days.
- The fetch command's samplingRatio parameter defines the sampling ratio.
- The summarize command, combined with the countIf function, counts only error logs.
- You need to multiply the count by the sampling ratio to get an estimation.
fetch logs, from: -7d, samplingRatio: 100
| summarize c = countIf(loglevel == "ERROR") * takeAny(dt.system.sampling_ratio)
Read data limit
The optional scanLimitGBytes parameter controls the amount of uncompressed data to be read by the fetch stage. The default value is 500 GB unless specified otherwise. If set to -1, all data available in the query time range is analyzed.
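A minimal sketch that lifts the read limit for a single query, analyzing all data in the selected time range:
fetch logs, scanLimitGBytes: -1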