The data command generates sample data during query runtime. It is intended to test and document query scenarios based on a small, exemplary dataset.
Based on input that follows the DQL record datatype, or on a valid JSON string, it returns a tabular list of records.
The data command is a starting command which can be used without a pipeline input.
data [ records ] [, json: json_string ]
| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| records | record expressions | A list of record expressions. Either records or JSON has to be specified. | optional |
| json | string | A string that defines either a single JSON object or a JSON array. Either records or JSON has to be specified. | optional |
In this example, the data command generates three heterogeneous records.
data record(a = "DQL", b = 1, c = 0),record(a = "Dynatrace Query Language", b = 2.9, e = "1"),record()
Query result:
| a | b | c | e |
|---|---|---|---|
| DQL | 1 | 0 | |
| Dynatrace Query Language | 2.9 | | 1 |
| | | | |
The following example generates records based on a JSON input.
The use of triple double quotes (""") is intentional: a string surrounded by triple double quotes respects new lines in multiline strings, and you don't need to escape double or single quotes inside it.
data json:"""[{"amount": 1152,"accountId": 12},{"amount": 709,"accountId": 96}]"""
Query result:
| amount | accountId |
|--------|-----------|
| 1,152 | 12 |
| 709 | 96 |
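The json: parameter also accepts a single JSON object instead of an array. As a minimal sketch reusing the values from the example above, the following query returns one record:
data json:"""{"amount": 1152, "accountId": 12}"""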
The describe command describes the on-read schema extraction definition for a given data object. It returns the specified fields and their respective datatypes. The on-read schema extraction in Grail ensures that every record returned by querying the data of a data object via the fetch command contains at least those fields.
Known fields: Fields specified for a data object and returned by the describe command or by a DQL statement.
Unknown/dynamic fields: Any ingested field not part of the on-read schema extraction definition for a given data object. The field name and datatype are derived at runtime when using a field within a DQL statement.
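As an illustration of dynamic fields, the following sketch assumes an ingested business event carries a field named paymentType that is not part of the on-read schema (the field name is hypothetical). It can still be referenced directly, with its name and datatype derived at runtime:
fetch bizevents
| filter paymentType == "voucher"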
describe dataObject
The following example uses the describe command to retrieve information about all known fields for the bizevents data object.
describe bizevents
Query result:
| Field | Type |
|-------|------|
| dt.system.table | [string] |
| dt.system.environment | [string] |
| dt.system.bucket | [string] |
| dt.system.segment_id | [string] |
| timestamp | [timestamp] |
| dt.system.sampling_ratio | [long] |
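The describe command works the same way for other data objects. For example, the following query lists all known fields of the logs data object (the exact fields returned depend on your environment):
describe logs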
The fetch command loads data from the specified resource.
fetch dataObject [,timeframe:] [,from:] [,to:] [,samplingRatio:] [,scanLimitGBytes:]
Here is an example of the fetch command in its simplest form.
fetch logs
All duration literals valid for the duration data type are applicable for the from: and to: parameters.
You can set the query timeframe in two ways: on the query level, using DQL time literals with the optional from: and to: parameters, or on the UI level, using the timeframe selector in the upper-right corner. The following example uses relative time ranges to query logs ingested between 24 hours and 2 hours ago.
fetch logs, from: -24h, to: -2h
You can also use absolute time ranges with the timeframe parameter.
fetch logs, timeframe: "2021-10-20T00:00:00Z/2021-10-28T12:00:00Z"
Currently, to improve query performance, sampling is applicable for log data within the initial fetch pipeline stage. Sampling happens vertically across the data, selecting a subset of log records according to the optional samplingRatio parameter.
Sampling accepts a fixed set of ratio values. Depending on the specified value, 1/<samplingRatio> of the available raw log records are returned.
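For example, with a samplingRatio of 10, roughly 1/10 of the available raw log records would be returned. This is only a sketch; the selected subset differs between runs:
fetch logs, from: -1h, samplingRatio: 10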
The selected samplingRatio is reported in the query result for each record through the hidden field dt.system.sampling_ratio. To see the hidden field, you need to select it via the fields command.
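For instance, the following sketch makes the applied ratio visible by selecting the hidden field alongside ordinary log fields (timestamp and content are assumed here as examples of regular log fields):
fetch logs, samplingRatio: 100
| fields timestamp, content, dt.system.sampling_ratio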
Sampling is non-deterministic and will return a different result set with each query run. Also, all subsequent commands work on the sampled set of input data, yielding imprecise aggregates.
Furthermore, result sets may vary greatly with different samplingRatio values. This is the nature of sampling, as a high sampling ratio is more likely to leave out low-frequency logs. For example, if you had one ERROR log among millions of INFO logs, filter loglevel == "ERROR" would very likely return an empty result set for any sampled data.
The following example estimates the occurrences of ERROR logs across the last 7 days.
- The fetch command's samplingRatio parameter defines the sampling ratio.
- The summarize command, combined with the countIf function, counts only error logs.
Multiplying the count by takeAny(dt.system.sampling_ratio) extrapolates the sampled count to an estimate for the full data set.
fetch logs, from: -7d, samplingRatio: 100
| summarize c = countIf(loglevel == "ERROR") * takeAny(dt.system.sampling_ratio)
The optional scanLimitGBytes parameter controls the amount of uncompressed data to be read by the fetch stage. The default value is 500 GB unless specified otherwise. If set to -1, all data available in the query time range is analyzed.
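For example, to lift the read limit and analyze all data in the query time range, you could set the parameter to -1 (a sketch based on the description above):
fetch logs, from: -7d, scanLimitGBytes: -1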