Correlation and join commands
append
Appends a given list of records by the records returned by a sub-query. It doesn't change the fields of the original records and does not remove duplicate records. The behavior of the append
command is similar to the SQL UNION ALL operation.
Syntax
append executionBlock
Basic example
Example: Append records
The following example appends a set of records to an existing set of records.
data record(a = 2),record(a = 7)| append [data record(a = 3),record(a = 7)]
Query result:
2
7
3
7
Practical example
Example: Garbage collection and warning logs
In this example, the append
command is used to combine metric and log data.
The query shows the garbage collection and the number of warning logs for a given host.
timeseries gctime = sum(dt.runtime.jvm.pgi.cpu_time_suspension),filter: dt.entity.host == "HOST-84D65AF93A3C185A"| fields total_gc = arraySum(gctime)| append [fetch logs| filter dt.entity.host == "HOST-84D65AF93A3C185A" and loglevel == "WARN"| summarize warnings = count()]
join
The join command merges the records of two tables and forms a new table by matching records from each table.
DQL offers three kinds of joins that each affect the rows in the resultant table in different ways:
- The inner join, which represents the default behavior of the DQL join, produces an output record whenever a record on the left side matches a record on the right side based on the join condition.
- The leftOuter join returns all the records from the left side and only matching records from the right side. In case of no match, the left side's record will remain.
- The outer join returns matched and unmatched records from either or both sides.
Syntax
join joinTable [, kind] , on: condition, … [, prefix] [, fields: { [field, …] }] [, executionOrder]
The join condition is formed from one or more criteria, where each criterion is an equality condition that links two fields, one from each side.
If the join field name is identical on the left and right sides, the join criterion is defined as follows:
.. | join [...] on:id
If the join field name differs between the left side and right side:
.. | join [...] on:left[dt.entity.host] == right[id]
The join command supports more complex join conditions by chaining multiple criteria in a comma-separated list.
.. | join [...] on:{left[service.id] == right[id], left[cloudType] == right[cloud] , dt.entity.host}
Each criterion is combined using AND boolean logic. This requires all criteria to be met in order to match records from both sides.
Parameters
Parameter
Type
Description
Required
joinTable
The sub-query producing the right side.
required
kind
Specifies the join kind. The default is inner
. Possible options are inner
, outer
and leftOuter
.
optional
on
The join condition.
required
prefix
The prefix to add to the joined fields. If not specified, the default prefix is right
.
optional
fields
A field from the subquery to add to the input.
optional
executionOrder
Defines which side of the join will be executed first. The allowed values are auto
, leftFirst
, rightFirst
. The default value is auto
.
optional
Basic examples
Example 1: Inner join
Returns all known fields from both tables, including the fields mentioned in the join condition. Only the matching records from both tables are returned.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| join [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40),record(key = "d", amount = 50)],on: { key }
Query result:
b
2
b
10
c
4
c
20
c
4
c
40
Example 2: Left outer join
Returns all known fields from both tables, including the fields mentioned in the join condition. All records from the left side plus matching records from the right side are returned.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| join [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40)],kind: leftOuter,on: { key }
Query result:
a
1
b
2
b
10
c
4
c
20
c
4
c
40
Example 3: Outer join
Returns the matching records from the left and right side as combined records and also the non matching records from both sides.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| join [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40),record(key = "d", amount = 50)],on: { key },kind: outer
Query result:
a
1
b
2
b
10
c
4
c
20
c
4
c
40
d
50
Practical example
Example: Ingested business events by host name
The following example shows the number of ingested business events by host name.
The host name is retrieved with the help of the join
command.
fetch bizevents| filter event.provider == "www.easytrade.com"| summarize count = count(),by: { dt.entity.host }| join [ fetch dt.entity.host ],on: { left[dt.entity.host] == right[id] },fields: { host.name = entity.name }| fields host.name, count
Usage and best practices
- Use the smallest dataset on the right side.
- Use the inner join if applicable.
- Select fields in the join subquery.
- Filter the right side's dataset in the join subquery.
Limits
In general, the right side's result size limit is 128 MB. If the limit is exceeded, the DQL query fails. To resolve this issue, reduce the right side's result set by filtering or aggregating the result set on the right side.
If the left side is expected to produce a smaller result size than the right side, set the executionOrder
parameter to leftFirst
. For inner and outer join, the size limit applies then to the left side. In case of an outerLeft join the result set from the left side, if executed first, can be used for additional filtering of the right side's result set to further reduce it.
Example: Changing join execution order
In this query, the left side produces only 100 rows from bizevents
. However, the logs
queried on the right side can potentially generate a large number of rows, which might exceed the size limit of a join table and cause the query to fail. By utilizing the leftFirst
approach, we execute the smaller part first and then use it for the join operation.
fetch bizevents| filter event.provider == "www.easytrade.com"| limit 100| join [ fetch logs ],kind: leftOuter,on: { trace_id },executionOrder: leftFirst,fields: { log.timestamp = timestamp, content }| fieldsAdd timestamp.diff = timestamp - log.timestamp
joinNested
Adds matching results from the sub-query as an array of nested records.
The joinNested
command functions as a variant of the join
command, specifically utilizing the leftOuter
type. Instead of replicating matching records, it incorporates the list of matching records into a newly added field on the left side.
Syntax
joinNested alias = joinTable [, on: condition, …] [, fields: { [field, …] }] [, executionOrder:]
Parameters
Parameter
Type
Description
Required
joinTable
The sub-query producing the right side.
required
on
Records must match this condition in order to be joined.
required
fields
The fields from the sub-query to add to the source.
optional
executionOrder
Defines which side of the join will be executed first. The allowed values are auto
, leftFirst
, rightFirst
. The default value is auto
.
optional
Basic examples
Example 1: Simple nested join
The joinNested
command adds a new nestedRecords
field to every record of the left side of the join. A nestedRecords
field is an array of matching records of the left side of the join.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| joinNested nestedRecords = [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40),record(key = "d", amount = 50)],on: { key }
Query result:
a
1
b
2
b
amount: 10
]c
4
c
amount: 20
, key: c
amount: 40
]Example 2: Nested join with custom fields
The joinNested
command adds only those fields, that are specified in the fields
parameter, into the nestedRecords
field.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| joinNested nestedRecords = [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40),record(key = "d", amount = 50)],on: { key },fields: { amount }
Query result:
a
1
b
2
10
]c
4
20
, amount: 20
]Practical example
Example: Process groups and their services
In the following example, the joinNested
command adds the field services
to every record. That new field contains a list of all service IDs and service names running on that process group.
fetch dt.entity.process_group| joinNested services = [fetch dt.entity.service| fieldsAdd dt.entity.process_group = runs_on[dt.entity.process_group]],on: { left[id] == right[dt.entity.process_group] },fields: { id, name = entity.name }
Limits
The same limits as described for the join
command apply.
lookup
Adds (joins) fields from a subquery (the lookup table
) to the source table by finding a match between a field in the source table (sourceField
) and the lookup table (lookupField
). In case the lookup command finds more than one match in the lookup table, only the top result is retrieved (the top row).
Syntax
lookup lookupTable [, sourceField] [, lookupField] [, prefix] [, fields: { [field, …] }] [, executionOrder]
Parameters
Parameter
Type
Description
Required
sourceField
The matching field on the source table (left side).
required
lookupField
The matching field on the lookup table (right side).
required
prefix
The prefix to add to the joined fields. If not specified, the default prefix is 'lookup'.
optional
fields
A field from the subquery to add to the input.
optional
executionOrder
Defines which side of the join will be executed first. The allowed values are auto
, leftFirst
, rightFirst
. The default value is auto
.
optional
Basic example
Example: Lookup names to records
The following example uses the lookup
command to join names from the subquery to the source table.
data record(key = "a", value = 1),record(key = "b", value = 2),record(key = "c", value = 4)| lookup [data record(key = "b", amount = 10),record(key = "c", amount = 20),record(key = "c", amount = 40),record(key = "d", amount = 50)],sourceField: key,lookupField: key,prefix: "lookuptable."
Query result:
a
1
b
2
b
10
c
4
c
20
Practical examples
Example 1: Lookup cluster name to log
The following example uses the lookup
command to query the Kubernetes cluster name and add it to logs.
fetch logs| filter isNotNull(dt.entity.kubernetes_cluster)| lookup [ fetch dt.entity.kubernetes_cluster ],sourceField: dt.entity.kubernetes_cluster,lookupField: id,fields: { cluster.name = entity.name }
Example 2: Nested field as join condition
You can include nested fields in your queries as sourceField
or lookupField
. Nested fields can only come from inside a record and not from an array.
fetch dt.entity.service_instance| lookup [ fetch dt.entity.host ],sourceField: runs_on[dt.entity.host],lookupField: id,fields: { host.name = entity.name }
Lookup as a function
You can also use lookup
as a function to perform a lookup. In contrast to the lookup
command, the lookup
function nests all included fields as a record.
For further details, go to the lookup function documentation.
Limits
The same limits as described for the join
command apply.