Davis DQL examples
powered by Grail
These examples help illustrate how to build powerful and flexible health dashboards by using DQL to slice and dice all Davis reported problems and events.
Basic Davis problem examples
Davis problems represent results that originate from the Davis root-cause analysis runs. Within Grail Davis problems along with their updates are stored as Grail events.
- Problem example 1
Count the total number of problems in the last 24 hours. - Problem example 2
Count the current number of active problems. - Problem example 3
Chart the number of problems in the last 7 days to identify a trend within your environment stability. - Problem example 4
Identify the top 10 problem-affected entities within your environment. - Problem example 5
Join entity attributes with detected problems and apply a name filter. - Problem example 6
Load the last state of a given problem. - Problem example 7
Load all active problems and exclude all those that are marked as duplicates. - Problem example 8
Mean time to resolve for problems over time
Basic Davis event examples
Davis events represent raw events that originate from various anomaly detectors within Dynatrace or within the OneAgent. Examples here are OneAgent detected CPU saturation events or High Garbage Collection Time events.
- Davis event example 1
Chart the number of process restart events in the last 7 days.
Example 1: Count the total number of problems in the last 24 hours
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. - As Davis problems also include regular refresh events, the summarize DQL command is used to group by unique
event.id
before counting the total number. - The
event.id
holds the unique problem id that is kept stable across all refreshes and updates that Davis reports for the same problem.
1fetch events, from:now()-24h, to:now()2| filter event.kind == "DAVIS_PROBLEM"3| summarize {problemCount = countDistinct(event.id)}
Query result
problemCount |
---|
415 |
Example 2: Count the current number of active problems
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. As Davis problems also include regular refresh events, the query first sorts all refresh events by their timestamp.
- The result is grouped by the unique
event.id
field that contains the problem id. - The DQL command
takeLast
of the fieldevent.status
receives the last state with the purpose to filter out all those problems that are not in stateACTIVE
anymore.
1fetch events2| filter event.kind == "DAVIS_PROBLEM"3| sort timestamp, direction:"ascending"4| summarize {event.status = takeLast(event.status)}, by:{ event.id }5| filter event.status == "ACTIVE"6| summarize {activeEvents = countDistinct(event.id)}
Query result
activeProblems |
---|
15 |
Example 3: Chart the number of problems in the last 7 days to identify a trend within your environment stability
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. The count is done in a resolution of 6 hour bins.
1fetch events, from:now()-7d2| filter event.kind == "DAVIS_PROBLEM"3| summarize {problem = takeMax(4 record(timestamp,5 resolved_problem_duration,6 event.end,7 dt.davis.is_duplicate,8 event.status,9 maintenance.is_under_maintenance10 ,dt.davis.is_frequent_event)) },11by:{display_id}12| fieldsFlatten problem13| filter problem.event.status == "CLOSED"14| filter problem.dt.davis.is_frequent_event == false and problem.dt.davis.is_duplicate == false and problem.maintenance.is_under_maintenance == false15| makeTimeseries count(), interval:6h, time:problem.timestamp
Query result
timeframe | interval | count |
---|---|---|
start: 22/11/2023, 11:00 end: 29/11/2023, 12:00 | 6 h | 1.000, 4.000, 1.000, null, 1.000, 3.000, null, null, 3.000, 4.000 |
Example 4: Identify the top 10 problem-affected entities within your environment
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. The arrays field containing all affected entity ids is expanded into individual fields.
Count all unique problems grouped by the affected entity ids.
Sort by that problem count.
Return the top 10 entity ids.
1fetch events2| filter event.kind == "DAVIS_PROBLEM"3| expand affected_entity_ids4| summarize by:{affected_entity_ids}, count = countDistinct(display_id)5| sort count, direction:"descending"6| limit 10
Query result
affected_entity_ids | count |
---|---|
HOST-A9449CACDE12B2BF | 10 |
SERVICE-5624DD59D74FF453 | 5 |
PROCESS_GROUP_INSTANCE-3184C659684130C7 | 3 |
Example 5: Fetch all problems for a host with the name "myhost"
A join with entity attributes is performed with the goal to filter all problems with a given host name.
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. The arrays field containing all affected entity ids is expanded into individual fields.
- A topology and entity lookup is done on the
affected_entity_ids
field. - The resulting records are enriched with two entity fields that are prefixed with
host.
, namelyhost.id
andhost.name
. - A filter for the host name
myhost
is applied.
1fetch events2| filter event.kind == "DAVIS_PROBLEM"3| expand affected_entity_ids4| lookup sourceField:affected_entity_ids, lookupField:id, prefix:"host.", [5fetch dt.entity.host | fields id, name = entity.name6]7| filter host.name == "myhost"8| limit 3
Query result
timestamp | affected_entity_ids | host.id | host.name | display_id |
---|---|---|---|---|
5/31/2023, 1:31:39 PM | HOST-27D70086952122CF | HOST-27D70086952122CF | myhost | P-23054243 |
Example 6: Load the last state of a given problem
A join with entity attributes is performed with the goal to filter all problems with a given host name.
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. Filter by the unique display identifier of the problem.
Sort all problem update events by their timestamp in descending order.
Return the last one.
1fetch events2| filter event.kind == "DAVIS_PROBLEM"3| filter display_id == "P-23053506"4| sort timestamp desc5| limit 1
Query result
timestamp | affected_entity_ids | host.id | host.name | display_id |
---|---|---|---|---|
5/31/2023, 1:31:39 PM | HOST-27D70086952122CF | HOST-27D70086952122CF | myhost | P-23053506 |
Example 7: Load all active problems and exclude all those that are marked as duplicates
Fetch all active problems that were not marked as duplicates. As the duplicate flag appears during the lifecycle of a problem, the update events need to be sorted by timestamp and then summarized by taking the last state of the duplicate and status fields. Only after sorting them by timestamp its possible to correctly apply the filter.
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. Sort by timestamp.
Take the last status and duplicate field states and group by the unique display_id field.
Filter out problems that are marked as duplicates.
Filter out problems that were closed already.
1fetch events2| filter event.kind == "DAVIS_PROBLEM"3| sort timestamp desc4| summarize {status = takeFirst(event.status), id = takeFirst(display_id), duplicate = takeFirst(dt.davis.is_duplicate)}, by:{display_id}5| filter duplicate == false6| filter status == "ACTIVE"
Query result
display_id | status | id | duplicate |
---|---|---|---|
P-230910385 | ACTIVE | P-230910385 | false |
Example 8: Mean time to resolve for problems over time
This example shows how to calculate the mean time necessary to resolve all the reported problems by summarizing the delta between start and end of each problem over time.
- Davis detected problems are distinguished by the
event.kind=”DAVIS_PROBLEM”
field. Collects the last value of several problem fields such as is_duplicate or status.
Flattens the problem fields into the record.
Filter out all the closed problems.
Filter out all the frequent and duplicate problems.
Converts the values into a time series of averages over time.
1fetch events, from:now()-7d2| filter event.kind == "DAVIS_PROBLEM"3| summarize {problem=takeMax(record(timestamp,resolved_problem_duration,event.end,dt.davis.is_duplicate,event.status,maintenance.is_under_maintenance,dt.davis.is_frequent_event)) }, by:{display_id}4| fieldsFlatten problem5| filter problem.event.status == "CLOSED"6| filter problem.dt.davis.is_frequent_event == false and problem.dt.davis.is_duplicate == false and problem.maintenance.is_under_maintenance == false7| makeTimeseries `AVG Problem duration in hours` = avg(toLong(problem.resolved_problem_duration)/3600000000000.0), time:problem.event.end
Davis event example 1: Chart the number of CPU saturation and high-memory events in the last 7 days
- Davis raw events are distinguished by the
event.kind=”DAVIS_EVENT”
field. The count is done in a resolution of 60 minutes bins.
1fetch events, from:now()-7d, to:now()2| filter event.kind == "DAVIS_EVENT"3| filter event.type == "OSI_HIGH_CPU" or event.type == "OSI_HIGH_MEMORY"4| summarize count = count(), by: {`60m interval` = bin(timestamp, 60m)}
Query result
60min interval | count |
---|---|
5/25/2023, 3:00 PM | 146 |
5/25/2023, 4:00 PM | 312 |
5/25/2023, 5:00 PM | 201 |