These examples illustrate how to build powerful and flexible health dashboards by using DQL to slice and dice all Davis-reported problems and events.
Davis problems represent results that originate from Davis root-cause analysis runs. In Grail, Davis problems and their updates are stored as Grail events.
Davis events represent raw events that originate from various anomaly detectors within Dynatrace or within OneAgent. Examples are OneAgent-detected CPU saturation events or high garbage collection time events.
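Both record types can be fetched directly with DQL. As a quick orientation, the following sketch previews a few raw records from each bucket; the field selection and record limits are illustrative only and reuse fields that appear in the examples below.

// preview a few problem records (fields chosen for illustration)
fetch dt.davis.problems
| fields timestamp, display_id, event.status
| limit 3

// preview a few raw Davis events
fetch dt.davis.events
| fields timestamp, event.kind, event.type
| limit 3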
This example counts all problems reported within the last 24 hours. The dt.davis.problems field event.id holds the unique problem ID, which is stable across all refreshes and updates that Davis reports for the same problem.

fetch dt.davis.problems, from:now()-24h, to:now()
| summarize {problemCount = countDistinct(event.id)}
Query result
problemCount
415
This example counts all currently active problems. The number of unique problems is counted with countDistinct on the dt.davis.problems field event.id, which contains the problem ID, keeping only problems whose event.status is ACTIVE. Because the status is refreshed during a problem's lifecycle, the DQL command takeLast can be used to receive the last state of the event.status field, as shown in the sketch after the query result.

fetch dt.davis.problems
| filter event.status == "ACTIVE"
| summarize {activeProblems = countDistinct(event.id)}
Query result
activeProblems
15
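Note that the filter above matches every problem that reported an ACTIVE update within the queried timeframe. A stricter variant first reduces each problem to its last reported status with takeLast, as mentioned above; a minimal sketch, with the aliases lastStatus and activeProblems as assumptions:

fetch dt.davis.problems
| sort timestamp
// reduce each problem to its most recent status update
| summarize by:{event.id}, {lastStatus = takeLast(event.status)}
| filter lastStatus == "ACTIVE"
| summarize activeProblems = count()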
This example charts the number of reported problems over the last 7 days.

fetch dt.davis.problems, from:now()-7d
| makeTimeseries count(), time:timestamp
Query result

timeframe | interval | count
start: 22/11/2023, 11:00, end: 29/11/2023, 12:00 | 6 h | 1.000, 4.000, 1.000, null, 1.000, 3.000, null, null, 3.000, 4.000
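If a fixed resolution is preferred over the automatically chosen one, makeTimeseries also accepts an interval parameter; a sketch assuming 1-hour buckets:

fetch dt.davis.problems, from:now()-7d
// one data point per hour instead of the auto-selected interval
| makeTimeseries count(), interval:1h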
This example lists the top 10 entities affected by the most problems. The affected_entity_ids field of dt.davis.problems is expanded so that distinct problems can be counted per affected entity.

fetch dt.davis.problems
| expand affected_entity_ids
| summarize by:{affected_entity_ids}, count = countDistinct(display_id)
| sort count, direction:"descending"
| limit 10
Query result

affected_entity_ids | count
HOST-A9449CACDE12B2BF | 10
SERVICE-5624DD59D74FF453 | 5
PROCESS_GROUP_INSTANCE-3184C659684130C7 | 3
This example performs a join with entity attributes to filter all problems affecting a given host name. The affected_entity_ids field of dt.davis.problems is expanded, and each entity ID is looked up in dt.entity.host. The looked-up fields receive the prefix host., resulting in host.id and host.name, and the result is filtered by the host name myhost.

fetch dt.davis.problems
| expand affected_entity_ids
| lookup sourceField:affected_entity_ids, lookupField:id, prefix:"host.", [fetch dt.entity.host | fields id, name = entity.name]
| filter host.name == "myhost"
| limit 3
Query result

timestamp | affected_entity_ids | host.id | host.name | display_id
5/31/2023, 1:31:39 PM | HOST-27D70086952122CF | HOST-27D70086952122CF | myhost | P-23054243
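To discover which host names are available for this filter, the entity view used in the lookup subquery can also be queried on its own; a minimal sketch reusing the same two fields:

// list host IDs and names from the entity view
fetch dt.entity.host
| fields id, name = entity.name
| limit 5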
This example fetches a single problem from dt.davis.problems by its display ID, the human-readable problem identifier.

fetch dt.davis.problems
| filter display_id == "P-23053506"
Query result

timestamp | affected_entity_ids | display_id
5/31/2023, 1:31:39 PM | HOST-27D70086952122CF | P-23053506
Fetch all active problems that were not marked as duplicates. Because the duplicate flag appears during the lifecycle of a problem, the update events need to be sorted by timestamp and then summarized by taking the last state of the duplicate and status fields. Only after sorting by timestamp is it possible to correctly apply the filter.

fetch dt.davis.problems
| sort timestamp
| summarize by:{display_id}, {status = takeLast(event.status), id = takeLast(event.id), duplicate = takeLast(dt.davis.is_duplicate)}
| filter status == "ACTIVE" and duplicate == false
Query result

display_id | status | id | duplicate
P-230910385 | ACTIVE | P-230910385 | false
This example shows how to calculate the mean time needed to resolve all reported problems by charting the delta between the start and end of each problem over time.

fetch dt.davis.problems, from:now()-7d
| filter event.status == "CLOSED"
| filter dt.davis.is_frequent_event == false and dt.davis.is_duplicate == false and maintenance.is_under_maintenance == false
| makeTimeseries `AVG Problem duration in hours` = avg(toLong(resolved_problem_duration)/3600000000000.0), time:event.end
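Because the division by 3600000000000.0 converts nanoseconds to hours, the same expression can also yield a single mean-time-to-resolve number instead of a chart; a sketch under the same filters, with the alias `MTTR in hours` as an assumption:

fetch dt.davis.problems, from:now()-7d
| filter event.status == "CLOSED"
| filter dt.davis.is_frequent_event == false and dt.davis.is_duplicate == false and maintenance.is_under_maintenance == false
// nanoseconds to hours: 3600000000000 ns = 1 h
| summarize `MTTR in hours` = avg(toLong(resolved_problem_duration)/3600000000000.0)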
This example shows how to query data for a chart of the concurrently open problems over time by filling all the resolution gaps with the spread command. The coalesce function treats still-open problems, which have no event.end yet, as extending to the current time.

fetch dt.davis.problems
| makeTimeseries count = count(), spread: timeframe(from: event.start, to: coalesce(event.end, now()))
This example counts OneAgent-detected high CPU and high memory events from dt.davis.events for the last 7 days, grouped into 60-minute intervals.

fetch dt.davis.events, from:now()-7d, to:now()
| filter event.kind == "DAVIS_EVENT"
| filter event.type == "OSI_HIGH_CPU" or event.type == "OSI_HIGH_MEMORY"
| summarize count = count(), by: {`60m interval` = bin(timestamp, 60m)}
Query result

60m interval | count
5/25/2023, 3:00 PM | 146
5/25/2023, 4:00 PM | 312
5/25/2023, 5:00 PM | 201
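For charting rather than a tabular result, the same aggregation can be expressed with makeTimeseries instead of the bin-based summarize; a sketch assuming a 1-hour interval:

fetch dt.davis.events, from:now()-7d
| filter event.kind == "DAVIS_EVENT"
| filter event.type == "OSI_HIGH_CPU" or event.type == "OSI_HIGH_MEMORY"
| makeTimeseries count = count(), interval:1h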