Service-level objective examples

Service-Level Objectives offers a set of service-level objective (SLO) examples that you can use to create your service-level objectives using DQL.

We also offer a set of pre-configured SLO templates. For more information on the SLO templates, see Service-level objective templates.

The following SLO configuration examples illustrate some of the possibilities for service-level indicators (SLIs).

Log-pattern based SLO

This SLI is based on the proportion of log lines with log level INFO or WARNING relative to all log lines. The query flags those lines as `failed` and reports the remaining share as the SLI.

Details of the example

Data source: logs
Entity scope: app-id

SLI DQL query:

fetch logs, scanLimitGBytes: -1
| fieldsAdd failed = coalesce(if(loglevel == "INFO" OR loglevel == "WARNING", 1), 0)
| makeTimeseries {failed = sum(failed), total = count()}, by: {dt.app.id}
| fieldsAdd sli = 100 - ((toDouble(failed[]) / toDouble(total[])) * 100)
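
The arithmetic of the last query line can be checked offline. The following Python sketch is not part of the product: `log_pattern_sli` and the sample data are hypothetical names used only for illustration. It counts the lines whose level is INFO or WARNING, divides by the total number of lines, and subtracts the resulting percentage from 100.

```python
def log_pattern_sli(loglevels):
    """Return 100 minus the percentage of INFO/WARNING lines, or None if empty."""
    total = len(loglevels)
    if total == 0:
        return None  # no log lines in the bucket, so the SLI is undefined
    matched = sum(1 for level in loglevels if level in ("INFO", "WARNING"))
    return 100 - (matched / total) * 100


# Hypothetical bucket: 2 of 8 lines are INFO or WARNING.
sample = ["INFO", "ERROR", "DEBUG", "WARNING", "ERROR", "DEBUG", "ERROR", "DEBUG"]
print(log_pattern_sli(sample))
```

With the sample above, a quarter of the lines match, so the sketch reports an SLI of 75.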

Performance by service

This SLI measures the performance of a selected service as the proportion of requests that are served within 150 ms, based on spans.

Details of the example

Data source: spans/traces, responsetimes/duration
Entity scope: services

SLI DQL query:

fetch spans
| filter dt.entity.service == "SERVICE-53B3E0D705DB0194"
| makeTimeseries {total = count(), good = countIf(duration <= 150ms)}, by: {name = entityName(dt.entity.service)}
| fieldsAdd sli = 100 * (good[]/total[])
| fieldsRemove total, good
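
To see how the good/total ratio behaves, here is a minimal Python sketch of the same calculation for a single time bucket. The function name `latency_sli` and the sample durations are hypothetical; the 150 ms threshold is taken from the query above.

```python
def latency_sli(durations_ms, threshold_ms=150):
    """Percentage of requests with duration <= threshold, or None if there are none."""
    total = len(durations_ms)
    if total == 0:
        return None  # no spans in the bucket
    good = sum(1 for duration in durations_ms if duration <= threshold_ms)
    return 100 * good / total


# Hypothetical span durations in milliseconds: three of four are within 150 ms.
print(latency_sli([120, 150, 151, 90]))
```

A request that takes exactly 150 ms counts as good here because the comparison is inclusive, as in the query.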

Performance by service endpoint

This SLI measures a selected endpoint's latency (performance) as the proportion of service requests that are served faster than a defined response time in milliseconds, based on spans.

Details of the example

Data source: spans/traces, responsetimes/duration
Entity scope: services, endpoint

SLI DQL query:

fetch spans
| filter endpoint.name == "/Booking"
| makeTimeseries {total = count(), good = countIf(duration < 150ms)}, by:{endpoint.name}
| fieldsAdd sli = 100 * (good[]/total[])
| fieldsRemove total, good

SLO for release validations: `checkoutservice`

This SLI measures the proportion of successful guardian release validations.

Details of the example

Data source: bizevents (guardian validations)
Entity scope: guardians

SLI DQL query:

fetch bizevents
  | filter event.type == "guardian.validation.finished"
  | parse `validation.summary`, """JSON{ INT: "pass",INT: "warning", INT: "fail", INT: "error", INT: "info" }:result"""
  | fieldsAdd all = result[pass] + result[warning] + result[fail] + result[error] + result[info]
  | fieldsAdd nok = result[fail] + result[error] + result[info]
  | makeTimeseries {all = sum(all), nok = sum(nok)}, by: {guardian.name}, interval: 10min
  | filter in(guardian.name, "Three golden signals (checkoutservice)")
  | fieldsAdd sli = 100 * ((all[] - nok[]) / all[])
  | fieldsRemove all, nok
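
The following Python sketch walks through the same steps on plain JSON strings: parse each validation summary, add up all objective results, add up the ones the query counts as not OK (fail, error, and info), and derive the SLI. The function `validation_sli` and the sample summaries are hypothetical; the key names are the ones used in the parse pattern above.

```python
import json

RESULT_KEYS = ("pass", "warning", "fail", "error", "info")


def validation_sli(summaries):
    """Return 100 * (all - nok) / all over the given JSON summaries, or None if empty."""
    all_count = 0
    nok_count = 0
    for raw in summaries:
        result = json.loads(raw)
        counts = {key: int(result.get(key, 0)) for key in RESULT_KEYS}
        all_count += sum(counts.values())
        nok_count += counts["fail"] + counts["error"] + counts["info"]
    if all_count == 0:
        return None  # nothing was validated in this interval
    return 100 * (all_count - nok_count) / all_count


# Hypothetical summaries of two finished validations.
sample = [
    '{"pass": 4, "warning": 0, "fail": 0, "error": 0, "info": 0}',
    '{"pass": 3, "warning": 1, "fail": 1, "error": 0, "info": 1}',
]
print(validation_sli(sample))
```

In the sample, 10 objectives were evaluated and 2 of them count as not OK, which gives an SLI of 80.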

SLO for synthetic browser availability considering business hours

This SLI measures the proportion of successful browser monitor tests over time, only considering business hours (Monday–Friday, 9 AM–5 PM UTC+2).

Details of the example

Data source: metrics (timeseries); user input: timezone, business hours, work days
Entity scope: Synthetic browser test, Synthetic location

SLI DQL query:

timeseries {sli = avg(dt.synthetic.browser.availability), timestamp=start()}, by:{dt.entity.synthetic_test,dt.entity.synthetic_location}, interval:1min
| fieldsAdd entityName = entityName(dt.entity.synthetic_test)
| fieldsAdd locationName = entityName(dt.entity.synthetic_location)
| filter in(entityName, "Dynatrace website")
| fieldsAdd sli=if(getDayOfWeek(timestamp[])<6, sli[])
| fieldsAdd sli=if(getHour(timestamp[],timezone:"Europe/Bucharest")>=9, sli[])
| fieldsAdd sli=if(getHour(timestamp[],timezone:"Europe/Bucharest")<17, sli[])
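
The masking idea behind the three `if` steps can be sketched in Python: data points outside the business-hours window are replaced by an empty value so that they do not count toward the SLO. This is only an illustration under stated assumptions. The names `in_business_hours` and `mask_outside_business_hours` are hypothetical, the time zone is simplified to a fixed UTC+2 offset without daylight saving time, the weekday is evaluated in local time, and the window is read as 09:00 up to, but not including, 17:00.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+2 offset, no daylight saving time.
BUSINESS_TZ = timezone(timedelta(hours=2))


def in_business_hours(ts_utc, start_hour=9, end_hour=17):
    """True for Monday to Friday, start_hour <= local hour < end_hour."""
    local = ts_utc.astimezone(BUSINESS_TZ)
    return local.isoweekday() < 6 and start_hour <= local.hour < end_hour


def mask_outside_business_hours(points):
    """points: list of (UTC datetime, value). Values outside the window become None."""
    return [(ts, value if in_business_hours(ts) else None) for ts, value in points]


# Hypothetical availability samples; 2024-01-08 is a Monday, 2024-01-06 a Saturday.
points = [
    (datetime(2024, 1, 8, 7, 0, tzinfo=timezone.utc), 100.0),   # 09:00 local
    (datetime(2024, 1, 8, 15, 0, tzinfo=timezone.utc), 100.0),  # 17:00 local
    (datetime(2024, 1, 6, 10, 0, tzinfo=timezone.utc), 0.0),    # Saturday
]
masked = mask_outside_business_hours(points)
print([value for _, value in masked])
```

Only the first sample falls inside the window, so the other two values are masked.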

Service performance for services with a certain tag

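This SLI measures the proportion of time slots in which the average response time of a service is at or below a threshold (500 ms, written as 1000 * 500 in the query, which assumes the metric is reported in microseconds), filtered for services with a particular tag.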

Details of the example

Data source: metrics (timeseries); tags
Entity scope: services

SLI DQL query:

timeseries total=avg(dt.service.request.response_time), default:0, by: { dt.entity.service }
| fieldsAdd tags=entityAttr(dt.entity.service, "tags")
| filter in(tags, "[Environment]DT_RELEASE_PRODUCT:easytravel")
| fieldsAdd high=iCollectArray(if(total[]> (1000 * 500), total[]))
| fieldsAdd low=iCollectArray(if(total[]<= (1000 * 500), total[]))
| fieldsAdd highRespTimes=iCollectArray(if(isNull(high[]),0,else:1))
| fieldsAdd lowRespTimes=iCollectArray(if(isNull(low[]),0,else:1))
| fieldsAdd entityName = entityName(dt.entity.service)
| fieldsAdd sli=100*(lowRespTimes[]/(lowRespTimes[]+highRespTimes[]))
| fieldsRemove total, high, low, highRespTimes, lowRespTimes, tags
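
The chain of `iCollectArray` steps classifies every time slot as either low or high and turns that into a value of 100 or 0. A minimal Python sketch of that classification follows. The function name `response_time_sli` and the sample values are hypothetical, and the values are assumed to use the same unit as the threshold `1000 * 500` in the query.

```python
def response_time_sli(avg_response_times, threshold=1000 * 500):
    """Per time slot: 100 if the average response time is <= threshold, else 0."""
    sli = []
    for value in avg_response_times:
        if value is None:
            sli.append(None)  # no data for this slot
            continue
        low = 1 if value <= threshold else 0
        high = 1 if value > threshold else 0
        sli.append(100 * low / (low + high))
    return sli


# Hypothetical per-slot averages: two slots at or below the threshold, one above.
print(response_time_sli([200_000, 500_000, 750_000]))
```

Because each slot is either fully good or fully bad, the SLO status is effectively the share of good time slots over the evaluation period.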

Service availability for critical services (tagged Gold)

This SLI measures the proportion of successful service requests over time, considering only gold-tier tagged services.

Details of the example

Data source: metrics (timeseries); tag
Entity scope: services

SLI DQL query:

timeseries { total=sum(dt.service.request.count) ,failures=sum(dt.service.request.failure_count) }, by: { dt.entity.service }
| fieldsAdd tags=entityAttr(dt.entity.service, "tags")
| filter in(tags, "criticality:Gold")
| fieldsAdd entityName = entityName(dt.entity.service)
| fieldsAdd sli = 100 * ((total[] - failures[]) / total[])
| fieldsRemove total, failures, tags
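
As a quick check of the availability formula, the following Python sketch applies it to each time slot of two parallel series. The function name `availability_sli` and the sample counts are hypothetical, and returning an empty value for slots without requests is an assumption of this sketch, not documented query behavior.

```python
def availability_sli(total, failures):
    """Per time slot: percentage of requests that did not fail, None if no requests."""
    return [
        None if t == 0 else 100 * (t - f) / t
        for t, f in zip(total, failures)
    ]


# Hypothetical request and failure counts for three time slots.
print(availability_sli([200, 0, 50], [10, 0, 50]))
```

The first slot has 10 failures out of 200 requests (95), the second has no traffic, and in the third every request failed (0).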