Example configuration of service-level objective definitions
Dynatrace offers a set of out-of-the-box SLOs for some of the primary monitoring domains that you can configure either in the SLO wizard or in the global SLO settings.
For a better understanding of the SLIs needed for these service-level objectives, see the configuration examples below.
The fundamental service-level availability is calculated by dividing the number of successful service calls (
builtin:service.errors.server.successCount ) by the total number of service calls (
Entity selector: type("SERVICE"),entityName("My service")
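The availability ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Dynatrace evaluates the SLO internally; the function name `availability_percent` and the sample counts are assumptions made for this example.

```python
from typing import Optional


def availability_percent(success_count: int, total_count: int) -> Optional[float]:
    """Return successful calls as a percentage of all calls.

    Returns None when there were no calls, because the ratio is undefined.
    """
    if total_count <= 0:
        return None
    return 100.0 * success_count / total_count


# Hypothetical counts for one evaluation timeframe.
status = availability_percent(success_count=9_985, total_count=10_000)
print(f"{status:.2f}%")  # prints "99.85%"
```

Returning `None` for a timeframe without traffic keeps "no data" distinct from "0% available".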
A service performance SLO represents the percentage of "fast" service calls from the total number of service calls in a timeslot, where "fast" is defined with a custom condition.
The example below shows how to define a metric expression that counts the fast service calls, and how to define your SLO based on that metric expression in Dynatrace.
Using the following transformations, the metric expression returns values that have a response time under a defined threshold.
Transformation Scope Info
Aggregates values other than null and ignores the rest. Depending on the use case, multiple aggregations can be used, for example, avg (to aggregate the values) and percentile(90) (to remove outliers).
Separates the metric's individual data points into a certain number of timeslots over the timeframe, based on the value good of the metric dimension
To improve SLO precision, reduce the timeslot extent by querying a shorter timeframe.
lt() condition changes the metric unit for the response latency threshold value to microseconds.
The required metric unit for service performance SLOs is microseconds.
null values in the payload with the specified value (
Using a custom-calculated metric as the numerator improves the precision of the performance SLO. We recommend using the muted request option when combining calculated service metrics with built-in metrics, as the built-in metric applies it by default.
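To make the idea of "fast" calls per timeslot concrete, here is a minimal Python sketch. It assumes response times are recorded in microseconds (the unit required for service performance SLOs) while the threshold is given in milliseconds, so the threshold is converted before comparing. The function name, the tuple layout, and the sample values are hypothetical; the sketch mimics the concept only and is not the Dynatrace metric expression itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

MICROSECONDS_PER_MILLISECOND = 1_000


def fast_call_percent_per_slot(
    calls: Iterable[Tuple[float, int]],
    threshold_ms: float,
    slot_seconds: int,
) -> Dict[int, float]:
    """Percentage of fast calls in each timeslot.

    calls: (timestamp in seconds, response time in microseconds) pairs.
    A call is "fast" when its response time is below the threshold.
    """
    threshold_us = threshold_ms * MICROSECONDS_PER_MILLISECOND
    fast: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for timestamp, response_time_us in calls:
        slot = int(timestamp // slot_seconds)  # index of the timeslot
        total[slot] += 1
        if response_time_us < threshold_us:
            fast[slot] += 1
    return {slot: 100.0 * fast[slot] / total[slot] for slot in sorted(total)}


# Hypothetical sample: four calls, 500 ms threshold, 60-second timeslots.
calls = [(0, 400_000), (10, 1_200_000), (65, 300_000), (70, 250_000)]
print(fast_call_percent_per_slot(calls, threshold_ms=500, slot_seconds=60))
# prints {0: 50.0, 1: 100.0}
```

Shorter timeslots give a finer-grained picture of when slow calls occurred, which is the intuition behind querying a shorter timeframe to improve precision.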
Service-method availability is calculated by dividing the number of successful key request service calls (
builtin:service.keyRequest.errors.server.successCount) by the total number of key request service calls (
builtin:service.keyRequest.count.server). It uses the
type("SERVICE_METHOD") SLO filter.
Example configuration with a filter for
<YOUR custom success metrics/filter service, endpoint, availability, and latency>/
This example shows how to define a service custom metric that counts the fast service calls, and how to define your SLO based on that custom metric in Dynatrace.
Go to any of your service pages, review its typical performance in milliseconds, and then navigate to the multidimensional analysis page.
On the multidimensional analysis page, select Request count metric and define a condition on any of the service-call properties. In this example, we define a condition on the response time, which should be below 1,300 milliseconds to count as a fast call for this selected service.
After you've decided on a certain condition, select Create metric. Define your own unique metric identifier for that metric to use this metric for charting, alerting, and your SLO.
For example, a newly created metric fastcreditcardrequests results in a unique metric ID
You can chart the total number of service requests for that service in comparison with your fast service requests.
Make sure the entity selector filters on your selected service (CreditCardValidation); otherwise, you will get the total request count for all your services.
Example entity selector:
The result is the final SLO status shown in the list of SLOs.
Dynatrace offers expertise in terms of measuring the real user experience of delivered services. Dynatrace metrics such as the Apdex (Application Performance Index) or the User experience score can be used within an SLO definition.
Apdex defines a performance standard to divide your application users into three groups: SATISFIED, TOLERATING, and FRUSTRATED.
For example, as an SLO goal for your application, you can specify that you want 90% of all your users within the SATISFIED category.
This SLO is calculated by dividing the number of users that are in the SATISFIED category (
builtin:apps.web.actionCount.category:filter(eq("Apdex category",SATISFIED)):splitBy()) by the total number of users that are using a web or mobile application (
Metric expression: (100)*(builtin:apps.web.actionCount.category:filter(eq("Apdex category",SATISFIED)):splitBy())/(builtin:apps.web.actionCount.category:splitBy())
Entity selector: type("APPLICATION"),entityName("My application")
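The metric expression above divides the SATISFIED count by the count across all Apdex categories and multiplies by 100. A small Python sketch of the same arithmetic, checked against the 90% goal mentioned earlier, might look as follows. The per-category counts are invented sample data, and `satisfied_percent` is a hypothetical helper, not a Dynatrace API.

```python
from typing import Mapping, Optional


def satisfied_percent(counts_by_category: Mapping[str, int]) -> Optional[float]:
    """Share of the SATISFIED category in the total count, as a percentage.

    Returns None when the total count is zero.
    """
    total = sum(counts_by_category.values())
    if total == 0:
        return None
    return 100.0 * counts_by_category.get("SATISFIED", 0) / total


# Hypothetical counts per Apdex category for one timeframe.
counts = {"SATISFIED": 920, "TOLERATING": 50, "FRUSTRATED": 30}
target_percent = 90.0  # the example SLO goal from the text

status = satisfied_percent(counts)
print(status, status >= target_percent)  # prints "92.0 True"
```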
Mobile crash-free users
One of the most important metrics for measuring the availability and reliability of your mobile app (iOS and Android) deployment is the crash-free user rate. Therefore, the built-in metric used is builtin:apps.other.crashFreeUsersRate.os. This metric measures the percentage of users that open and use a mobile application without experiencing a crash.
Entity selector: type("MOBILE-APPLICATION"),entityName("My mobile app")
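As a rough illustration of what a crash-free user rate expresses, the sketch below counts distinct users rather than sessions: a user with several sessions and a single crash is still one affected user. The user IDs and the helper name are assumptions for this example; the built-in metric is computed by Dynatrace and may differ in detail.

```python
from typing import Optional, Set


def crash_free_user_percent(active_users: Set[str], crashed_users: Set[str]) -> Optional[float]:
    """Percentage of active users who did not experience a crash.

    Returns None when there were no active users.
    """
    if not active_users:
        return None
    crash_free = active_users - crashed_users  # set difference
    return 100.0 * len(crash_free) / len(active_users)


# Hypothetical user IDs for one timeframe.
active = {"u1", "u2", "u3", "u4"}
crashed = {"u3"}
print(crash_free_user_percent(active, crashed))  # prints 75.0
```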
A synthetic availability SLO represents the percentage of time a synthetic test was in an available state or, alternatively, the percentage of successful test executions out of the total number of test executions.
To define the time-based synthetic objective, the built-in metric used is
Optionally, to define the time-based availability that excludes maintenance windows, the built-in metric used is
Entity selector: type("SYNTHETIC_TEST"),entityName("My synth test")
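The two ways of expressing synthetic availability (time-based and execution-based) can yield different numbers for the same period. The following Python sketch shows both calculations side by side. The function names and sample values are hypothetical, and excluding maintenance windows is deliberately left out because the source does not describe how that exclusion is computed.

```python
from typing import Optional, Sequence


def time_based_availability(available_seconds: float, observed_seconds: float) -> Optional[float]:
    """Percentage of the observed time during which the test was available."""
    if observed_seconds <= 0:
        return None
    return 100.0 * available_seconds / observed_seconds


def execution_based_availability(results: Sequence[bool]) -> Optional[float]:
    """Percentage of successful executions; True marks a successful run."""
    if not results:
        return None
    return 100.0 * sum(results) / len(results)


# Hypothetical hour: available for 3,420 of 3,600 seconds,
# with 11 successful executions out of 12.
print(f"{time_based_availability(3_420, 3_600):.2f}")                   # prints 95.00
print(f"{execution_based_availability([True] * 11 + [False]):.2f}")     # prints 91.67
```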
For additional insights into SLOs, check the Dynatrace University tutorial Getting started with SLOs in Dynatrace.