Latest Dynatrace
The Site Reliability Guardian is a Dynatrace app that automates change impact analysis to validate service availability, performance, and capacity objectives across various systems. It enables DevOps platform engineers to make the right release decisions and empowers SREs to apply Service-Level Objectives (SLOs) for their critical services.
Go through the following process to learn how to use Site Reliability Guardian:
Site Reliability Guardian is based on the following concepts:
A guardian is a group of objectives. It is built around a set of entities that reflect the service or application you want to safeguard.
A guardian provides you with a default automation workflow that performs the objective validation. As a result, a guardian always represents the latest validation result derived from the objectives.
You can create a maximum of 1000 guardians.
Objectives are means for measuring the performance, availability, capacity, and security of your services. Objectives are measured by indicators. You can define an objective for your guardian that is validated on demand or automatically.
You can create a maximum of 50 objectives for each guardian.
An indicator is a value against which the warning and failure thresholds are checked using a comparison operator. To retrieve an indicator value, use DQL or reference an existing SLO.
The static warning and failure thresholds determine whether the measured value of the indicator meets the objective, is close to violating the objective, or violates the objective.
Both the warning and the failure threshold are optional, so the possible validation results of an objective depend on which thresholds are defined.
Auto-adaptive thresholds are dynamic limits that adjust over time based on previous validations. If an objective changes its behavior, the threshold adapts automatically.
The comparison operator defines when the objective is met: either the indicator is less than or equal to the warning and failure thresholds (A lower value is good for my result), or it is greater than or equal to them (A higher value is good for my result).
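The interplay of thresholds and comparison operator can be sketched in a few lines of Python. This is a minimal illustration, not the app's implementation: the names `Comparison` and `classify` are hypothetical, and the handling of missing thresholds (no thresholds gives `info`, a single threshold gives only two possible results) is an assumption based on the description above.

```python
from enum import Enum
from typing import Optional


class Comparison(Enum):
    """Hypothetical names for the two comparison operators."""
    LESS_THAN_OR_EQUAL = "A lower value is good for my result"
    GREATER_THAN_OR_EQUAL = "A higher value is good for my result"


def classify(value: float,
             comparison: Comparison,
             warning: Optional[float] = None,
             failure: Optional[float] = None) -> str:
    """Classify an indicator value against optional static thresholds."""

    def meets(threshold: float) -> bool:
        # The objective is met when the value is on the "good" side of
        # the threshold, boundary included.
        if comparison is Comparison.LESS_THAN_OR_EQUAL:
            return value <= threshold
        return value >= threshold

    # Assumption: without any threshold there is nothing to compare
    # against, so the result is informational only.
    if warning is None and failure is None:
        return "info"
    # Check the failure threshold first because violating it is the
    # more severe outcome.
    if failure is not None and not meets(failure):
        return "fail"
    if warning is not None and not meets(warning):
        return "warning"
    return "pass"
```

For example, with a response-time objective where a lower value is good, a warning threshold of 400, and a failure threshold of 500, `classify(450, Comparison.LESS_THAN_OR_EQUAL, warning=400, failure=500)` returns `"warning"`.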
To organize your guardians, you can assign tags to them. Tags use the key:value format, with the value being optional.
To assign a tag to your guardian, either specify it in the Add tags to your guardian section during guardian creation or add the tag later in edit mode.
To filter the list of all guardians by a tag, type the tag in the Search by name or tag field—the page automatically updates to show only guardians with matching tags.
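To make the key:value format concrete, here is a small Python sketch that splits a tag into key and optional value and filters a set of guardians by tag. The function names and the matching rule (same key, and same value if the search term specifies one) are assumptions for illustration; the matching logic of the search field in the app may differ.

```python
from typing import Dict, List, Optional, Tuple


def parse_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split a key:value tag; the value is None when it is omitted."""
    key, sep, value = tag.partition(":")
    return key, (value if sep and value else None)


def tag_matches(tag: str, query: str) -> bool:
    """Assumed rule: keys must be equal; values only if the query has one."""
    key, value = parse_tag(tag)
    query_key, query_value = parse_tag(query)
    return key == query_key and (query_value is None or value == query_value)


def filter_guardians(guardians: Dict[str, List[str]], query: str) -> List[str]:
    """Return the names of guardians that carry at least one matching tag."""
    return [
        name
        for name, tags in guardians.items()
        if any(tag_matches(tag, query) for tag in tags)
    ]


# Hypothetical guardians and tags, used only to exercise the functions.
guardians = {
    "carts-prod": ["service:carts", "stage:production"],
    "carts-staging": ["service:carts", "stage:staging"],
    "checkout": ["critical"],
}
production_guardians = filter_guardians(guardians, "stage:production")
```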
This DQL query shows the first guardian.validation.objective business event with a specific guardian ID and parses the guardian tags field to extract a specific tag value from the event JSON:
fetch bizevents
| filter event.type == "guardian.validation.objective" AND guardian.id == "vu9U3hXa3q0AAAABADFhcHA6ZHluYXRyYWNlLnNpdGUucmVsaWFiaWxpdHkuZ3VhcmRpYW46Z3VhcmRpYW5zAAZ0ZW5hbnQABnRlbmFudAAkMWNiZDVkYWYtZThhNi0zMDkxLWFkOGQtMmU5NDNmNWJmZWJmvu9U3hXa3q0"
| limit 1
| parse guardian.tags, "JSON:parsed_guardian_tags"
This DQL query shows all guardian.validation.finished business events from guardians tagged as tagkey:my-tagged-guardian:
fetch bizevents
| filter event.type == "guardian.validation.finished"
| expand guardian.tags
| filter contains(guardian.tags, "my-tagged-guardian")
You can automate the execution of a guardian via Workflows, tying guardian execution to an event.
To add a guardian action to an existing workflow, select Site Reliability Guardian in the Choose action panel. To extract the timeframe from the triggering event, use the event() expression.
You can create a new workflow by selecting Automate on the top right of the guardian page. When you create a workflow this way, a set of parameters is preconfigured, for example a filter expression such as tag.service == "carts" AND tag.stage == "production". Be sure to adapt these parameters as needed.
The guardian action generates the following output and passes it to the subsequent actions of the workflow.
| Parameter | Description |
| --- | --- |
| guardian_id | The ID of the validated guardian |
| guardian_name | The name of the validated guardian |
| guardian_tags | An array of tags assigned to the validated guardian |
| execution_context | The execution context property of the trigger, if it was set |
| validation_id | The ID of all events generated by the validation |
| validation_url | The URL with the full validation details |
| validation_status | The status of the validation, indicating the overall result. Possible values: info, pass, warning, fail, error |
| validation_summary | The number of objectives for each status |
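A subsequent workflow step could consume this output roughly as follows. The Python sketch below is illustrative only: the field names are the ones listed above, but the sample values, the dictionary shape of validation_summary, and the promote/review/stop policy are assumptions, not behavior defined by the guardian action.

```python
from typing import Any, Dict

# Hypothetical guardian action output; every value is made up.
sample_result: Dict[str, Any] = {
    "guardian_id": "example-guardian-id",
    "guardian_name": "carts production guardian",
    "guardian_tags": ["service:carts", "stage:production"],
    "execution_context": None,  # only present if the trigger set it
    "validation_id": "example-validation-id",
    "validation_url": "https://example.invalid/validation",
    "validation_status": "warning",
    # Assumed shape: a mapping from status to number of objectives.
    "validation_summary": {"pass": 3, "warning": 1},
}


def release_decision(result: Dict[str, Any]) -> str:
    """Map validation_status to a hypothetical release decision."""
    status = result["validation_status"]
    if status in ("pass", "info"):
        return "promote"
    if status == "warning":
        return "review"
    # "fail", "error", or any unexpected value blocks the release.
    return "stop"


def describe(result: Dict[str, Any]) -> str:
    """Build a short message that a notification step could send."""
    counts = ", ".join(
        f"{count} {status}"
        for status, count in result["validation_summary"].items()
    )
    return (
        f"{result['guardian_name']}: {result['validation_status']} "
        f"({counts}) -> {release_decision(result)}. "
        f"Details: {result['validation_url']}"
    )
```

Treating every status other than pass, info, and warning as a blocker is a deliberately conservative choice; adapt the policy to your own release process.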
To learn more about workflows for a guardian, open the help menu in the upper-right corner of a guardian and select Get started with Automation.
If a workflow is created, your guardian is validated automatically. You can also perform the validation manually.
By default, the overview page shows validations for the last seven days. You can view older results by opening a guardian and selecting a different timeframe.
The event subscriptions in the workflow define when the validation of a guardian is triggered automatically.
You can perform a validation of a guardian by selecting the Validate button on the overview screen or within the validation details screen.
For each objective, the validation returns the derived value and classification. The severity goes from the highest (1) to the lowest (5).
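As a rough illustration of how classifications can be compared by severity, the Python sketch below ranks the status values and picks the most severe one from a set of objective results. The rank assigned to each status is an assumption made for this example; the text only states that severity runs from 1 (highest) to 5 (lowest).

```python
from typing import Iterable

# Assumed ranking: 1 is the most severe, 5 the least severe.
SEVERITY = {"error": 1, "fail": 2, "warning": 3, "pass": 4, "info": 5}


def most_severe(objective_statuses: Iterable[str]) -> str:
    """Return the status with the lowest severity number."""
    statuses = list(objective_statuses)
    if not statuses:
        raise ValueError("at least one objective status is required")
    return min(statuses, key=lambda status: SEVERITY[status])
```

With this ranking, `most_severe(["pass", "warning", "info"])` returns `"warning"`.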
After each objective has been validated, the guardian uses the most severe of the individual results as the overall validation result. Examples of how this result is used include:
Automated change impact analysis for your deployment and release processes