Site Reliability Guardian as code
Dynatrace supports two approaches to managing a guardian using configuration as code: Dynatrace Configuration as Code via Monaco, and configuration via Terraform.
Dynatrace Configuration as Code via Monaco
We recommend separating the configuration into three different files.
- configs.yaml: general configuration settings centralized in one place.
- guardian.json: configuration for the guardian and its objectives.
- workflow.json: configuration for a guardian workflow to automate the validation process.
General configuration settings
The general configuration settings are passed on to the guardian or workflow configuration. This lets you quickly adapt the entire configuration without knowing the details of the guardian or workflow templates.
We recommend including two separate entries: one for the guardian configuration and one for the workflow configuration. The code below shows an implementation example.
configs:
- id: easytradeguardian
  config:
    name: easytradeguardian
    template: guardian.json
    parameters:
      thresholdMemoryRequests:
        type: value
        value:
          target: 20
          warning: 18
    skip: false
  type:
    settings:
      schema: app:dynatrace.site.reliability.guardian:guardians
      schemaVersion: 1.1.0
      scope: environment
- id: easytradeworkflow
  config:
    name: Automated validation
    template: workflow.json
    parameters:
      guardianid:
        configId: easytradeguardian
        configType: app:dynatrace.site.reliability.guardian:guardians
        property: id
        type: reference
      eventFilters:
        type: value
        value:
          application: easytrade
          stage: production
    skip: false
  type:
    automation:
      resource: workflow
- The guardian configuration defines the guardian name, the referenced guardian template (guardian.json), and parameters for the different objectives. This example defines one objective with a target threshold of 20 and a warning threshold of 18.
- The workflow configuration defines the workflow name, the referenced workflow template (workflow.json), and the workflow action and event filter settings. In this example, the workflow is subscribed to events that contain the key-value pairs application: easytrade and stage: production. In addition, the guardian named easytradeguardian is referenced as the guardian that the workflow action executes.
Guardian configuration
The configuration for a guardian consists of the general guardian properties followed by a list of objectives. This is shown in the following example.
{
  "name": "Easy Trade K8s workload",
  "description": "Safeguard your Kubernetes environment with dedicated resource utilization objectives for a Kubernetes workload",
  "tags": [
    "stage:production",
    "application:easytrade",
    "deployment:k8s-workload"
  ],
  "objectives": [
    {
      "name": "Memory requests",
      "description": "The requested memory by the sum of the memory requests of all containers in a pod",
      "objectiveType": "DQL",
      "dqlQuery": "timeseries val = max(dt.kubernetes.workload.requests_memory), filter: in(dt.entity.cloud_application, \"CLOUD_APPLICATION-PLACEHOLDER\")\n| fields max = arrayMax(val)",
      "comparisonOperator": "LESS_THAN_OR_EQUAL",
      "target": {{ .thresholdMemoryRequests.target }},
      "warning": {{ .thresholdMemoryRequests.warning }}
    }
  ]
}
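The double-brace placeholders in the guardian template are filled from the parameters defined in the general configuration file. The sketch below is a rough Python stand-in for that substitution, meant only to illustrate how a placeholder path maps to a parameter value. It is not how Monaco is implemented; the render function and the regular expression are hypothetical, and it handles only simple dotted paths.

```python
import re

# Parameter values taken from the general configuration example above.
params = {"thresholdMemoryRequests": {"target": 20, "warning": 18}}

# Matches simple placeholders such as {{ .thresholdMemoryRequests.target }}.
# Assumption: only dotted lookups are supported; real template syntax is richer.
PLACEHOLDER = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace each {{ .a.b }} placeholder with the value at values['a']['b']."""

    def lookup(match: re.Match) -> str:
        node = values
        for key in match.group(1).split("."):
            node = node[key]  # raises KeyError if the parameter is missing
        return str(node)

    return PLACEHOLDER.sub(lookup, template)


snippet = (
    '"target": {{ .thresholdMemoryRequests.target }}, '
    '"warning": {{ .thresholdMemoryRequests.warning }}'
)
print(render(snippet, params))  # "target": 20, "warning": 18
```

With the example values, the objective ends up with a target of 20 and a warning of 18, so changing a threshold only requires editing the general configuration file.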
Guardian properties
These are the name, description, and a list of tags.
Objective definition
Each objective consists of:
- name: The name of the objective.
- description: A description of the objective.
- objectiveType: Defines whether the objective is DQL or REFERENCE_SLO based.
- dqlQuery: The DQL query for a DQL-based objective.
- referenceSlo: The function name of an SLO, starting with func:slo.
- comparisonOperator: Defines the comparison operator of the objective, which protects against either an increase or a decrease (LESS_THAN_OR_EQUAL in the above example).
- target: The failure threshold of the objective. In the above example, the failure threshold is provided by the configs.yaml file.
- warning: The warning threshold of the objective. In the above example, the warning threshold is provided by the configs.yaml file.
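To make the relationship between comparisonOperator, target, and warning concrete, here is a minimal Python sketch of how one objective value could be classified. It is an illustration under stated assumptions, not the guardian's actual evaluation logic: the classify function, the status names, the GREATER_THAN_OR_EQUAL operator name, and the inclusive boundary handling are all assumptions that do not come from the text above.

```python
def classify(value: float, operator: str, target: float, warning: float) -> str:
    """Return 'pass', 'warning', or 'fail' for a single objective value.

    target is treated as the failure threshold and warning as the warning
    threshold, as described in the objective definition.
    """
    if operator == "LESS_THAN_OR_EQUAL":
        # Protects against an increase: smaller values are better.
        def meets(threshold: float) -> bool:
            return value <= threshold
    elif operator == "GREATER_THAN_OR_EQUAL":
        # Assumed counterpart that protects against a decrease.
        def meets(threshold: float) -> bool:
            return value >= threshold
    else:
        raise ValueError(f"unsupported operator: {operator}")

    if not meets(target):
        return "fail"
    if not meets(warning):
        return "warning"
    return "pass"


# Thresholds from the example: target 20, warning 18.
for observed in (17, 19, 21):
    print(observed, classify(observed, "LESS_THAN_OR_EQUAL", target=20, warning=18))
```

Under these assumptions, a value of 17 passes, 19 lands between the warning and failure thresholds, and 21 fails.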
Workflow configuration
The minimal configuration for a guardian workflow consists of two main parts: the workflow trigger and the workflow action. This is shown in the following example:
{
  "description": "",
  "isPrivate": true,
  "labels": {},
  "taskDefaults": {},
  "title": "{{.name}}",
  "trigger": {
    "eventTrigger": {
      "filterQuery": "application == \"{{ .eventFilters.application }}\" and\nstage == \"{{ .eventFilters.stage }}\"",
      "isActive": true,
      "triggerConfiguration": {
        "type": "event",
        "value": {
          "eventType": "bizevents",
          "query": "application == \"{{ .eventFilters.application }}\" and\nstage == \"{{ .eventFilters.stage }}\""
        }
      },
      "uniqueExpression": null
    }
  },
  "triggerType": "Event",
  "tasks": {
    "run_validation": {
      "action": "dynatrace.site.reliability.guardian:validate-guardian-action",
      "description": "Automation action to start a guardian validation",
      "input": {
        "executionId": "{{`{{`}} execution().id {{`}}`}}",
        "objectId": "{{.guardianid}}",
        "timeframeInputType": "timeframeSelector",
        "timeframeSelector": {
          "from": "now-30m",
          "to": "now"
        }
      },
      "name": "run_validation",
      "position": {
        "x": 0,
        "y": 1
      },
      "predecessors": []
    }
  },
  "usages": [],
  "version": 2
}
- Workflow trigger: In the example, the workflow is triggered by an event. The event filter is configured to look for business events where the properties application and stage match the values defined in configs.yaml. Consequently, the workflow execution is triggered by a business event containing application: easytrade and stage: production.
- Workflow action: In the example, the workflow action that executes a guardian validation refers to the guardian specified in configs.yaml, which is the easytradeguardian guardian. It also defines the validation timeframe, which is the last 30 minutes.
Configuration via Terraform
To leverage the Dynatrace Terraform Provider for managing a guardian and its workflow, you need two resources:
- dynatrace_site_reliability_guardian: a resource for a guardian and its objectives.
- dynatrace_automation_workflow: a resource for the guardian workflow to automate the validation process.