Site Reliability Guardian as code

Dynatrace supports two approaches to managing a guardian using configuration as code: Monaco and the Dynatrace Terraform Provider.

Dynatrace Configuration as Code via Monaco

We recommend separating the configuration into three different files.

  • configs.yaml: general configuration settings centralized in one place.
  • guardian.json: configuration for the guardian and its objectives.
  • workflow.json: configuration for a guardian workflow to automate the validation process.

General configuration settings

The general configuration settings are passed on to the guardian or workflow configuration. This lets you adapt the entire configuration quickly without knowing the details of the guardian or workflow templates.

We recommend including two separate entries: one for the guardian configuration and one for the workflow configuration. The code below shows an implementation example.

configs:
- id: easytradeguardian
  config:
    name: easytradeguardian
    template: guardian.json
    parameters:
      thresholdMemoryRequests:
        type: value
        value:
          target: 20
          warning: 18
    skip: false
  type:
    settings:
      schema: app:dynatrace.site.reliability.guardian:guardians
      schemaVersion: 1.3.2
      scope: environment
- id: easytradeworkflow
  config:
    name: Automated validation
    template: workflow.json
    parameters:
      guardianid:
        configId: easytradeguardian
        configType: app:dynatrace.site.reliability.guardian:guardians
        property: id
        type: reference
      eventFilters:
        type: value
        value:
          application: easytrade
          stage: production
    skip: false
  type:
    automation:
      resource: workflow
  • The configuration for the guardian defines the guardian name, the referenced guardian template (guardian.json), and parameters for the different objectives. This example defines one objective with a target threshold of 20 and a warning threshold of 18.

  • The workflow configuration defines the workflow name, the referenced workflow template (workflow.json), and the settings for the workflow action and the event filter. In this example, the workflow is subscribed to events that contain the key-value pairs application: easytrade and stage: production. In addition, the guardian named easytradeguardian is referenced as the guardian that the workflow action executes.
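To make the hand-off from the general settings to a template concrete, here is a minimal Python sketch of the substitution step. Monaco templates use Go template placeholders such as {{ .thresholdMemoryRequests.target }}; the render helper below is a hypothetical stand-in that handles only simple dotted lookups and is not Monaco's implementation.

```python
import json
import re

# Parameter values as defined for the easytradeguardian entry above.
params = {"thresholdMemoryRequests": {"target": 20, "warning": 18}}

# A fragment shaped like the threshold placeholders in a guardian template.
template = """{
  "target": {{ .thresholdMemoryRequests.target }},
  "warning": {{ .thresholdMemoryRequests.warning }}
}"""


def render(text, values):
    """Replace each {{ .a.b }} placeholder with str(values["a"]["b"]).

    Hypothetical helper: it imitates only plain dotted lookups, not the
    full Go template language.
    """
    def lookup(match):
        node = values
        for key in match.group(1).split("."):
            node = node[key]
        return str(node)

    return re.sub(r"\{\{\s*\.([\w.]+)\s*\}\}", lookup, text)


# After substitution the fragment is valid JSON and can be parsed.
rendered = json.loads(render(template, params))
print(rendered)
```

Changing target or warning in the general settings changes the rendered thresholds without touching the template, which is the point of keeping the two files separate.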

Guardian configuration

The configuration for a guardian consists of the general guardian properties followed by a list of objectives. This is shown in the following example.

{
  "name": "Easy Trade K8s workload",
  "description": "Safeguard your Kubernetes environment with dedicated resource utilization objectives for a Kubernetes workload",
  "tags": [
    "stage:production",
    "application:easytrade",
    "deployment:k8s-workload"
  ],
  "objectives": [
    {
      "name": "Memory requests",
      "description": "The requested memory by the sum of the memory requests of all containers in a pod",
      "objectiveType": "DQL",
      "dqlQuery": "timeseries val = max(dt.kubernetes.workload.requests_memory), filter: in(dt.entity.cloud_application, \"CLOUD_APPLICATION-PLACEHOLDER\")\n| fields max = arrayMax(val)",
      "comparisonOperator": "LESS_THAN_OR_EQUAL",
      "target": {{ .thresholdMemoryRequests.target }},
      "warning": {{ .thresholdMemoryRequests.warning }}
    }
  ]
}

Guardian properties

These are the name, description, and a list of tags.

Objective definition

Each objective consists of:

  • name: The name of the objective.
  • description: A description of the objective.
  • objectiveType: Defines whether the objective is DQL or REFERENCE_SLO based.
  • dqlQuery: The DQL query for a DQL-based objective.
  • referenceSlo: The function name of the SLO, which starts with func:slo, for a REFERENCE_SLO-based objective.
  • comparisonOperator: The comparison operator, which determines whether the objective protects against an increase or a decrease of the measured value.
  • target: The failure threshold of the objective. In the above example, the failure threshold is provided by the configs.yaml file.
  • warning: The warning threshold of the objective. In the above example, the warning threshold is provided by the configs.yaml file.
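How the two thresholds interact with the comparison operator can be sketched in Python. This is a sketch under stated assumptions: a value that meets the warning threshold passes, a value that misses the warning threshold but still meets the target produces a warning, and anything else fails. The function name classify, the status strings, and the GREATER_THAN_OR_EQUAL operator name are illustrative and not taken from the guardian schema.

```python
import operator

# Map operator names to comparisons. LESS_THAN_OR_EQUAL appears in the example
# above; GREATER_THAN_OR_EQUAL is an assumed counterpart for illustration.
OPERATORS = {
    "LESS_THAN_OR_EQUAL": operator.le,
    "GREATER_THAN_OR_EQUAL": operator.ge,
}


def classify(value, comparison_operator, target, warning=None):
    """Return "pass", "warning", or "fail" for a measured value (hypothetical helper)."""
    meets = OPERATORS[comparison_operator]
    if not meets(value, target):
        return "fail"  # the failure threshold is violated
    if warning is not None and not meets(value, warning):
        return "warning"  # within the target, but past the warning threshold
    return "pass"


# Thresholds from the example: target 20, warning 18.
for measured in (17, 19, 21):
    print(measured, classify(measured, "LESS_THAN_OR_EQUAL", target=20, warning=18))
```

With LESS_THAN_OR_EQUAL, the warning threshold sits below the target, so it flags values that are approaching the failure threshold before they reach it.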

Workflow configuration

The minimal configuration for a guardian workflow consists of two main parts: the workflow trigger and the workflow action. This is shown in the following example:

{
  "description": "",
  "isPrivate": true,
  "labels": {},
  "taskDefaults": {},
  "title": "{{.name}}",
  "trigger": {
    "eventTrigger": {
      "filterQuery": "application == \"{{ .eventFilters.application }}\" and\nstage == \"{{ .eventFilters.stage }}\"",
      "isActive": true,
      "triggerConfiguration": {
        "type": "event",
        "value": {
          "eventType": "bizevents",
          "query": "application == \"{{ .eventFilters.application }}\" and\nstage == \"{{ .eventFilters.stage }}\""
        }
      },
      "uniqueExpression": null
    }
  },
  "triggerType": "Event",
  "tasks": {
    "run_validation": {
      "action": "dynatrace.site.reliability.guardian:validate-guardian-action",
      "description": "Automation action to start a guardian validation",
      "input": {
        "executionId": "{{`{{`}} execution().id {{`}}`}}",
        "objectId": "{{.guardianid}}",
        "timeframeInputType": "timeframeSelector",
        "timeframeSelector": {
          "from": "now-30m",
          "to": "now"
        }
      },
      "name": "run_validation",
      "position": {
        "x": 0,
        "y": 1
      },
      "predecessors": []
    }
  },
  "usages": [],
  "version": 2
}
  • Workflow trigger: In the example, the workflow is triggered by an event. The event filter is configured to look for business events where the properties application and stage match the values defined in the configs.yaml file. Consequently, the workflow execution is triggered by a business event containing application: easytrade and stage: production.

  • Workflow action: In the example, the workflow action that executes a guardian validation refers to the guardian specified in the configs.yaml file, namely easytradeguardian. It also defines the validation timeframe, which is the last 30 minutes.
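The trigger condition described above amounts to an equality check on two event properties. A minimal Python sketch, assuming events are plain dictionaries: matches is a hypothetical helper that mirrors the rendered filter application == "easytrade" and stage == "production", and the version field on the sample event is invented for illustration. It is not the Dynatrace filter engine.

```python
# Filter values as defined under eventFilters in the general settings.
event_filters = {"application": "easytrade", "stage": "production"}


def matches(event, filters):
    """Return True when every filtered property equals the expected value."""
    return all(event.get(key) == expected for key, expected in filters.items())


# Sample events; extra properties do not affect the match.
production_event = {"application": "easytrade", "stage": "production", "version": "1.2.0"}
staging_event = {"application": "easytrade", "stage": "staging"}

print(matches(production_event, event_filters))  # True
print(matches(staging_event, event_filters))     # False
```

Only the first event would start the workflow, and with it the guardian validation over the last 30 minutes.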

Configuration via Terraform

To leverage the Dynatrace Terraform Provider for managing a guardian and its workflow, you need two resources: