Monitor Kubernetes

The OpenTelemetry Collector provides extensive support for Kubernetes cluster and workload monitoring. It offers a variety of receivers that collect critical metrics about the Kubernetes cluster, its nodes, and its objects.

This use case explains how to set up your OpenTelemetry Collector to get full visibility into your Kubernetes clusters through ready-made Dynatrace dashboards.

Dynatrace Operator

Dynatrace recommends using the Dynatrace Operator for Kubernetes monitoring. However, this use case is designed specifically for OpenTelemetry users who choose not to deploy the Dynatrace Operator. It leverages pure OpenTelemetry data, which is not integrated into the Dynatrace Kubernetes application. Despite this, it provides ready-made Kubernetes dashboards and enables users to create custom dashboards and notebooks using the collected telemetry data.

Prerequisites

See Collector Deployment and Collector Configuration for how to set up the Collector with the configurations provided below.

Demo configurations

RBAC configuration

Apply the following rbac.yaml file to your Kubernetes cluster to allow the Collector to use the Kubernetes API with the service-account authentication type.

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: otelcol-dt
  name: otelcol-dt
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otelcol-dt
  labels:
    app: otelcol-dt
rules:
  - apiGroups:
      - ""
    resources:
      - events
      - namespaces
      - namespaces/status
      - nodes
      - nodes/spec
      - nodes/stats
      - nodes/proxy
      - persistentvolumes
      - persistentvolumeclaims
      - pods
      - pods/status
      - replicationcontrollers
      - replicationcontrollers/status
      - resourcequotas
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otelcol-dt
  labels:
    app: otelcol-dt
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otelcol-dt
subjects:
  - kind: ServiceAccount
    name: otelcol-dt
    namespace: default
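
The ClusterRoleBinding above binds the otelcol-dt service account in the default namespace. If you deploy the Collector into a different namespace, adjust the subject accordingly. A minimal sketch, assuming a hypothetical monitoring namespace:

subjects:
  - kind: ServiceAccount
    name: otelcol-dt
    namespace: monitoring   # the namespace where the Collector actually runs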

Collector configuration

Service account

In addition to the Collector configuration, be sure to also update your Kubernetes configuration to match the service account name used in the RBAC file (see entries for Helm and Operator).
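
As a minimal sketch of how this could look with the OpenTelemetry Collector Helm chart (key names may differ between chart versions, and the cluster name is an arbitrary example):

mode: daemonset
serviceAccount:
  create: false        # rbac.yaml already creates the otelcol-dt service account
  name: otelcol-dt
extraEnvs:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: CLUSTER_NAME
    value: my-cluster  # arbitrary name under which the cluster shows up in Dynatrace

The Collector configuration itself is shown below.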

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  k8s_leader_elector:
    auth_type: "serviceAccount"
    lease_name: k8smonitoring
    lease_namespace: ${env:POD_NAMESPACE}
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  k8s_events:
    auth_type: "serviceAccount"
    k8s_leader_elector: k8s_leader_elector
  kubeletstats:
    auth_type: "serviceAccount"
    collection_interval: 10s
    node: '${env:K8S_NODE_NAME}'
    extra_metadata_labels:
      - k8s.volume.type
    k8s_api_config:
      auth_type: "serviceAccount"
    endpoint: "https://${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
    metric_groups:
      - node
      - pod
      - container
      - volume
  k8s_cluster:
    auth_type: "serviceAccount"
    collection_interval: 10s
    k8s_leader_elector: k8s_leader_elector
    allocatable_types_to_report:
      - cpu
      - memory
      - pods
    node_conditions_to_report:
      - Ready
      - MemoryPressure
      - PIDPressure
      - DiskPressure
      - NetworkUnavailable
    metrics:
      k8s.node.condition:
        enabled: true
      k8s.pod.status_reason:
        enabled: true
processors:
  cumulativetodelta:
  filter:
    error_mode: ignore
    metrics:
      metric:
        - 'IsMatch(name, "k8s.volume.*") and resource.attributes["k8s.volume.type"] == nil'
        - 'resource.attributes["k8s.volume.type"] == "configMap"'
        - 'resource.attributes["k8s.volume.type"] == "emptyDir"'
        - 'resource.attributes["k8s.volume.type"] == "secret"'
  transform:
    error_mode: ignore
    trace_statements: &dynatrace_transformations
      # Set attributes taken from k8s metadata.
      - context: resource
        statements:
          - set(attributes["k8s.cluster.name"], "${env:CLUSTER_NAME}")
          - set(attributes["k8s.workload.kind"], "job") where IsString(attributes["k8s.job.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.job.name"]) where IsString(attributes["k8s.job.name"])
          - set(attributes["k8s.workload.kind"], "cronjob") where IsString(attributes["k8s.cronjob.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.cronjob.name"]) where IsString(attributes["k8s.cronjob.name"])
          - set(attributes["k8s.workload.kind"], "daemonset") where IsString(attributes["k8s.daemonset.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.daemonset.name"]) where IsString(attributes["k8s.daemonset.name"])
          - set(attributes["k8s.workload.kind"], "statefulset") where IsString(attributes["k8s.statefulset.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.statefulset.name"]) where IsString(attributes["k8s.statefulset.name"])
          - set(attributes["k8s.workload.kind"], "replicaset") where IsString(attributes["k8s.replicaset.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.replicaset.name"]) where IsString(attributes["k8s.replicaset.name"])
          - set(attributes["k8s.workload.kind"], "deployment") where IsString(attributes["k8s.deployment.name"])
          - set(attributes["k8s.workload.name"], attributes["k8s.deployment.name"]) where IsString(attributes["k8s.deployment.name"])
          # remove the delete statements if you want to preserve these attributes
          - delete_key(attributes, "k8s.deployment.name")
          - delete_key(attributes, "k8s.replicaset.name")
          - delete_key(attributes, "k8s.statefulset.name")
          - delete_key(attributes, "k8s.daemonset.name")
          - delete_key(attributes, "k8s.cronjob.name")
          - delete_key(attributes, "k8s.job.name")
      # Set attributes from metadata specified in Dynatrace and set through the Dynatrace Operator.
      # For more info: https://docs.dynatrace.com/docs/shortlink/k8s-metadata-telemetry-enrichment
      - context: resource
        statements:
          - merge_maps(attributes, ParseJSON(attributes["metadata.dynatrace.com"]), "upsert") where IsMatch(attributes["metadata.dynatrace.com"], "^\\{")
          - delete_key(attributes, "metadata.dynatrace.com")
    metric_statements: *dynatrace_transformations
    log_statements:
      - context: resource
        statements:
          - set(attributes["k8s.cluster.name"], "${env:CLUSTER_NAME}")
  k8sattributes:
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.ip
        - k8s.deployment.name
        - k8s.replicaset.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.job.name
        - k8s.cronjob.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.cluster.uid
        - k8s.container.name
      annotations:
        - from: pod
          key_regex: metadata.dynatrace.com/(.*)
          tag_name: $$1
        - from: pod
          key: metadata.dynatrace.com
          tag_name: metadata.dynatrace.com
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.name
          - from: resource_attribute
            name: k8s.namespace.name
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
exporters:
  otlphttp:
    endpoint: ${env:DT_ENDPOINT}
    headers:
      Authorization: "Api-Token ${env:DT_API_TOKEN}"
service:
  extensions:
    - health_check
    - k8s_leader_elector
  pipelines:
    metrics/node:
      receivers:
        - kubeletstats
      processors:
        - filter
        - k8sattributes
        - transform
        - cumulativetodelta
      exporters:
        - otlphttp
    metrics:
      receivers:
        - k8s_cluster
      processors:
        - k8sattributes
        - transform
        - cumulativetodelta
      exporters:
        - otlphttp
    logs:
      receivers:
        - k8s_events
      processors:
        - transform
      exporters:
        - otlphttp
    traces:
      receivers:
        - otlp
      processors:
        - k8sattributes
        - transform
      exporters:
        - otlphttp

Configuration validation

Validate your settings to avoid any configuration issues.

Components

For this use case, we configured the following components:

Receivers

Under receivers, we specify the following receivers from the configuration above as active receiver components for our deployment:

  • otlp: receives traces (and other OTLP signals) from instrumented workloads via gRPC.
  • k8s_events (*): collects Kubernetes events and forwards them as log records.
  • kubeletstats: collects node, pod, container, and volume metrics from the kubelet.
  • k8s_cluster: collects cluster-level metrics and node conditions from the Kubernetes API.

(*) The k8seventsreceiver is currently in alpha stage and may undergo significant changes. Despite its early stage of maturity, it has been included in the Dynatrace distribution of the OpenTelemetry Collector to support early adoption and experimentation. Be aware that stability, performance, and feature completeness are not guaranteed at this stage.

Processors

Under processors, we specify the following processors:

  • filter: To drop volume metrics for volume types that are not backed by persistent storage (configMap, emptyDir, secret), as well as volume metrics without a volume type.

  • k8sattributes: To enrich telemetry with Kubernetes metadata (pod, workload, namespace, and node attributes) for the pod that emitted it.

  • transform: To map the Kubernetes metadata to workload attributes (see the sketch after this list). This requires the environment variable CLUSTER_NAME to be set with the name of the cluster. Set the variable value to an arbitrary name that you want your cluster to show up with inside Dynatrace.

  • cumulativetodelta: To convert cumulative metrics to delta temporality, as required by the Dynatrace OTLP metrics ingest.
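
To illustrate the transform statements, here is a sketch of resource attributes before and after the processor runs. The pod and workload names are hypothetical, and CLUSTER_NAME is assumed to be set to my-cluster:

# Before the transform processor (as provided by k8sattributes):
k8s.namespace.name: my-namespace
k8s.pod.name: my-app-7d9c6f5b8-abcde
k8s.replicaset.name: my-app-7d9c6f5b8
k8s.deployment.name: my-app

# After the transform processor:
k8s.cluster.name: my-cluster
k8s.namespace.name: my-namespace
k8s.pod.name: my-app-7d9c6f5b8-abcde
k8s.workload.kind: deployment
k8s.workload.name: my-app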

Exporters

Under exporters, we specify the otlphttp exporter and configure it with our Dynatrace API URL and the required authentication token.

For this purpose, we set the following two environment variables and reference them in the configuration values for endpoint and Authorization:

  • DT_ENDPOINT holds the URL of your Dynatrace OTLP API endpoint.
  • DT_API_TOKEN holds the Dynatrace API token used for ingestion.
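
The following sketch shows one way to provide these variables to the Collector pod. The secret name dynatrace-otelcol-credentials and its key are assumptions; the endpoint follows the Dynatrace OTLP API format:

env:
  - name: DT_ENDPOINT
    value: https://{your-environment-id}.live.dynatrace.com/api/v2/otlp
  - name: DT_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: dynatrace-otelcol-credentials   # hypothetical secret holding the API token
        key: dt-api-token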

Extensions

Under extensions, we specify the k8sleaderelector extension to elect a leader among the Collector replicas; the elected leader is the instance that scrapes and exports cluster-level telemetry. This ensures that only one Collector instance scrapes the data at a time, avoiding telemetry duplication.

Service pipelines

Under service, we assemble our receiver, processor, and exporter objects into pipelines for traces, metrics, and logs. These pipelines allow us to send OpenTelemetry signals via the Collector instance and have them automatically enriched with additional Kubernetes-specific details.
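
For example, an instrumented workload can send its traces through the traces pipeline above by pointing its OpenTelemetry SDK at the Collector's OTLP gRPC endpoint. The service name below is an assumption; use the service that actually exposes your Collector:

env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://otelcol-dt.default.svc.cluster.local:4317
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: grpc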

Install and access ready-made dashboards

Install dashboards

To make these dashboards available in your tenant, install the OpenTelemetry Dashboards app via the Dynatrace Hub.

Access dashboards

How to find ready-made dashboards

For how to find and use ready-made dashboards, see Ready-made dashboards.

When accessing the dashboards, you have the following choices:

  • OpenTelemetry K8s Cluster: A comprehensive overview of cluster performance, including nodes, pods, containers, events, and more.
  • OpenTelemetry K8s Node - Pods: Visualizes pod-level resource consumption across Kubernetes nodes.
  • OpenTelemetry K8s Namespace - Pods: Analyzes resource usage of all pods within a specific namespace.
  • OpenTelemetry K8s Namespace - Workloads: Focuses on resource allocation and performance of workloads within a namespace.
  • OpenTelemetry K8s Persistent Volumes: Monitors usage and capacity of persistent volume claims.

Screenshot: OpenTelemetry K8s Cluster dashboard

Limits and limitations

Data is ingested using the OpenTelemetry protocol (OTLP) via the Dynatrace OTLP APIs and is subject to the limits and restrictions of those APIs. For more information, see the Dynatrace OTLP API documentation.
