Kubernetes app overview

Latest Dynatrace

The new Kubernetes app, Kubernetes (new), provides a comprehensive view of your environment, enabling you to automate monitoring and optimize the health and performance of your Kubernetes clusters and workloads. This guide walks you through the main concepts underlying the Kubernetes app.

Prerequisites

  • Dynatrace SaaS environment powered by Grail and AppEngine
  • DPS license that includes the Kubernetes Platform Monitoring capability
  • Sufficient permissions to use the Kubernetes app within your Dynatrace environment
  • ActiveGate version 1.279+

For more details, see getting started FAQs.

The new Kubernetes experience is not available for Dynatrace Managed or for SaaS environments without Grail. In those environments, you can continue to use Kubernetes Classic (accessible from the previous Dynatrace via Kubernetes).

Basic structure

The Kubernetes app offers insights into your entire Kubernetes environment, presenting valuable information across primary areas as indicated in the picture below.

Kubernetes app: Clusters

On the left side (1), you can find a sidebar with all Kubernetes objects grouped by type, such as clusters, nodes, and workloads. In the center of the app (2), the main view lists all objects of the selected type, serving as the starting point for analysis and drill-down for your observability use cases. Above this table (3), there's an aggregated health status of the displayed objects and their child objects. Finally, the filter bar located below the app header (4) allows you to refine the information in the list view, focusing on specific objects or health statuses.

Selecting a Kubernetes object from the list opens a detail view focusing on the specific object.

Kubernetes app: object details

The detail view is divided into two primary sections. At the top (1), you get the health and security status of the selected object and its child objects. The main section (2) provides detailed insights into the selected object, featuring tabs for analyzing health and utilization, as well as for exploring logs, events, ownership, and vulnerabilities. The data presented in the detail view is not affected by any filters applied in the main list view.

Perspectives

Perspectives support various use cases, such as health monitoring or resource optimization.

Kubernetes app: perspectives

Selecting a perspective (1) determines the columns displayed in the table view. For example, the Health perspective provides health-related information, whereas the Utilization perspective focuses on resource utilization data. Each perspective can be tailored to your own requirements (2) by adding or removing columns as desired. Your personal configuration persists in your browser, and you can reset to the default layout at any time by selecting More (…) next to the list of available perspectives (1).

Davis AI health status

The health status is based on the Kubernetes-focused anomaly detectors. Health indicators aggregate the states of these anomaly detectors per resource.

A Kubernetes object (such as a cluster) is considered unhealthy if any of its associated anomaly detectors are in an unhealthy state. By selecting a specific health indicator, you can gain further insights into the underlying reasons for this status.
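The aggregation rule described above can be illustrated with a minimal sketch. This is not the app's implementation; the names `State`, `object_health`, and `summarize` are hypothetical, and treating an object with no detectors as healthy is an assumption made only for this example.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"


def object_health(detector_states):
    """An object is unhealthy if any associated detector is unhealthy.

    Assumption for this sketch: an object without detectors counts as healthy.
    """
    if any(state is State.UNHEALTHY for state in detector_states):
        return State.UNHEALTHY
    return State.HEALTHY


def summarize(objects):
    """Return (total object count, sorted names of unhealthy objects).

    `objects` maps an object name to the list of its detector states.
    """
    unhealthy = sorted(
        name
        for name, states in objects.items()
        if object_health(states) is State.UNHEALTHY
    )
    return len(objects), unhealthy


# Hypothetical node names and detector states, for illustration only.
nodes = {
    "node-a": [State.HEALTHY, State.HEALTHY],
    "node-b": [State.HEALTHY, State.UNHEALTHY],
    "node-c": [State.HEALTHY],
}
total, unhealthy_nodes = summarize(nodes)
print(f"{len(unhealthy_nodes)} of {total} nodes unhealthy: {unhealthy_nodes}")
```

A single unhealthy detector is enough to flag the whole object, which is why drilling into the health indicator is needed to find out which detector caused the status.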

Davis AI health status

In this example, you can see that, out of 32 nodes, 2 are currently considered unhealthy.

  1. Select the red numbers displayed within the health status area to drill down to the list of currently unhealthy nodes.

    Warning events

  2. Select any node to open the detail view of the problematic node, including key metrics and events that led to its current state.

    Warning events 2

    Anomaly detection for Kubernetes comes with effective defaults, but you can customize them to suit your environment's needs. Additionally, the Anomaly detectors page within the app provides a quick overview of your current configuration status.

    Anomaly detection

Warning signals

In addition to health status, the Kubernetes app surfaces active problematic conditions of workloads and nodes, as well as warning signals that occurred in the last 10 minutes. These warning signals combine both problematic conditions and warning events, providing insight into potential upcoming problems or existing misconfigurations.

Warning signals do not always represent active health issues. However, frequent Unhealthy signals, for instance, might indicate misconfigured readiness probes, inappropriate CPU limits, or an unusually high workload.

Warning signals

| Column | Content | Examples |
| Node warning signals | | DiskPressure, MemoryPressure, NodeNotReady |
| Pod warning signals | | BackOff, PodEviction, OOMKilled |
| Workload warning signals | | CPUThrottlingHigh, ContainerRestarts, PodsPending |
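To make the 10-minute window concrete, here is a small sketch of how problematic conditions and warning events could be combined and counted per object. It only illustrates the idea and does not reflect how the app computes warning signals; the `Signal` type, its fields, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Window taken from the text above: signals from the last 10 minutes.
WINDOW = timedelta(minutes=10)


@dataclass(frozen=True)
class Signal:
    target: str        # hypothetical object name, e.g. a node or pod
    kind: str          # "condition" or "event"
    reason: str        # e.g. "MemoryPressure", "BackOff", "Unhealthy"
    seen_at: datetime  # when the signal was observed


def recent_warning_signals(signals, now):
    """Keep only the signals observed within the last 10 minutes."""
    cutoff = now - WINDOW
    return [s for s in signals if s.seen_at >= cutoff]


def count_by_reason(signals):
    """Count signals per (target, reason) to reveal frequently recurring ones."""
    counts = {}
    for s in signals:
        key = (s.target, s.reason)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Illustrative data only; a fixed "now" keeps the example deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
signals = [
    Signal("pod-a", "event", "Unhealthy", now - timedelta(minutes=2)),
    Signal("pod-a", "event", "Unhealthy", now - timedelta(minutes=8)),
    Signal("pod-a", "event", "Unhealthy", now - timedelta(minutes=30)),
    Signal("node-1", "condition", "MemoryPressure", now - timedelta(minutes=1)),
]
recent = recent_warning_signals(signals, now)
counts = count_by_reason(recent)
```

In this sketch, a repeated reason on the same object (such as several Unhealthy events for one pod) stands out in the counts, which mirrors the hint that frequent signals may point to a misconfiguration rather than an active health issue.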