Ready-made dashboards

Dynatrace ready-made dashboards offer preconfigured data visualizations and filters designed for common scenarios like troubleshooting and optimization.

  • Use them right out of the box
  • Save a copy and customize your copy

Where to find ready-made dashboards

  1. In Dynatrace, go to Dashboards.

  2. Choose a way to list all ready-made dashboards.

  3. Select the ready-made dashboard you want to use.
    Try the Explore in Playground links below to see them in action.

Using read-only dashboards

When you open a document (dashboard or notebook) for which you don't have write permission, you can still edit the document during your session. After you're finished, you have two options:

  • Save your changes to a new document
  • Discard your changes

Example:

  1. Go to Dashboards, list the ready-made dashboards, and select the Getting started dashboard.

    It says Ready-made in the upper-left corner, next to the document name.

  2. Select the Pie chart tile and then select Edit.

  3. Change the visualization from Pie to Donut.

    Now you are offered two buttons: Save as new and Discard changes.

  4. Use the updated dashboard as needed. You have full edit access for this session.

  5. When you're finished, select what to do with your changes:

    • Save as new—saves your changes in a new copy of the edited dashboard.
    • Discard changes—discards your changes and returns you to the unedited read-only dashboard.
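
The session behavior described above can be modeled in a few lines. This is only a conceptual sketch: the `Dashboard` and `EditSession` classes and their methods are hypothetical names invented for illustration, not part of any Dynatrace API. It shows that edits apply to a session-local draft, that Save as new produces a separate writable copy, and that Discard changes returns the untouched read-only original.

```python
from dataclasses import dataclass


@dataclass
class Dashboard:
    """Hypothetical stand-in for a dashboard document."""
    name: str
    tiles: dict  # tile title -> visualization type
    read_only: bool = False


class EditSession:
    """Hypothetical model of editing a read-only document for one session."""

    def __init__(self, source: Dashboard) -> None:
        self.source = source
        # Edits go to a draft, so the source document is never modified.
        self.draft_tiles = dict(source.tiles)

    def set_visualization(self, tile: str, visualization: str) -> None:
        self.draft_tiles[tile] = visualization

    def save_as_new(self, name: str) -> Dashboard:
        # The copy is a new, writable document owned by the user.
        return Dashboard(name=name, tiles=dict(self.draft_tiles))

    def discard(self) -> Dashboard:
        # Drop the draft and return to the unedited read-only document.
        self.draft_tiles = dict(self.source.tiles)
        return self.source


ready_made = Dashboard("Getting started", {"Pie chart": "Pie"}, read_only=True)
session = EditSession(ready_made)
session.set_visualization("Pie chart", "Donut")
copy = session.save_as_new("Getting started (copy)")
print(copy.tiles, ready_made.tiles)
```

In this model, the ready-made dashboard still shows the Pie visualization after the session, while the saved copy shows Donut.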

AI Observability

Explore ready-made dashboards owned by AI Observability.

AI Data governance and audit trail - AI Observability

Track AI service usage trends and audit events across your environment. Identify which models are called by which users, trace request activity over time, and maintain compliance records for AI interactions.

AI Model versioning and A/B Testing - AI Observability

Compare two AI model versions or providers side by side. Track response time, token cost, and request volume per model to decide which variant performs better before a full rollout.

Amazon Bedrock - AI Observability

Monitor Amazon Bedrock model health and token usage. Spot the most expensive prompts, detect PII leaks and denied topics, and trace slow or failing model invocations end to end.

The Amazon Bedrock - AI Observability dashboard contains the following sections and tiles:

  • Token Usage Forecast · Dynatrace Intelligence forecast

Amazon Bedrock Service Health & Performance

  • Open Problems · single value
  • Top 10 expensive prompts · table
  • Denied Topics · single value
  • PII Leaks · single value
  • Toxicity · single value
  • Guardrail Executions · single value
  • Filtered Content · single value
  • Prevented PII Leaks · single value
  • Blocked Toxic Prompts · single value
  • Overall Guardrail Activation · single value
  • Cost · single value
  • Number of Total Requests · single value
  • Service Health · pie chart
  • P99 Request Duration · single value
  • AVG Request Duration · single value
  • Top 10 slowest prompts · table
  • Grounding · single value
  • Relevance · single value
  • $ Saved · single value
  • AVG Time Saved · single value
  • Cache Hit · single value
  • AVG Cache Read Tokens · single value
  • AVG Cache Write Tokens · single value
  • Total Token Consumption · single value
  • Completion Token · single value
  • Prompt Token · single value

Identify which model is costing more based on the incoming amount of requests

  • Top 10 expensive prompts · table

Azure AI Foundry - AI Observability

Monitor Azure AI Foundry model health and performance. Track request volume, response time, cost per model, and P99 latency to identify expensive or unreliable AI calls.

The Azure AI Foundry - AI Observability dashboard contains the following sections and tiles:

  • Response Time per Model · line chart
  • Cost · single value
  • Number of Total Requests · single value
  • Service Health · pie chart
  • P99 Request Duration · single value
  • AVG Request Duration · single value
  • Open Problems · single value

Azure AI Foundry Service Health & Performance

  • Token Usage Forecast · Dynatrace Intelligence forecast
  • Token Consumption per Model · line chart
  • Top 10 expensive prompts · table
  • Top 10 slowest prompts · table
  • $ Saved · single value
  • AVG Time Saved · single value
  • Cache Hit · single value
  • AVG Cache Read Tokens · single value

AI Observability

  • Total Token Consumption · single value
  • Completion Token · single value
  • Prompt Token · single value

Identify which model is costing more based on the incoming amount of requests

  • Top 10 expensive prompts · table

Google Gemini and Vertex AI Studio - AI Observability

Monitor Google Vertex AI and Gemini model performance end to end. Track request counts, response time, cost per model, and P99 latency across your AI application.

The Google Gemini and Vertex AI Studio - AI Observability dashboard contains the following sections and tiles:

  • Response Time per Model · line chart
  • Cost · single value
  • Number of Total Requests · single value
  • Service Health · pie chart
  • P99 Request Duration · single value
  • AVG Request Duration · single value
  • Open Problems · single value

VertexAI and Gemini Service Health & Performance

  • Token Usage Forecast · Dynatrace Intelligence forecast
  • Token Consumption per Model · line chart
  • Top 10 expensive prompts · table
  • Top 10 slowest prompts · table

AI Observability

  • Total Token Consumption · single value
  • Completion Token · single value
  • Prompt Token · single value

Identify which model is costing more based on the incoming amount of requests

  • Top 10 expensive prompts · table

Kong AI - AI Observability

Monitor AI applications built on Kong AI Gateway. Track request counts by model, token consumption forecasts, P99 latency, and service health.

The Kong AI - AI Observability dashboard contains the following sections and tiles:

  • AI requests total per AI model · bar chart
  • Forecast Token Consumption · Dynatrace Intelligence forecast
  • Service Health · pie chart
  • Number of Total Requests · single value
  • P99 Request Duration · single value
  • AVG Request Duration · single value
  • Token Usage Forecast · Dynatrace Intelligence forecast
  • Token Consumption · single value
  • Total Token Consumption · single value
  • Completion Token · single value
  • Prompt Token · single value
  • AI latency per service/route · line chart
  • Forecast Token Consumption per AI Model · Dynatrace Intelligence forecast
  • Token Consumption per AI Model · pie chart

AI Observability

  • Open Problems · single value

NVIDIA - AI Observability

Monitor AI applications built with NVIDIA NIM. Track request counts, average and P99 response duration, token cost estimates, and open problems.

The NVIDIA - AI Observability dashboard contains the following sections and tiles:

  • DQL Cost Calculation (1token = 1$) · single value
  • AVG Request Duration · single value
  • P99 Request Duration · single value
  • Number of Total Requests · single value
  • Open Problems · single value
  • Time To First Token · single value
  • Throughput (tokens/second) · single value
  • KV Cache Utilization · single value
  • Number of Running Requests · single value
  • Token Usage Forecast · Dynatrace Intelligence forecast
  • Token Consumption per Model · line chart
  • Response Time per Model · line chart
  • Top 10 expensive prompts · table
  • Top 10 slowest prompts · table

AI Observability

  • Service Health · pie chart

OpenAI - AI Observability

Monitor OpenAI and Azure OpenAI service health, request counts, response time, and cost. Identify which models are most expensive and trace the slowest or costliest prompts.

The OpenAI - AI Observability dashboard contains the following sections and tiles:

AI Observability

  • Cost · single value
  • Number of Total Requests · single value
  • Open Problems · single value
  • Service Health · pie chart
  • AVG Request Duration · single value
  • P99 Request Duration · single value
  • Token Usage Forecast · Dynatrace Intelligence forecast
  • $ Saved · single value
  • AVG Time Saved · single value
  • Cache Hit · single value
  • AVG Cache Read Tokens · single value
  • Response Time per Model · line chart
  • Token Consumption per Model · line chart
  • Total Token Consumption · single value
  • Completion Token · single value
  • Prompt Token · single value

Identify which model is costing more based on the incoming amount of requests

  • Top 10 expensive prompts · table

Find the trace id of the most expensive prompts to investigate more deeply the costs

  • Top 10 expensive prompts · table
  • Top 10 slowest prompts · table

Amazon ECR

Explore ready-made dashboards owned by Amazon ECR.

Container Scan Events Coverage

Identify coverage gaps in container image scanning. View scan coverage by security product and see the latest 50 scan events across registries, repositories, and images.

The Container Scan Events Coverage dashboard contains the following sections and tiles:

Coverage report for container image scan events

  • Container image coverage by product · categorical chart
  • Registries · single value
  • Container repositories · single value
  • Container images · single value
  • Scanning products · single value

Coverage overview

  • Scan events over time by product · bar chart
  • Total scan events · single value
  • Repository coverage based on products and number of scans · table

Container Vulnerability Findings

Visualize container vulnerability findings by risk level. Break down critical and high findings by registry and repository to prioritize remediation across your container environment.

The Container Vulnerability Findings dashboard contains the following sections and tiles:

Container vulnerability findings

  • Number of critical findings by registry · donut chart
  • Critical risk · single value
  • High risk · single value
  • Number of critical findings by repository · donut chart
  • Number of vulnerabilities by risk · donut chart
  • Affected registries · single value
  • Container repositories · single value
  • Container images · single value
  • Vulnerable components · single value

Vulnerabilities by risk

  • Medium risk · single value
  • Vulnerability findings over time by provider · bar chart

Top 10 affected registries by number of critical findings

  • Total ingested findings · single value

Runtime contextualization of container findings for alert reduction

Reduce container alert noise by correlating vulnerability findings with runtime context. View which findings are present in running containers versus only repositories to prioritize response.

The Runtime contextualization of container findings for alert reduction dashboard contains the following sections and tiles:

Runtime contextualization of container findings for alert reduction

  • Critical risk · single value
  • High risk · single value
  • Number of vulnerabilities by risk · donut chart
  • Medium risk · single value
  • Percentage of vulnerabilities by funnel stage · categorical chart

Top 10 vulnerabilities

  • Critical risk · single value
  • High risk · single value
  • Medium risk · single value
  • Number of vulnerabilities by risk · donut chart
  • Critical risk · single value
  • Medium risk · single value
  • High risk · single value

Vulnerabilities in running containers

  • Number of vulnerabilities by risk · donut chart

Vulnerabilities in production containers

  • Container images in registries · single value
  • Container images in runtime · single value
  • Container images in production · single value

Amazon GuardDuty

Explore ready-made dashboards owned by Amazon GuardDuty.

Security findings

Overview of Amazon GuardDuty security findings by risk level. View affected objects and the latest 50 findings to focus remediation on the highest-risk issues.

The Security findings dashboard contains the following sections and tiles:

Security findings

  • Critical · single value
  • High · single value
  • Number of unique findings by risk · donut chart
  • Critical · single value

Findings by risk

  • Medium · single value
  • Findings over time by provider · bar chart
  • High · single value

Latest 50 security findings

  • Findings by type · categorical chart
  • Top 10 object types by risk · categorical chart
  • Top 10 products by risk · categorical chart
  • Medium · single value
  • Number of objects by risk · donut chart
  • Top 10 findings by risk and number of affected objects · table
  • Top 10 affected objects by number of findings · table

Affected runtime entities

  • Top 10 vulnerable host entities by finding criticality · table
  • Number of host entities by risk · donut chart
  • Top 10 vulnerable container workloads by finding criticality · table
  • Number of container workloads by risk · donut chart
  • Total ingested findings · single value
  • Number of cloud entities by risk · donut chart
  • Top 10 vulnerable cloud entities by finding criticality · table

Security product coverage

View security product coverage and scan event ingestion from Amazon GuardDuty. Track reporting providers, scan event counts over time, and runtime coverage of hosts and container workloads.

The Security product coverage dashboard contains the following sections and tiles:

Coverage overview

  • Security events per top 10 products · categorical chart
  • Ingested finding events by provider over time · bar chart
  • Scan events · single value
  • Reporting providers · single value
  • Ingested scan events over time · bar chart
  • Finding events · single value
  • Security events by object coverage per product · table
  • Security events by findings number per object type · table

Runtime entity coverage: Hosts

  • Security events per top 10 object types · categorical chart

Runtime entity coverage: Container workloads

  • Container workload coverage · donut chart
  • Host coverage by product · table
  • Host coverage · donut chart
  • Container workload coverage by product · table
  • Last 10 covered hosts · table
  • Last 10 covered container workloads · table

Runtime entity coverage: Cloud entities

  • Last 10 covered cloud entities · table
  • Cloud entity coverage by product · table
  • Cloud entity coverage · donut chart

Anomaly Detection app

Explore ready-made dashboards owned by Anomaly Detection app.

Alert configuration health status

Track the health of custom alert detectors on your tenant. Identify failing detectors, the most common error messages, and which detectors trigger the most alerts.

The Alert configuration health status dashboard contains the following sections and tiles:

  • Overall custom alerts last 24h · honeycomb chart
  • Last 24h · single value
  • Last 24h · single value
  • Last 24h · single value

Custom alert health status

  • Most common error messages · categorical chart
  • Error messages breakdown · bar chart

Breakdown by billed bytes

  • Top data usage per Custom Alerts Last 24h · table
  • Top data usage per Custom Alerts Last 24h · line chart
  • Summarized messages by config_id · table

Breakdown by error messages

  • Summarized alerts by config_id over the last 24h · table
  • Summarized alerts by config_id over the last 24h · honeycomb chart

Health alert health status

  • Last execution of alert configs · honeycomb chart

    Overview of the last execution result for each selected alert configuration.

  • Last execution - Success · single value

    Counts the selected alert configurations whose last execution succeeded.

  • Last execution - Warning · single value

    Counts the selected alert configurations whose last execution ended with a warning.

  • Last execution - Failed · single value

    Counts the selected alert configurations whose last execution failed.

  • Warning or Failure Events · bar chart

    Shows all warning or failure events for each selected alert configuration.

  • FAILED Message · table

  • WARNING Message · table

Clouds app

Explore ready-made dashboards owned by Clouds app.

AWS API

Track error rates, request volumes, and latency across AWS HTTP and REST API Gateways. Drill into individual gateway instances to pinpoint the source of elevated 4xx or 5xx responses.

The AWS API dashboard contains the following sections and tiles:

AWS API Gateway

  • 5xx errors · single value

  • API requests · single value

  • 4xx errors · single value

  • Integration latency · line chart

    The time between when API Gateway relays a request to the backend and when it receives a response from the backend.

  • Latency · line chart

    The time between when API Gateway receives a request from a client and when it returns a response to the client. The latency includes the integration latency and other API Gateway overhead.

  • HTTP APIs error rate · single value

  • Errors by API/stage · categorical chart

  • Cache hits · single value

  • Cache misses · single value

HTTP APIs

  • REST APIs error rate · single value

  • Data processed · line chart

  • Errors by API/stage · categorical chart

  • API requests · single value

  • 5xx errors · single value

  • 4xx errors · single value

  • Integration latency · line chart

    The time between when API Gateway relays a request to the backend and when it receives a response from the backend.

  • Latency · line chart

    The time between when API Gateway receives a request from a client and when it returns a response to the client. The latency includes the integration latency and other API Gateway overhead.
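
Because latency includes integration latency plus API Gateway overhead, subtracting the two tile values gives a rough estimate of the time spent in API Gateway itself. The following is a minimal sketch of that arithmetic. The function name and the sample numbers are made up for illustration and are not taken from the dashboard.

```python
def gateway_overhead_ms(latency_ms: float, integration_latency_ms: float) -> float:
    """Return the part of total latency spent outside the backend call."""
    if latency_ms < 0 or integration_latency_ms < 0:
        raise ValueError("latencies must be non-negative")
    if integration_latency_ms > latency_ms:
        # Latency is defined as including integration latency.
        raise ValueError("integration latency cannot exceed total latency")
    return latency_ms - integration_latency_ms


# Made-up sample: 120 ms end to end, 95 ms of it waiting on the backend.
overhead = gateway_overhead_ms(latency_ms=120.0, integration_latency_ms=95.0)
print(f"API Gateway overhead: {overhead:.1f} ms")
```

A large overhead relative to integration latency points at the gateway layer rather than the backend, which is one way to read the two line charts together.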

AWS Bedrock

View AWS Bedrock invocation counts, throttle rates, guardrail events, and average response time. Identify which models are driving the most traffic and where errors or throttles are occurring.

The AWS Bedrock dashboard contains the following sections and tiles:

Amazon Bedrock

  • Invocations · line chart

  • Total invocations · single value

  • Average Total Time · single value

    The time it took for the server to process the request.

  • Invocation Throttles · line chart

    Agent throttles

  • Client vs Server errors · line chart

  • Input token vs Output token count · bar chart

Guardrail

  • Top $Limit models per latency · line chart

  • Agents · single value

  • Agent Alias · single value

  • Guardrails · single value

  • Total Time · line chart

    The time it took for the server to process the request.

  • Invocations · line chart

    Successful agent invocations

  • Invocations Intervened · line chart

    Successful agent invocations

  • Findings count · line chart

    Successful agent invocations

Latency

  • Agent Alias per Agent · categorical chart

  • Top $Limit models per invocations · line chart

  • Total errors · single value

    The time it took for the server to process the request.

AWS DynamoDB

View the status of DynamoDB tables, including capacity unit usage, throttle rates, and latency. Spot user and system errors and track item return rates to detect unexpected query patterns.

The AWS DynamoDB dashboard contains the following sections and tiles:

AWS DynamoDB

  • User errors · single value

    Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 400 status code during the specified time period.

  • System errors · single value

    The requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 500 status code during the specified time period.

  • Tables · single value

  • Successful request latency · line chart

  • Returned items · single value

    The number of items returned by Query, Scan or ExecuteStatement (select) operations during the specified time period.

  • Conditional check failed requests · single value

    The number of failed attempts to perform conditional writes.

  • Throttled requests · single value

    Requests to DynamoDB that exceed the provisioned throughput limits on a resource (such as a table or an index).

  • TTL deleted items · single value

  • Consumed read capacity units · line chart

  • Consumed write capacity units · line chart

  • Read throttle events · line chart

    Requests to DynamoDB that exceed the provisioned read capacity units for a table over the specified time period.

  • Provisioned read capacity units · line chart

  • Provisioned write capacity units · line chart

  • Write throttle events · line chart

    Requests to DynamoDB that exceed the provisioned write capacity units for a table over the specified time period.

Throttles and latency

  • Total consumed read capacity units · single value
  • Total consumed write capacity units · single value
  • Total provisioned read capacity units · single value
  • Total provisioned write capacity units · single value

AWS EC2

View CPU utilization, network traffic, and disk activity for EC2 instances. See the breakdown by instance type, region, and account, plus Auto Scaling group status.

The AWS EC2 dashboard contains the following sections and tiles:

  • CPU utilization · line chart
  • EC2 instances per type · categorical chart

AWS EC2

  • EC2 instances per region · categorical chart

  • Active EC2 instances · single value

  • Total network input · single value

    The total number of bytes received by the instance on all network interfaces.

  • Total network output · single value

    The total number of bytes sent out by the instance on all network interfaces.

  • Network input · line chart

  • Network output · line chart

  • Read bytes · line chart

  • Write bytes · line chart

  • Read operations · line chart

  • Write operations · line chart

Disk activity

  • CPU utilization for instances with highest usage · categorical chart

    The most recent percentage of physical CPU time that Amazon EC2 uses to run the EC2 instance, which includes time spent to run both the user code and the Amazon EC2 code.

  • Volumes idle time · bar chart

    The total number of seconds in a specified period of time when no read or write operations were submitted. High idle time indicates underutilized resources, such as an EBS volume that is attached to an EC2 instance but not actively used.

  • Volumes queue length · bar chart

    The number of read and write operation requests waiting to be completed in a specified period of time.

  • Burst balance percentage · line chart

    Percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket.

AWS Auto Scaling groups

  • Auto Scaling groups by group max size · categorical chart

  • Auto Scaling groups by desired capacity · categorical chart

    The number of instances that the Auto Scaling groups attempt to maintain.

  • In-service instances · line chart

    The number of instances that are running as part of the Auto Scaling group.

  • Pending instances · bar chart

    The number of instances that are pending. A pending instance is not yet in service.

  • Standby instances · bar chart

    The number of instances that are in a Standby state. Instances in this state are still running but are not actively in service.

  • Terminating instances · line chart

    The number of instances that are in the process of terminating. This metric does not include instances that are in service, pending, or returning to a warm pool after Auto Scaling group scale in.

Network

  • Status Check failures · line chart
  • CPU Credits Balance · line chart

AWS ECS

Track CPU and memory reservation and utilization for ECS tasks across clusters. Monitor network I/O to spot containers under resource pressure.

The AWS ECS dashboard contains the following sections and tiles:

AWS Elastic Container Service

  • Average CPU units utilized · single value
  • Average CPU units reserved · single value
  • Average memory utilized · single value
  • Average memory reserved · single value
  • CPU units utilized · line chart
  • CPU units reserved · line chart
  • Memory utilized · line chart
  • Memory reserved · line chart
  • Network transmitted · line chart
  • Network received · line chart
  • Average network received · single value
  • Average network transmitted · single value

Storage

  • Storage write bytes · line chart
  • Storage read bytes · line chart
  • Average storage read bytes · single value
  • Average storage write bytes · single value

Container Insights

  • Services · single value

    The number of services in the clusters in a given period.

  • Container instances · single value

  • Deployments · single value

  • Tasks sets · single value

  • Tasks · single value

  • Pending tasks · single value

  • Desired tasks · single value

  • Running tasks · single value

  • Memory utilization by cluster · line chart

  • CPU utilization by cluster · line chart

  • Average CPU utilization · single value

  • Average memory utilization · single value

    The total percentage of memory being used by containers in the resource in a given period.

  • Ephemeral storage bytes reserved · line chart

  • Ephemeral storage bytes utilized · line chart

  • Average ephemeral storage bytes utilized · single value

  • Average ephemeral storage bytes reserved · single value

AWS Edge Networking

Monitor Route 53 health check status and CloudFront distribution performance. Track connection time, time to first byte, and health check outcomes by location.

The AWS Edge Networking dashboard contains the following sections and tiles:

AWS Edge Networking

  • Route 53 health checks · single value

  • Connection time · line chart

    The average time, in milliseconds, that it took Route 53 health checkers to establish a TCP connection with the endpoint.

  • Health checks status split · donut chart

  • Time to first byte · line chart

    The average time, in milliseconds, that it took Route 53 health checkers to receive the first byte of the response to an HTTP or HTTPS request.

  • Route 53 hosted zones · single value

Route 53 Health Checks

  • DNS queries per hosted zone · donut chart
  • CloudFront distributions · single value

CloudFront distributions

  • Bytes uploaded · line chart

    The total number of bytes that viewers uploaded to CloudFront, using OPTIONS, POST and PUT requests.

  • CloudFront distributions error rate · single value

  • Bytes downloaded · line chart

    The total number of bytes downloaded by viewers for GET and HEAD requests.

  • 4xx error rate · bar chart

    The percentage of all viewer requests for which the response's HTTP status code is 4xx.

  • 5xx error rate · bar chart

    The percentage of all viewer requests for which the response's HTTP status code is 5xx.

  • Total bytes downloaded · single value

  • Total bytes uploaded · single value

  • Total average connection time · single value

  • Total average time to first byte · single value

AWS EFS

View throughput, storage size, and client connection counts for EFS file systems. Identify throughput bottlenecks and track permitted throughput utilization over time.

The AWS EFS dashboard contains the following sections and tiles:

AWS Elastic File System

  • File systems · single value

  • Mounted targets · single value

  • File systems by client connections · categorical chart

  • File systems by storage size · categorical chart

  • Percentage of permitted throughput utilization · line chart

    Ratio between metered IO bytes and total permitted throughput, in percentage. If you are reaching maximum capacity, then you are consuming the entire amount of throughput allocated to your file system. In this situation, you might consider changing the file system's throughput mode to get higher throughput.

  • Burst credit balance · line chart

    The number of burst credits that a file system has. Burst credits allow a file system to burst to throughput levels above a file system’s baseline level for periods of time.

  • Total IO bytes · line chart

    The actual number of bytes for each file system operation processed by Amazon EFS, without any read discounts.

  • Total average percentage of permitted throughput utilization · single value

Usage

  • Total IO processed bytes · single value
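
The permitted throughput utilization described above is a ratio expressed as a percentage. The sketch below works through that arithmetic under an assumption that is not stated in the tile description: metered IO bytes are summed over a period, and permitted throughput is a bytes-per-second rate, so the rate is multiplied by the period length before dividing. The function name, parameters, and sample numbers are illustrative only.

```python
def permitted_throughput_utilization(
    metered_io_bytes: int,
    permitted_throughput_bytes_per_s: float,
    period_s: float,
) -> float:
    """Return metered I/O as a percentage of the bytes permitted in the period."""
    # Assumption: convert the permitted rate into a byte budget for the period.
    permitted_bytes = permitted_throughput_bytes_per_s * period_s
    if permitted_bytes <= 0:
        raise ValueError("permitted throughput and period must be positive")
    return 100.0 * metered_io_bytes / permitted_bytes


# Made-up sample: 3 GB metered in 60 s against a permitted 100 MB/s.
utilization = permitted_throughput_utilization(
    metered_io_bytes=3_000_000_000,
    permitted_throughput_bytes_per_s=100_000_000,
    period_s=60,
)
print(f"Permitted throughput utilization: {utilization:.1f}%")
```

A value approaching 100% corresponds to the situation the tile description warns about, where the file system consumes its entire throughput allocation.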

AWS EKS

Monitor EKS cluster health, including pod and node resource usage, scheduler activity, and API server performance. Identify pending pods, webhook latency issues, and storage configuration.

The AWS EKS dashboard contains the following sections and tiles:

Amazon Elastic Kubernetes Service

  • CPU Usage total (amount) · line chart
  • Scheduler attempts · line chart
  • Scheduler pending pods · line chart
  • Webhook admission duration seconds · table
  • Storage size · table
  • CPU utilization · line chart
  • GPU usage total · line chart
  • Filesystem utilization · line chart
  • Memory utilization · line chart
  • Network total bytes · line chart
  • Running containers · line chart
  • CPU utilization · line chart
  • GPU usage total · line chart
  • Memory utilization · line chart
  • Network rx bytes · line chart
  • Container restarts · line chart
  • Admission webhook request total · table
  • APIServer request · line chart

API server

  • Cluster nodes · line chart
  • Running pods · line chart

Container Insights

  • All running pods · single value

AWS ElastiCache

View the status and resource usage of ElastiCache clusters for both Redis/Valkey and Memcached. Track cache hits and misses, current connections, and available cluster counts.

The AWS ElastiCache dashboard contains the following sections and tiles:

AWS Elasticache

  • Serverless caches by engine Dot categorical chart

  • Cache clusters by engine Dot categorical chart

  • Current connections Dot line chart

  • Hits and misses by cache Dot categorical chart

    Number of successful and unsuccessful key lookups in the cache.

  • Available cache clusters Dot donut chart

  • Available serverless caches Dot donut chart

Redis/Valkey

  • Evictions by cache Dot categorical chart

    Number of keys that have been evicted due to max memory limit.

  • Hits and misses by cache Dot categorical chart

    Number of successful and unsuccessful key lookups in the cache.

  • Successful read request latency Dot line chart

  • Successful write request latency Dot line chart

  • Network bytes in (host) Dot line chart

  • Network bytes out (host) Dot line chart

  • CPU utilization (host) Dot line chart

    The percentage of CPU utilization for the entire host.

  • Freeable memory (host) Dot line chart

    The amount of free memory available on the host.

  • Engine CPU utilization Dot line chart

Host-level metrics

  • Bytes used Dot line chart

  • Total network bytes in (host) Dot single value

  • Total network bytes out (host) Dot single value

  • Average successful read request latency Dot single value

  • Average successful write request latency Dot single value

  • Network bytes out Dot line chart

  • Network bytes in Dot line chart

  • Total network bytes in Dot single value

  • Total network bytes out Dot single value

  • Unused memory Dot line chart

  • Engine memory usage Dot line chart

  • Bytes used Dot line chart

  • Cache hit rate Dot line chart

    Efficiency of the cache instance. A cache hit ratio lower than about 0.8 means that a significant number of keys are evicted, expired, or don't exist.

  • Cache hit rate Dot line chart

    Efficiency of the cache instance. A cache hit ratio lower than about 0.8 means that a significant number of keys are evicted, expired, or don't exist.
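
The 0.8 guideline in the Cache hit rate description can be checked with simple arithmetic. The following is a minimal Python sketch, assuming the hit rate is computed as hits divided by the sum of hits and misses (a common definition; the dashboard's actual query is not shown in this reference). The function names and sample counts are hypothetical.

```python
from typing import Optional

# Guideline from the tile description: ratios below about 0.8 deserve a look.
LOW_HIT_RATE_THRESHOLD = 0.8


def cache_hit_rate(hits: int, misses: int) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = hits + misses
    if total == 0:
        return None  # no lookups in the period, so the ratio is undefined
    return hits / total


def is_low_hit_rate(hits: int, misses: int) -> bool:
    """True when the hit rate is defined and below the 0.8 guideline."""
    rate = cache_hit_rate(hits, misses)
    return rate is not None and rate < LOW_HIT_RATE_THRESHOLD


# Hypothetical sample: 700 hits and 300 misses give a rate of 0.7.
print(cache_hit_rate(700, 300))
print(is_low_hit_rate(700, 300))
```

Returning None for a period without lookups keeps an idle cache from being reported as a low-efficiency cache.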

AWS ELB

Monitor ALB, Classic, and NLB load balancers. Track 4xx/5xx error rates, target health, and response times to identify unhealthy backends or load balancers under excessive error load.

The AWS ELB dashboard contains the following sections and tiles:

AWS Application Load Balancer

  • Target 4xx responses Dot single value

  • Target 5xx responses Dot single value

  • ALB target error rate Dot single value

    Percentage of errors generated by the targets in a given period.

  • Target error and successful requests by load balancer Dot categorical chart

  • ELB 4xx responses Dot single value

  • ALB error rate Dot single value

    Percentage of errors that originate from the load balancer in a given period.

  • ELB 5xx responses Dot single value

  • ELB error and successful requests by load balancer Dot categorical chart

  • Target response time Dot line chart

    The time elapsed, in seconds, from when the request leaves the load balancer until the target starts to send the response headers, shown over time in a given period.

  • Requests Dot line chart

    The number of requests processed over IPv4 and IPv6, shown over time in a given period. This metric is only incremented for requests where the load balancer node was able to choose a target. Requests that are rejected before a target is chosen are not reflected in this metric.

  • Healthy and unhealthy hosts by load balancer Dot categorical chart

  • ALB unhealthy rate Dot single value

    Percentage of targets that are considered unhealthy in a given period.

  • Healthy hosts Dot single value

  • Unhealthy hosts Dot single value

  • Active connections Dot single value

  • New connections Dot single value

  • Total processed bytes Dot single value

  • Consumed capacity units Dot single value

Errors

  • CLB backend error rate Dot single value

    Percentage of HTTP response codes generated by registered instances in a given period.

  • Backend 4xx responses Dot single value

  • Backend 5xx responses Dot single value

  • Backend error and successful requests by load balancer Dot categorical chart

  • Requests Dot line chart

    The number of requests completed or connections made during the specified interval, shown over time in a given period.

  • Backend connection errors Dot line chart

    The number of connections that were not successfully established between the load balancer and the registered instances, shown over time in a given period.

  • Backend connection errors by load balancer Dot categorical chart

  • ELB 4xx responses Dot single value

  • ELB 5xx responses Dot single value

  • Latency Dot line chart

    The total time elapsed, in seconds, from when the load balancer sent the request to a registered instance until the instance started to send the response headers, shown over time in a given period.

  • CLB unhealthy rate Dot single value

    Percentage of unhealthy instances registered with your load balancer in a given period.

  • Healthy hosts Dot single value

  • Unhealthy hosts Dot single value

  • Healthy and unhealthy hosts by load balancer Dot categorical chart

    Distribution of healthy and unhealthy instances registered with your load balancer in a given period.

AWS Network Load Balancer

  • Total processed bytes Dot single value

    The total number of bytes processed by the load balancer, including TCP/IP headers in a given period. This count includes traffic to and from targets, minus health check traffic.

  • Consumed capacity units Dot single value

    The number of load balancer capacity units (LCU) used by your load balancer in a given period.

  • Active flows Dot single value

    The total number of concurrent flows (or connections) from clients to targets in a given period.

  • New flows Dot single value

    The total number of new flows (or connections) established from clients to targets in a given period.

  • NLB unhealthy rate Dot single value

    Percentage of targets that are considered unhealthy in a given period.

  • Healthy hosts Dot single value

  • Unhealthy hosts Dot single value

  • Healthy and unhealthy hosts by load balancer Dot categorical chart

  • Total TCP target resets Dot single value

    The total number of reset (RST) packets sent from a target to a client in a given period. These resets are generated by the target and forwarded by the load balancer.

  • Total TCP ELB resets Dot single value

    The total number of reset (RST) packets generated by the load balancer in a given period.

  • Total TCP client resets Dot single value

    The total number of reset (RST) packets sent from a client to a target in a given period. These resets are generated by the client and forwarded by the load balancer.

AWS Elastic Load Balancing

  • Elastic load balancers Dot single value

AWS EventBridge

Track EventBridge event flow and reliability. Monitor matched events, invocation attempts, and ingestion-to-invocation latency to identify delivery delays or failures.

The AWS EventBridge dashboard contains the following sections and tiles:

AWS EventBridge

  • Ingestion to invocation start latency Dot line chart

    The time to process events, measured from when an event is ingested by EventBridge to the first invocation of a target.

  • Invocation attempts Dot line chart

    Number of times EventBridge attempted invoking a target.

  • Active EventBridge instances Dot single value

    Number of all active event buses in the environment.

  • Ingestion to invocation success latency Dot line chart

    The time taken from event ingestion to successful target delivery, using the invocation end time as cutoff.

  • Matched events Dot donut chart

    The number of events that matched with any rule.

  • Triggered rules Dot donut chart

    The number of rules that have run and matched with any event.

  • Throttled rules Dot donut chart

    The number of times rule execution was throttled.

  • Ingestion to invocation complete latency Dot line chart

    The time taken from event ingestion to completion of the first invocation attempt.

  • Invocation attempts Dot categorical chart

    Number of times each target EventBus was successfully invoked.

  • Successful invocation attempts Dot single value

    Percentage of times a target was successfully invoked.

AWS Foundation Networking

Monitor AWS NAT Gateway connection status and PrivateLink endpoint performance. Track active connections, packet flows, and port allocation errors to diagnose network path issues.

The AWS Foundation Networking dashboard contains the following sections and tiles:

AWS NAT Gateway

  • Active connections Dot single value
  • Connection attempts Dot line chart
  • Established connections Dot line chart
  • Port allocation errors Dot single value
  • Idle timeouts Dot single value
  • Packet drops Dot single value

Bytes received/sent by the Gateway

  • Total bytes received from destination Dot single value

  • Total bytes sent to destination Dot single value

  • Bytes received from destination Dot line chart

    The number of bytes received by the NAT gateway from the destination.

  • Bytes sent to destination Dot line chart

    The number of bytes sent out through the NAT gateway to the destination.

  • Total bytes received from source Dot single value

  • Total bytes sent to source Dot single value

  • Bytes received from source Dot line chart

    The number of bytes received by the NAT gateway from clients in your VPC.

  • Bytes sent to source Dot line chart

    The number of bytes sent through the NAT gateway to the clients in your VPC.

Packets received/sent by the Gateway

  • Total packets received from destination Dot single value
  • Total packets sent to destination Dot single value
  • Packets received from destination Dot line chart
  • Packets sent to destination Dot line chart
  • Total packets received from source Dot single value
  • Total packets sent to source Dot single value
  • Packets received from source Dot line chart
  • Packets sent to source Dot line chart

AWS PrivateLink

  • Bytes processed Dot line chart

    The number of bytes exchanged between endpoint services and endpoints, in both directions.

  • Reset packets sent Dot line chart

AWS Foundation Networking

  • Percentage of established connections through NAT gateways Dot single value

    The percentage of established connections made through the NAT gateway in a given period.

  • NAT gateways Dot single value

    Number of NAT Gateways in the environment.

Consumers - Interface or Gateway LB endpoints

  • Bytes processed Dot line chart

    The number of bytes exchanged between endpoints and endpoint services, aggregated in both directions. This is the number of bytes billed to the owner of the endpoint.

  • Packets dropped Dot line chart

  • PrivateLink connections Dot single value

    The number of endpoints connected to all endpoint services.

  • Active connections by endpoint service ID Dot categorical chart

  • Active connections by service name Dot bar chart

  • Reset packets received Dot line chart

AWS Health Events

View account-specific and public AWS health events by region, account, and service. Filter between event categories to quickly assess the impact of AWS service disruptions on your environment.

The AWS Health Events dashboard contains the following sections and tiles:

  • Total events Dot single value
  • Account-specific health events by region Dot pie chart
  • Health events by account Dot pie chart
  • Account-specific health events Dot table
  • Account-specific health events by service Dot pie chart

AWS Health events

  • Events by status Dot categorical chart
  • Events by category Dot categorical chart

Public events

  • Events by category Dot categorical chart
  • Events by status Dot categorical chart
  • Total events Dot single value

AWS Lambda

Monitor Lambda function invocations, error rates, duration, and concurrency. View per-function error counts to identify failing functions and track execution trends over time.

The AWS Lambda dashboard contains the following sections and tiles:

Usage and performance

  • Concurrent executions Dot line chart

    Number of function instances that are actively processing events at a given time.

  • Duration Dot line chart

    The amount of time that function code spends processing an event, not including cold start time.

AWS Lambda

  • Errors Dot line chart

    Time series of invocations that result in a function error.

  • Function invocations and error count Dot categorical chart

    The invocation count compared with the number of invocations that resulted in an error.

  • Errors % Dot single value

    Percentage of invocations that resulted in errors, across all Lambda functions that match the current filters.

  • Throttles Dot single value

    Percentage of Lambda function executions that were throttled to prevent overwhelming the function.

  • Errors Dot single value

    Count of invocations that resulted in errors, across all Lambda functions that match the current filters.

  • Invocations Dot single value

    Total number of invocations across all Lambda functions that match the current filters.

  • Async events dropped Dot table

    The number of asynchronous events that were dropped without being successfully processed.

  • Throttles Dot line chart

    The number of invocation requests that were throttled because the concurrency limit was exceeded.

  • Post runtime extensions duration Dot table

    The time spent by Lambda Extensions to complete final tasks, after your function's code has finished executing.

AWS Managed Streaming for Apache Kafka

Track throughput, replication health, and connection status for MSK Kafka clusters. Monitor bytes in/out per second and messages per second to detect bottlenecks or replication lag.

The AWS Managed Streaming for Apache Kafka dashboard contains the following sections and tiles:

Throughput

  • Bytes in per second Dot line chart
  • Bytes out per second Dot line chart
  • Messages in per second Dot line chart
  • Average Bytes In Dot single value
  • Average Bytes Out Dot single value
  • Client connections Dot single value
  • Clusters Dot single value

Health

  • Active controller count Dot line chart
  • Partitions per broker Dot line chart

Replication

  • Replication bytes in per second Dot line chart
  • Replication bytes out per second Dot line chart
  • Offline partitions count Dot line chart
  • Max offset lag Dot single value
  • Estimated max time lag Dot single value
  • Sum offset lag Dot line chart
  • Network Rx errors Dot line chart
  • Network Tx errors Dot line chart
  • CPU system Dot line chart
  • CPU user Dot line chart
  • Total connections Dot single value

Performance

  • Under replicated partitions Dot line chart

AWS Overview

High-level view of EC2 instances alongside CloudWatch logs and service problems. See instance distribution by type, availability zone, and account, plus network I/O trends.

The AWS Overview dashboard contains the following sections and tiles:

  • Top 10 EC2 instance types Dot categorical chart
  • Top 10 Availability zones running EC2 instances Dot categorical chart
  • Network: EC2 instances by Network in (bytes) Dot line chart
  • Network: EC2 instances by Network out (bytes) Dot line chart

Other compute resources

  • Top 10 AWS accounts with EC2 instances Dot categorical chart
  • EC2 instances Dot single value
  • EKS Clusters Dot single value
  • Auto scaling groups Dot single value
  • Top 10 accounts with EKS clusters Dot categorical chart
  • Top 10 AWS accounts with Autoscaling groups Dot categorical chart
  • 5xx errors Dot line chart
  • Desired Capacity Dot line chart
  • CloudWatch error logs by service Dot bar chart
  • EC2 CPU utilization Dot honeycomb chart
  • Active problems Dot single value
  • In Service Instances Dot line chart
  • Latest logs Dot table

Problems

  • Active problem details Dot pie chart

Non compute resources

  • Databases Dot pie chart
  • Storage and File System Dot pie chart
  • Serverless Dot pie chart
  • Networking and Content Delivery Dot pie chart

AWS overview

  • Problems by region Dot pie chart

ECS clusters

  • Memory Utilization Dot line chart
  • CPU Utilization Dot line chart
  • Top 10 accounts with ECS clusters Dot categorical chart
  • ECS Services Dot single value
  • 4xx errors Dot line chart

AWS RDS

Analyze RDS instance storage, network throughput, and query latency. Monitor read and write latency, free storage space, and swap usage to detect performance degradation early.

The AWS RDS dashboard contains the following sections and tiles:

  • Swap usage Dot line chart
  • Network transmit throughput Dot line chart

Network

  • Write latency Dot line chart

  • Read latency Dot line chart

  • Free storage space Dot line chart

  • Freeable memory Dot line chart

    The amount of available random access memory.

Latency

  • Database instances Dot single value
  • Database instances by class Dot categorical chart
  • Database instances by engine Dot categorical chart
  • CPU utilization Dot line chart
  • Database connections Dot line chart
  • Network receive throughput Dot line chart
  • Average read latency Dot single value
  • Average write latency Dot single value
  • Average network receive throughput Dot single value
  • Average network transmit throughput Dot single value

AWS Aurora

  • Volume bytes used Dot line chart

  • Read IO operations Dot table

    The number of billed read I/O operations from a cluster volume within a 5-minute interval.

  • Write IO operations Dot table

    The number of write disk I/O operations to the cluster volume, reported at 5-minute intervals.

AWS S3

Identify S3 buckets with high error rates relative to their request volume. Track 4xx and 5xx errors, request counts by bucket, and latency trends.

The AWS S3 dashboard contains the following sections and tiles:

Usage

  • S3 buckets Dot single value

  • Request count by bucket Dot categorical chart

  • Error rate by bucket Dot categorical chart

  • 4xx errors Dot line chart

  • 5xx errors Dot line chart

  • Request latency Dot line chart

    The elapsed per-request time from the first byte received to the last byte sent to an Amazon S3 bucket.

  • Bytes downloaded Dot line chart

  • Bytes uploaded Dot line chart

AWS S3

  • Total bytes downloaded Dot single value

  • Total bytes uploaded Dot single value

  • First byte latency Dot line chart

    The per-request time from when an Amazon S3 bucket receives the complete request to when the response starts to be returned.

  • Average request latency Dot single value

    The elapsed per-request time from the first byte received to the last byte sent to an Amazon S3 bucket.

  • Average first byte latency Dot single value

    The per-request time from when an Amazon S3 bucket receives the complete request to when the response starts to be returned.

  • Total 4xx error count Dot single value

  • Get request Dot line chart

  • Total count of GET requests for all buckets Dot single value

  • Total 4xx error count Dot single value

Requests

  • Head requests count Dot line chart

    The number of HEAD requests made for objects in an S3 bucket.

  • All requests count Dot single value

AWS SNS

View the delivery status of SNS notifications, including total published messages, failed deliveries, and filtered-out notifications across topics.

The AWS SNS dashboard contains the following sections and tiles:

AWS SNS

  • Topics Dot single value
  • Messages published Dot single value
  • Notifications failed Dot single value
  • Notifications delivered Dot single value
  • Notifications filtered out Dot single value

Notification status over time

  • Size of published messages by topic Dot categorical chart
  • Messages published Dot line chart
  • Number of subscriptions by topic Dot categorical chart
  • Notifications delivered Dot line chart
  • Notifications filtered out Dot line chart
  • Notifications failed Dot line chart
  • Notifications driven to DLQ Dot line chart
  • SMS success rate Dot line chart

AWS SQS

Track message flow across SQS queues, including sent, received, and deleted message counts. Monitor empty receive rates to detect idle queues or backlog buildup.

The AWS SQS dashboard contains the following sections and tiles:

  • Messages deleted Dot single value
  • Messages sent Dot single value
  • Messages received Dot single value
  • Empty receives Dot single value
  • Messages received Dot line chart

AWS SQS

  • Queues Dot single value

  • Last age of oldest message by queue Dot bar chart

  • Messages sent Dot line chart

  • Empty receives Dot line chart

  • Messages deleted Dot line chart

  • Age of oldest message Dot line chart

    Time series of the age of the oldest message, per queue.

  • Approximate messages visible Dot line chart

  • Approximate messages not visible Dot line chart

  • Approximate messages delayed per queue Dot line chart

  • Total size of messages by queue Dot categorical chart

Azure Application Gateway

Monitor Azure Application Gateway traffic and reliability. Track total and failed requests, error rates, active connections, and per-gateway throughput trends.

The Azure Application Gateway dashboard contains the following sections and tiles:

Azure Application Gateway

  • Total requests Dot single value

    Total number of requests processed in the selected timeframe and scope.

  • Failed requests Dot single value

    Total number of failed requests in the selected timeframe and scope.

  • Error rate % Dot single value

    Percentage of requests that failed (FailedRequests divided by total ResponseStatus).

  • Current connections Dot single value

    Total active client connections to the gateways at the time of measurement.

  • Throughput (bytes/s) by gateway Dot line chart

    Average data throughput (bytes per second) per gateway over time.

  • Failed requests by gateway Dot line chart

    Failed requests over time per gateway, ranked by total failures.

  • Total requests by gateway Dot line chart

    Total requests over time per gateway, ranked by volume.

  • Healthy hosts Dot bar chart

    Average count of healthy backend hosts per gateway over time.

  • Unhealthy hosts Dot bar chart

    Average count of unhealthy backend hosts per gateway over time.

  • HTTP status distribution Dot line chart

    Responses grouped by HTTP status class (2xx/3xx/4xx/5xx) per gateway over time.

  • Healthy host ratio (%) by gateway Dot line chart

    Percentage of healthy hosts out of all hosts per gateway.

  • Current connections by gateway Dot line chart

    Active client connections per gateway over time, ranked by total.

  • HTTP 4xx by gateway Dot line chart

    Client error responses (HTTP 4xx) per gateway over time.

  • HTTP 5xx by gateway Dot line chart

    Server error responses (HTTP 5xx) per gateway over time.

  • Throughput by resource group Dot line chart

    Average data throughput per resource group over time.

  • Failed requests (total) Dot line chart

    Trend of failed requests across the selected scope and timeframe.

  • Error rate % by gateway Dot line chart

    Percentage of failed requests per gateway over time.
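
The Error rate % and Healthy host ratio (%) tiles above both describe simple ratios. The following is a minimal Python sketch of that arithmetic, assuming the error rate is FailedRequests divided by total ResponseStatus (as the tile description states) and the healthy host ratio is healthy hosts divided by all hosts. The function names, the sample numbers, and the choice to return 0.0 when the denominator is zero are illustrative assumptions, not the dashboard's actual queries.

```python
def percentage(part: float, whole: float) -> float:
    """Return part / whole as a percentage; 0.0 when whole is zero (assumption)."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


def error_rate_pct(failed_requests: float, total_responses: float) -> float:
    """Failed requests as a percentage of all responses."""
    return percentage(failed_requests, total_responses)


def healthy_host_ratio_pct(healthy: float, unhealthy: float) -> float:
    """Healthy hosts as a percentage of all backend hosts."""
    return percentage(healthy, healthy + unhealthy)


# Hypothetical sample values for one gateway.
print(error_rate_pct(25, 1000))      # 2.5
print(healthy_host_ratio_pct(3, 1))  # 75.0
```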

Azure Blob Storage

Identify Blob Storage containers with high error rates. Track transactions, ingress and egress, and E2E and server latency to detect availability or performance issues.

The Azure Blob Storage dashboard contains the following sections and tiles:

Azure Blob Storage

  • Transactions (blob service) Dot line chart

  • Egress Dot line chart

  • Ingress Dot line chart

  • Successful E2E latency Dot line chart

  • Successful server latency Dot line chart

  • Blob capacity Dot categorical chart

  • Container count Dot single value

    Containers are organizers for a set of blobs.

Throughput and Workloads

  • Transactions (blob service) Dot table
  • Blob count Dot single value

Blob availability

  • Blob capacity Dot categorical chart

Usage

  • Average Blob availability Dot single value
  • Blob availability and count by resource Dot table

Azure Cache for Redis

Monitor Azure Cache for Redis instance performance. Track connected clients, command throughput, cache hit ratio, average latency, and server load to detect slowdowns or connection pressure.

The Azure Cache for Redis dashboard contains the following sections and tiles:

Azure Cache for Redis

  • Connected clients Dot line chart

  • Total commands processed Dot single value

  • Total cache hits Dot single value

  • Average latency Dot single value

  • Server load Dot line chart

    The percentage of cycles in which the Redis server is busy processing and not waiting idle for messages.

  • Processor Time Dot line chart

    The CPU utilization of the Azure Redis Cache server as a percentage.

  • Errors Dot line chart

  • Cache read Dot line chart

  • Latency P99 Dot line chart

  • Server latency Dot line chart

  • Cache write Dot line chart

  • Expired keys Dot line chart

  • Evicted keys Dot line chart

  • Used memory Dot line chart

  • Total keys Dot line chart

Performance

  • Cache hits Dot line chart
  • Instances Dot single value
  • Total cache misses Dot single value
  • Cache misses Dot line chart

Azure Container Apps

Monitor CPU, memory, and network utilization for Azure Container Apps. View HTTP error trends and active replica counts per resource to detect overloaded or failing container apps.

The Azure Container Apps dashboard contains the following sections and tiles:

Azure Container Apps

  • HTTP Errors by Resource Dot line chart

Network Health

  • Requests count Dot single value
  • Active Replicas by Resource Dot line chart
  • Tx Total Dot single value
  • Max Requests by Resource Dot line chart

Infrastructure Health

  • Replica Restarts by Resource Dot table
  • CPU Utilization [%] Dot line chart
  • Memory Utilization [%] Dot line chart
  • Total HTTP Errors Dot single value
  • HTTP Error Rate Dot single value
  • Received Bytes Dot table

HTTP Insights

  • HTTP 5xx Errors Dot single value
  • HTTP 4xx Errors Dot single value
  • Pending Connection Pool Requests Dot bar chart
  • Average Latency Dot single value
  • Transmitted Bytes Dot table
  • Request Retries Dot line chart
  • Rx Total Dot single value
  • HTTP 4xx Errors by Resource Dot line chart
  • HTTP 5xx Errors by Resource Dot line chart
  • Latency by Resource Dot line chart

Azure Files

Check availability, throughput, and capacity for Azure Files shares. View per-resource availability and file counts, and track latency and I/O performance over time.

The Azure Files dashboard contains the following sections and tiles:

  • Blob availability and count by resource Dot table
  • Container count Dot single value
  • Files count Dot single value
  • Average availability Dot single value

Capacity & Quotas

  • File capacity Dot categorical chart
  • File share capacity quota Dot line chart
  • File count Dot single value
  • File share count Dot single value
  • Availability by resource Dot bar chart

Performance

  • Successful server latency Dot line chart
  • Successful E2E latency Dot line chart

Throughput

  • Egress Dot line chart
  • Ingress Dot line chart
  • Ingress & egress table Dot table

Workload

  • Transactions table Dot table
  • Transactions by resource Dot bar chart

Azure Functions

Monitor Azure Function App execution units, error rates, and network I/O. Identify failing function apps and track 5xx error trends to detect reliability regressions.

The Azure Functions dashboard contains the following sections and tiles:

Usage and performance

  • Execution Units Dot line chart

    Combines execution time and memory usage into “execution units,” which are useful for estimating resource consumption and optimizing memory allocation.

Azure Function Apps

  • 5xx errors Dot line chart

  • Errors % Dot single value

    Considers 4xx and 5xx errors across all requests.

  • Errors Dot single value

    Considers 4xx and 5xx errors across all requests.

  • Bytes received vs bytes sent Dot categorical chart

  • Memory Working Set Dot line chart

    Amount of memory used by the Function App process.

  • Average response time Dot single value

  • Functions Dot single value

  • Response time Dot line chart

  • 4xx errors Dot line chart

  • Executions vs Requests Dot categorical chart

    Requests that resulted in a function execution compared with all requests (successful executions, failures, and rejections).

  • 5xx errors Dot single value

  • Requests Dot single value

    All requests (successful executions, failures, and rejections).

  • Executions Dot single value

    The number of incoming requests that resulted in a function execution.

  • 4xx errors Dot single value

Azure Load Balancer

View VIP and DIP availability across Azure Load Balancer resources. Track packet counts and availability trends to identify degraded load balancer endpoints.

The Azure Load Balancer dashboard contains the following sections and tiles:

Azure Load Balancer

  • Average VIP availability Dot single value

    Average data path availability to the front-end (VIP) across the selected timeframe.

  • Average DIP availability Dot single value

    Average backend endpoint (DIP) health across the selected timeframe.

  • VIP availability by load balancer Dot line chart

    Front-end data path availability (VIP) per load balancer over time.

  • DIP availability by load balancer Dot line chart

    Backend endpoint health (DIP) per load balancer over time.

  • Packet count by load balancer Dot line chart

    Total packets processed per load balancer over time (Gateway SKU).

  • VIP vs DIP (by resource) Dot table

    Side-by-side view of average VIP and DIP availability per load balancer.

  • VIP availability (overall trend) Dot line chart

    Overall VIP availability trend across the selected scope.

  • DIP availability (overall trend) Dot line chart

    Overall DIP availability trend across the selected scope.

  • Packet count (overall trend) Dot line chart

    Total packets processed across the selected scope (Gateway SKU).

Azure Managed Redis

Overview of Azure Managed Redis instance usage and performance, with guidance for identifying low performance and potential optimizations based on instance activity.

The Azure Managed Redis dashboard contains the following sections and tiles:

Azure Managed Redis

  • Instances Dot single value
  • Connected clients per instance Dot bar chart

Performance & Latency

  • Operations per second per instance Dot bar chart

  • Server load Dot line chart

    The percentage of cycles in which the Redis server is busy processing and not waiting idle for messages.

  • CPU utilization (percentProcessorTime) Dot line chart

    The CPU utilization of the Azure Redis Cache server as a percentage.

  • Average cache latency Dot single value

  • Cache latency per instance Dot line chart

Usage & Effectiveness

  • Total cache misses Dot single value
  • Total cache hits Dot single value
  • Total operations Dot single value
  • Total evicted keys Dot single value
  • Total expired keys Dot single value
  • Read throughput Dot line chart
  • Used memory percentage Dot line chart
  • Write throughput Dot line chart
  • Used memory Dot line chart

Azure OpenAI

High-level overview of the status, usage, and reliability of your Azure OpenAI resources.

The Azure OpenAI dashboard contains the following sections and tiles:

Azure OpenAI

  • Total tokens Dot single value
  • Instances by kind Dot donut chart

Latency

  • Time to response Dot line chart
  • Availability rate by kind Dot categorical chart

Usage

  • Time to last token Dot line chart
  • Processed prompt tokens Dot line chart
  • Generated tokens Dot line chart
  • Tokens per second Dot line chart
  • Total tokens by model Dot line chart
  • Time to response by model Dot line chart

Azure Overview

High-level view of Azure VM instances alongside Monitor logs and service problems. See instance distribution by size, location, and subscription, plus network I/O trends.

The Azure Overview dashboard contains the following sections and tiles:

  • Top 10 VM instance sizes Dot categorical chart
  • Top 10 locations running VM Dot categorical chart
  • Network: VM Network In Total (bytes) Dot line chart
  • Network: VM Network Out Total (bytes) Dot line chart

Other compute resources

  • Top 10 VM by Azure Subscription Dot categorical chart
  • Azure VMs Dot single value
  • VM Scale Sets Dot single value
  • Top 10 VM Scale Sets subscriptions Dot categorical chart
  • CPU Utilization Dot line chart
  • VM CPU utilization Dot honeycomb chart
  • Active Problems Dot single value

Davis problems

  • Problems by region Dot pie chart
  • Active problem details Dot pie chart
  • Azure Container Apps Dot single value
  • Top 10 Container Apps subscriptions Dot categorical chart
  • CPU usage (nanocores) Dot line chart
  • Network in (bytes) Dot line chart

Non compute resources

  • Databases Dot pie chart
  • Storage Dot pie chart
  • Serverless Dot pie chart
  • Networking Dot pie chart

Azure overview

  • Network in (bytes) Dot line chart

Azure Queue

Track Azure Queue Storage transaction volumes, message I/O, and availability. Monitor ingress, egress, E2E latency, and queue capacity usage across resources.

The Azure Queue dashboard contains the following sections and tiles:

Azure Queue Storage

  • Transactions requests by resource Dot bar chart
  • Egress Dot line chart
  • Ingress Dot line chart
  • Successful E2E latency Dot line chart
  • Successful server latency Dot line chart
  • Queue count Dot single value

I/O

  • Ingress and Egress by resource Dot table
  • Transactions by resource Dot table
  • Queue message count Dot single value
  • Average Queue availability Dot single value

Capacity

  • Queue count Dot line chart
  • Queue count and capacity Dot table
  • Queue capacity Dot line chart
  • Transactions (Success) Dot line chart
  • Transactions Dot categorical chart
  • Transactions (Errors) Dot line chart

Azure SQL Database

Analyze Azure SQL Database CPU usage, storage consumption, active sessions, connections, and deadlocks. Identify capacity problems and unhealthy databases quickly.

The Azure SQL Database dashboard contains the following sections and tiles:

Azure SQL Database

  • CPU usage Dot line chart
  • Storage usage Dot line chart
  • Deadlocks Dot table
  • SQL Databases Dot single value
  • Active sessions Dot line chart
  • Databases by pricing tier Dot categorical chart

Usage

  • SQL Servers Dot single value
  • Databases per server Dot categorical chart

Connections

  • DTU consumption Dot table
  • Workers usage Dot line chart
  • TempDB log space usage Dot table
  • Connection system errors Dot line chart
  • Connection user errors Dot line chart
  • Firewall blocks Dot line chart
  • Available databases Dot donut chart
  • Total active sessions Dot single value
  • Data IO usage Dot line chart

Azure Storage Accounts

Overview of Azure Storage Accounts covering availability, ingress and egress, latency, and throttling across blob, file, queue, and table storage types.

The Azure Storage Accounts dashboard contains the following sections and tiles:

Table Service

  • E2E latency Dot single value
  • Server latency Dot single value
  • Egress Dot single value
  • Ingress Dot single value
  • Transactions Dot single value
  • Availability Dot single value
  • Average Blob availability Dot single value
  • Containers count Dot single value
  • Blob Count Dot single value
  • Blob availability and count by resource (Top $Limit) Dot table
  • Transactions by response Dot categorical chart
  • Egress Dot line chart
  • Ingress Dot line chart

Storage Account overview

  • Average Queue availability Dot single value
  • Queue Count Dot single value
  • Availability by Resource Dot table
  • Egress Dot line chart
  • Ingress Dot line chart
  • Availability by Resource Dot table
  • File count Dot single value
  • File Storage availability Dot single value
  • Message count Dot single value
  • Egress Dot line chart
  • Ingress Dot line chart
  • Success Server Latency Dot line chart
  • Successful E2E Latency Dot line chart
  • File share count Dot single value
  • File service transaction by response Dot categorical chart
  • Entities by resource Dot table
  • Table count Dot single value
  • File Storage availability Dot single value
  • Table Entity Count Dot single value
  • Table Capacity by resource Dot table
  • Egress Dot line chart
  • Ingress Dot line chart
  • Table service transaction by response Dot categorical chart
  • Storage count by type Dot line chart
  • Storage capacity by type Dot line chart
  • Blob capacity (Top $Limit) Dot table

Azure Table Storage

Track transaction volumes, throughput, and latency for Azure Table Storage resources. Monitor ingress, egress, and E2E latency to identify slow or error-prone tables.

The Azure Table Storage dashboard contains the following sections and tiles:

Azure Table Storage

  • Transactions by resource Dot bar chart
  • Egress Dot line chart
  • Ingress Dot line chart
  • Successful E2E latency Dot line chart
  • Successful server latency Dot line chart
  • Table capacity Dot categorical chart

Throughput

  • Ingress & egress table Dot table
  • Transactions table Dot table
  • Average availability Dot single value
  • Availability & table count table Dot table
  • Availability by resource Dot bar chart

Capacity & Quotas

  • Table count Dot single value
  • Table entity count Dot single value

Azure Virtual Machine Scale Set

View health, scaling, and resource utilization of Azure Virtual Machine Scale Sets. Track CPU usage per VM instance, disk I/O rates, and total instance counts.

The Azure Virtual Machine Scale Set dashboard contains the following sections and tiles:

Azure—Virtual Machine Scale Sets

  • Total VM Instances Dot single value
  • Average CPU Utilization (%) Dot single value
  • Average Disk Read Ops/Sec Dot single value
  • Average Disk Write Ops/Sec Dot single value
  • CPU Usage (%) by VM Dot line chart
  • Available Memory (%) by VM Dot line chart
  • CPU Credits Remaining by VM Dot line chart
  • Disk Read Operations/Sec Dot line chart
  • Disk Write Operations/Sec Dot line chart
  • Disk Read Throughput (bytes) Dot line chart
  • Disk Write Throughput (bytes) Dot line chart
  • Network In (bytes) Dot line chart
  • Network Out (bytes) Dot line chart
  • Inbound Network Flows Dot line chart
  • Outbound Network Flows Dot line chart
  • OS Disk Latency (ms) Dot line chart
  • OS Disk IOPS Consumed (%) Dot line chart
  • OS Disk Bandwidth Consumed (%) Dot line chart

Azure Virtual Machines

View network, disk, and memory usage for Azure VMs. Track network in/out, disk read/write, I/O operations per second, and available memory across your VM environment.

The Azure Virtual Machines dashboard contains the following sections and tiles:

Microsoft Azure — Virtual Machines

  • Network in (bytes) Dot line chart

  • Disk read (bytes) Dot bar chart

  • Disk I/O operations/sec Dot line chart

  • Disk write (bytes) Dot bar chart

  • Available memory (bytes) Dot line chart

  • Available memory (%) Dot line chart

  • CPU credits consumed Dot line chart

  • CPU credits remaining Dot line chart

  • OS disk latency (ms) Dot line chart

  • Inbound vs outbound flows Dot categorical chart

  • CPU utilization rate for 10 instances with highest usage Dot table

    The most recent CPU utilization percentage for the 10 Azure virtual machine instances with the highest usage.

  • Virtual machine instances per region Dot categorical chart

  • Active Virtual Machines instances Dot single value

  • Network out (bytes) Dot line chart

  • Total Network in (bytes) Dot single value

  • Total Network out (bytes) Dot single value

Scale sets

  • CPU utilization Dot line chart
  • CPU credits remaining Dot bar chart
  • VM availability Dot bar chart
  • Available memory Dot line chart
  • Disk throughput (bytes/sec) Dot line chart
  • Disk IOPS Dot line chart

Performance and usage

  • Avg CPU Utilization Dot line chart
  • CPU Utilization (Top $Limit highest usage) Dot table

Classic AWS overview

Classic view of EC2 instance distribution and CloudWatch logs alongside Davis problems. Shows instance counts by type, availability zone, and account, and network I/O trends.

The Classic AWS overview dashboard contains the following sections and tiles:

  • Top 10 EC2 instance types Dot categorical chart

    Shows the most commonly used EC2 instance types (e.g., t2.micro, t3.nano).

  • Top 10 Availability zones running EC2 instances Dot categorical chart

    Highlights the top 10 AWS availability zones where EC2 instances are deployed.

  • Network: EC2 instances by Network in (bytes) Dot line chart

    Shows the incoming network data (in bytes) for individual EC2 instances over time.

  • Network: EC2 instances by Network out (bytes) Dot line chart

    Shows the outgoing network data (in bytes) for individual EC2 instances over time.
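The "Top 10" tiles above rank a grouping key by how many resources share it. A minimal sketch of that counting logic follows, assuming a hypothetical list of instance records; the field names (`instance_type`, `availability_zone`) are illustrative and are not taken from the dashboard's actual queries.

```python
from collections import Counter


def top_n_by_count(records, key, n=10):
    """Count records per distinct value of `key` and return the n largest groups."""
    counts = Counter(record[key] for record in records)
    return counts.most_common(n)


# Hypothetical inventory records, for illustration only.
instances = [
    {"id": "i-01", "instance_type": "t2.micro", "availability_zone": "us-east-1a"},
    {"id": "i-02", "instance_type": "t3.nano", "availability_zone": "us-east-1b"},
    {"id": "i-03", "instance_type": "t2.micro", "availability_zone": "us-east-1a"},
]

print(top_n_by_count(instances, "instance_type"))
print(top_n_by_count(instances, "availability_zone"))
```

The same helper covers the account-level and subscription-level "Top 10" tiles by switching the grouping key.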

Other compute resources

  • Top 10 AWS accounts with EC2 instances Dot categorical chart

    Shows the top 10 AWS accounts by the number of EC2 instances they are running. Each bar represents an account, with the length indicating the total instances.

  • EC2 instances Dot single value

    Displays the total count of active EC2 instances in the monitored environment.

  • Elastic Kubernetes Services Dot single value

    Shows the count of Elastic Kubernetes Service (EKS) clusters currently active in the AWS environment.

  • Auto scaling groups Dot single value

    Displays the total number of auto-scaling groups available in the AWS environment.

  • Top 10 accounts with EKS clusters Dot categorical chart

    Highlights the top 10 AWS accounts with the most EKS clusters, providing insights into Kubernetes resource distribution.

  • Top 10 AWS accounts with Autoscaling groups Dot categorical chart

    Displays the top 10 AWS accounts with the highest number of auto-scaling groups, sorted by count.

  • Node CPU limit Dot line chart

    Tracks the CPU limit for nodes in the environment, showing the maximum values over time for each node.

  • Desired Capacity Dot line chart

    Monitors the desired capacity of auto-scaling groups, showing trends in the number of instances required to meet scaling policies.

  • Cloud Watch error logs by log level Dot bar chart

    Shows the count of CloudWatch error logs grouped by severity over time.

  • EC2 CPU utilization Dot honeycomb chart

    Represents the CPU utilization of EC2 instances. Each hexagon corresponds to an instance, with colors ranging from green (low utilization) to red (high utilization).

  • Active Problems Dot single value

    Displays the total number of active problems detected by Davis, Dynatrace's AI engine. This chart provides a quick snapshot of the current health of your monitored environment. A value of "0" indicates no active issues requiring attention.

  • In Service Instances Dot table

    Tracks the average number of in-service instances within auto-scaling groups, providing insights into resource availability and usage.

  • Latest logs Dot table

    Displays the most recent CloudWatch logs, including timestamps, log content, severity levels, service, account ID, and region.

Davis problems

  • Problems by region Dot pie chart

    Breaks down active problems by geographic region, helping identify areas with recurring or localized issues.

  • Active problem details Dot pie chart

    Provides information about currently active problem types.

  • Memory limit Dot line chart

    Tracks the maximum memory limits for nodes in the environment.

Non compute resources

  • Databases Dot pie chart

    Visualizes the distribution of various database services in the environment, such as Amazon RDS, DynamoDB, Aurora, and others.

  • Storage and File System Dot pie chart

    Represents the usage of storage services such as Amazon S3, EBS, EFS, and FSx.

  • Serverless Dot pie chart

    Displays the count of serverless resources such as AWS Lambda, EventBridge, Step Functions, and API Gateway.

  • Networking and Content Delivery Dot pie chart

    Highlights the usage of networking and content delivery resources like Elastic Load Balancers, Amazon CloudFront, and Route 53.

Classic Azure overview

Classic view of Azure VM distribution and Monitor logs alongside Davis problems. Shows instance counts by size, region, and subscription, and network I/O trends.

The Classic Azure overview dashboard contains the following sections and tiles:

  • Top 10 VM Sizes Dot categorical chart

    Lists the top 10 most commonly used VM sizes in the environment.

  • Top 10 regions with Azure VM Dot categorical chart

    Displays the top 10 Azure regions hosting the highest number of VMs.

  • Network: VM Network In Total (bytes) Dot line chart

    Tracks the total incoming network traffic (in bytes) for Azure VMs over time.

  • Network: VM Network Out Total (bytes) Dot line chart

    Tracks the total outgoing network traffic (in bytes) for Azure VMs over time.

Other compute resources

  • Top 10 VM by Azure Subscription Dot categorical chart

    Highlights the top 10 Azure subscriptions hosting the highest number of VMs.

  • Azure VM's Dot single value

    Displays the total number of active Azure Virtual Machines (VMs) in the environment.

  • Azure Kubernetes Services Dot single value

    Displays the total number of active Azure Kubernetes Service (AKS) clusters in the environment.

  • VM Scale Sets Dot single value

    Displays the total number of active Azure Virtual Machine Scale Sets (VMSS) in the environment.

  • Top 10 AKS subscriptions Dot categorical chart

    Highlights the top 10 Azure subscriptions hosting the most AKS clusters.

  • Top 10 VM Scale Sets subscriptions Dot categorical chart

    Highlights the top 10 Azure subscriptions hosting the highest number of VM Scale Sets.

  • Azure Kubernetes Service: Memory available / cluster Dot line chart

    Monitors the memory (in GB) available per AKS cluster over time.

  • VM Scale Sets CPU Utilization Dot line chart

    Tracks the CPU utilization percentage for each VM Scale Set over time.

  • Azure Monitor error logs by service Dot line chart

    Displays a breakdown of Azure Monitor error logs grouped by service.

  • VM CPU utilization Dot honeycomb chart

    A hexagonal heatmap representing CPU utilization across Azure VMs. Each hexagon corresponds to a VM, with colors indicating the level of CPU usage, from low (green) to high (red).

  • Active Problems Dot single value

    Displays the total number of active problems detected by Davis, Dynatrace's AI engine. This chart provides a quick snapshot of the current health of your monitored environment. A value of "0" indicates no active issues requiring attention.

  • VM Scale Sets Network In Total (bytes) Dot line chart

    Monitors the total incoming network traffic (in bytes) for VM Scale Sets over time.

  • Latest logs Dot table

    Shows the most recent logs collected from Azure Monitor.

Davis problems

  • Problems by region Dot pie chart

    Breaks down active problems by geographic region, helping identify areas with recurring or localized issues.

  • Active problem details Dot pie chart

    Provides information about currently active problem types.

  • Azure Container Apps Dot single value

    Displays the total number of Azure Container Apps in the environment.

  • Top 10 Container Apps subscriptions Dot categorical chart

    Highlights the top 10 Azure subscriptions hosting the most container apps.

  • Azure Kubernetes Service: CPU cores available / cluster Dot line chart

    Tracks the number of CPU cores available per AKS cluster over time.

  • Container apps CPU usage / resource Dot line chart

    Tracks the CPU usage for individual Azure Container Apps over time.

  • Container apps Network in bytes / resource Dot line chart

    Monitors the incoming network traffic (in bytes) for individual Azure Container Apps over time.

Non compute resources

  • Databases Dot pie chart

    Displays the distribution of database resources in the environment, such as Azure SQL Server.

  • Storage Accounts Dot pie chart

    Represents the total number of Azure Storage Accounts in the environment.

  • Serverless Dot pie chart

    Visualizes the distribution of serverless resources, such as App Service Plans, Function Apps, Web App Deployment Slots, and Web Apps.

  • Network devices Dot pie chart

    Displays data about network devices in the environment. Currently, no records are available.

Dashboards

Explore ready-made dashboards owned by Dashboards.

Getting started with Dashboards

Hands-on introduction to dashboards with live examples. Explore visualization types including line charts, maps, heatmaps, and scatter plots using sample data sets.

The Getting started with Dashboards dashboard contains the following sections and tiles:

Read

  • Trends in motion Dot line chart
  • F1 races Dot dot map
  • Coffee cups vs. Git commits per day Dot scatterplot chart
  • Observability spent (in Billion USD) by industry Dot heatmap chart

Spot trends

  • Observability spent details Dot table

Get started with Dashboards

  • Cloud migration statistics by industry Dot categorical chart

Databases Services Classic Databases app

Explore ready-made dashboards owned by Databases Services Classic Databases app.

Databases overview

Overview of monitored databases by vendor and health status. View total database service counts, services with active problems, and health distribution across your database environment.

The Databases overview dashboard contains the following sections and tiles:

  • Database instances by vendor Dot pie chart

    Vendors of Extensions Framework 2.0–monitored database instances.

Database Availability and Health

  • Database services health Dot honeycomb chart

    Databases with/without active Davis problems.

  • Database services Dot single value

    Total number of calling services.

  • Database services with problems Dot single value

    Number of database services with active Davis problems.

  • Database services by vendor Dot categorical chart

  • Database instances availability Dot pie chart

    Status of Extensions Framework 2.0–monitored database instances.

  • Total database instances Dot single value

    Total number of Extensions Framework 2.0–monitored database instances.

  • Database instances with alerts Dot single value

    Extensions Framework 2.0–monitored database instances with potentially problematic availability.

Discovery & Coverage

Explore ready-made dashboards owned by Discovery & Coverage.

ActiveGate diagnostic overview

Monitor memory, storage, JVM GC, and network metrics for your ActiveGate instances. View distribution across network zones and groups to identify resource pressure or unhealthy nodes.

The ActiveGate diagnostic overview dashboard contains the following sections and tiles:

Host vitals

  • Memory Dot line chart
  • Storage Dot line chart

Process

  • JVM GC time Dot line chart

  • ActiveGates per network zone Dot categorical chart

  • ActiveGates per group Dot categorical chart

  • Agent modules connected Dot line chart

  • Network traffic to/from clients Dot line chart

  • Network traffic to/from Dynatrace environment Dot line chart

    Note: The chart is not drawn when no errors are reported within the timeframe.

Networking

  • REST.API calls Dot line chart
  • REST.API errors Dot line chart
  • Request size Dot line chart
  • Directory quotas Dot line chart
  • CPU usage Dot line chart
  • CPU usage Dot line chart
  • Memory Dot line chart
  • Thread pool busy threads Dot line chart
  • Thread pool queues sizes Dot line chart
  • Dropped, resent & rejected messages Dot line chart

REST.API

  • Response size Dot line chart

Distributed Tracing

Explore ready-made dashboards owned by Distributed Tracing.

Full-Stack Adaptive Traffic Management and trace capture

Get visibility into Full-Stack trace volumes and Adaptive Traffic Management. Monitor OneAgent capture rates, average span sizes, and estimate extended trace ingest costs.

The Full-Stack Adaptive Traffic Management and trace capture dashboard contains the following sections and tiles:

  • Full-Stack OneAgent capture rates Dot line chart

    The request capture rate represents the ratio between captured requests and the total number of transactions processed by a OneAgent-monitored application or host. In this chart, the blue line shows the trace capture rate and the red line shows the request capture rate over time. The metrics require OneAgent version 1.305 or later.

  • Full-Stack trace data volume Dot line chart

    Amount of trace data ingested from Full-Stack monitored applications or hosts. The chart includes:
      • The trace data volume captured by OneAgent and regulated by Adaptive Traffic Management (green bars).
      • The Full-Stack included trace volume based on the contributing Full-Stack memory-gibibytes (blue line).
      • The trace data volume ingested from Full-Stack monitored applications or hosts but not regulated by Adaptive Traffic Management (fullstack-fixed-rate-ingested_bytes_sum; blue bar). This includes OpenTelemetry spans and other fixed-rate traffic and can exceed the included limit; the excess will be charged.

  • Full-Stack trace volume Dot bar chart

    Dynatrace ingests trace data from multiple sources, which are licensed differently.
      • fullstack-adaptive shows trace data captured by OneAgent on Full-Stack monitored hosts and applications and regulated by Adaptive Traffic Management.
      • fullstack-fixed-rate shows trace data from Full-Stack monitored sources that use fixed-rate sampling (for example, OpenTelemetry spans or fixed-rate OneAgent settings). This traffic still consumes the Full-Stack included trace volume but is not automatically adjusted by Adaptive Traffic Management and can exceed the included limit, with the excess billed as Traces – Ingest & Process.
      • Other series (for example, serverless or OTLP-only sources) represent trace data that is not part of Full-Stack Monitoring and is not controlled by Adaptive Traffic Management.
    On this chart, green bars represent fullstack-adaptive, blue bars represent fullstack-fixed-rate, magenta bars represent otlp-trace-ingest, and red bars represent serverless trace data.

  • Full-Stack trace volume used Dot line chart

    Ingested trace volume, as a percentage of your licensed Full-Stack included trace volume. Adaptive Traffic Management keeps it around the Full-Stack included limit. The algorithm used in Dynatrace accounts for a degree of fluctuation, allowing the used trace volume to exceed 100% without extra charges. The value can also exceed 100% in the following cases, and the excess will be charged:
      • You opted for Extended trace ingest on top of Full-Stack Monitoring.
      • You sent OpenTelemetry traces or other fixed-rate span data via API from Full-Stack monitored sources.

  • Average size of Full-Stack spans Dot line chart

    Average size of spans ingested from Full-Stack monitored applications or hosts. Typical values are in the 1.5-2 KiB range; if the span size is larger and the used trace volume is high (or the trace capture rate is low), you might be capturing a lot of data per span. In this chart, the green line shows spans from adaptive Full-Stack trace ingest (fullstack-adaptive), and the blue line shows spans from fixed-rate Full-Stack trace ingest (fullstack-fixed-rate).

  • Contributing Full-Stack memory-gibibyte Dot line chart

    Contributing Full-Stack memory-gibibytes from monitored hosts and applications. The blue line (contributing_gib) is derived from dt.billing.full_stack_monitoring.usage and normalized to represent contributing GiB per hour, matching the DPS Full-Stack Monitoring billing usage. This value is used to calculate your Full-Stack included trace volume (200 KiB of trace data per minute, or 3000 KiB per 15-minute interval, for each contributing GiB).

  • Adaptive trace volume per contributing memory-gibibytes per minute Dot area chart

    Average adaptive trace volume every 15 minutes (trace_volume_per_gibh; green area). Full-Stack Monitoring starts from 200 KiB/min of trace volume per contributing GiB (3000 KiB per 15-minute interval), which is highlighted by the threshold line in the chart.

  • Fixed rate trace volume per contributing memory-gibibytes per minute Dot area chart

    Average fixed-rate trace volume every 15 minutes for fixed-rate Full-Stack traces (trace_volume_per_gibh; blue area). This helps you compare fixed-rate trace volume per contributing GiB with the default 200 KiB/min (3000 KiB per 15-minute interval) included with Full-Stack Monitoring; thresholds highlight when volume per GiB approaches or exceeds this level.

  • Full-Stack trace ingest and billable extended ingest Dot line chart

    The relationship between the amount of ingested trace data (included_ingested_byte_sum; green bar) and the Full-Stack included trace volume (included_limit; blue line). If you opted for Extended trace ingest for Full-Stack Monitoring, Adaptive Traffic Management adjusts trace ingest against the configured limit (configured_limit; red line), and the extended trace volume charged via Traces – Ingest & Process is shown as billingAmount (orange bar).
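The tile descriptions above state that each contributing GiB includes 200 KiB of trace data per minute (3000 KiB per 15-minute interval). The arithmetic can be sketched as follows; the function names are hypothetical and this is not the dashboard's actual query, only an illustration of how the included limit and the "volume used" percentage relate.

```python
# Included trace volume per contributing GiB, as stated in the tile descriptions.
KIB_PER_MINUTE_PER_GIB = 200


def included_trace_volume_kib(contributing_gib, minutes=15):
    """Included trace volume (KiB) for a number of contributing GiB over an interval."""
    return contributing_gib * KIB_PER_MINUTE_PER_GIB * minutes


def trace_volume_used_percent(ingested_kib, contributing_gib, minutes=15):
    """Ingested trace volume as a percentage of the included volume."""
    included = included_trace_volume_kib(contributing_gib, minutes)
    if included == 0:
        raise ValueError("no contributing GiB, so there is no included volume")
    return 100 * ingested_kib / included


# One contributing GiB over one 15-minute interval.
print(included_trace_volume_kib(1))            # 3000
# 64 contributing GiB, 96,000 KiB ingested in 15 minutes.
print(trace_volume_used_percent(96_000, 64))   # 50.0
```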

Extended trace ingest for Full-Stack Monitoring

  • Full-Stack extended trace ingest calculator Dot line chart

    Use this chart to simulate Extended trace ingest for Full-Stack Monitoring. Set the ExtraIngestFactor dashboard variable to specify how many times above the Full-Stack included trace volume you want to configure. The chart shows the Full-Stack included limit (included_limit; blue line), the predicted configured limit (predicted_configured_limit; red line), the trace volume covered by the included limit (included_ingested_byte_sum; green bar), and the predicted billable extended ingest (predicted_billing_amount; orange bar). The current ExtraIngestFactor is $ExtraIngestFactor.

  • Predicted extended ingest billable amount Dot single value

    Total predicted Extended trace ingest that will be billed for the selected timeframe, based on the configured ExtraIngestFactor.
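To make the calculator tiles easier to reason about, here is a rough sketch of one way such a prediction could be computed. It rests on two assumptions that the source does not confirm: that the predicted configured limit equals the included limit multiplied by ExtraIngestFactor, and that billable extended ingest is whatever is ingested between the included limit and the configured limit. The `demand_bytes` parameter (trace volume that would be captured without any limit) is hypothetical, and the returned keys only echo the series labels in the chart.

```python
def predict_extended_ingest(included_limit_bytes, demand_bytes, extra_ingest_factor):
    """Estimate billable extended ingest under the assumptions stated above."""
    if extra_ingest_factor < 1:
        raise ValueError("extra_ingest_factor is assumed to be at least 1")
    # Assumption: the configured limit is a multiple of the included limit.
    configured_limit = included_limit_bytes * extra_ingest_factor
    # Ingest is capped at the configured limit.
    ingested = min(demand_bytes, configured_limit)
    # The part covered by the included volume is not billed.
    included_ingested = min(ingested, included_limit_bytes)
    return {
        "predicted_configured_limit": configured_limit,
        "included_ingested_byte_sum": included_ingested,
        "predicted_billing_amount": ingested - included_ingested,
    }


# Demand well above the limits: billing is capped by the configured limit.
print(predict_extended_ingest(3000, 10_000, 2))
# Demand below the included limit: nothing is billed.
print(predict_extended_ingest(3000, 2000, 2))
```

Treat the output as an upper-level estimate only; the dashboard's own prediction may use a different formula.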

Dynatrace Assist

Explore ready-made dashboards owned by Dynatrace Assist.

Generative AI feature adoption

Track adoption of Dynatrace generative AI features on your tenant. View unique active users, query execution details, usage by skill, and interaction failure rates.

The Generative AI feature adoption dashboard contains the following sections and tiles:

  • Query execution details Dot table

  • Number of unique users Dot single value

    The number of individual users who have interacted with any of the generative AI functionalities over the selected time frame.

  • Most active users Dot categorical chart

    Top 10 most active users in the selected time frame.

Interaction success rate

  • Failed NL2DQL interaction details Dot table

    Tip: Try out Open with… > Davis CoPilot on the "response" column to understand why the generated DQL is considered invalid.

  • Usage breakdown by skill Dot table

  • Query executions by app Dot categorical chart

  • Number of unique users over time Dot line chart

  • Query executions by app over time Dot line chart

Response error details

  • Failed chat interaction details Dot table

  • Most frequently asked about topics in Davis CoPilot Chat Dot table

    Davis CoPilot automatically generates a high-level topic for each prompt. This table provides an overview of the top 50 topics that are asked organically, from embedded app prompts, and via the Davis CoPilot workflow action.

  • DQL2NL issues Dot table

  • Success rate over time Dot line chart

  • Execution duration over time Dot line chart

    Average time it takes to execute user prompts, by skill, and how this develops over time.

  • Recent question details: Davis CoPilot Chat Dot table

    Recent organic questions being asked in the Davis CoPilot chat. This excludes prompts embedded in apps, and excludes workflow action prompts.

  • Recent question details: Quick Analysis Dot table

Davis CoPilot Feature Adoption Dashboard

  • Feedback details Dot table

  • Total chat invocations Dot single value

  • Usage by skill Dot categorical chart

    Number of successful and unsuccessful skill invocations (interactions with different functionalities).

  • Success rate by skill Dot categorical chart

  • NL2DQL issues Dot donut chart

  • Chat issues Dot donut chart

  • DQL2NL issues Dot donut chart

  • Most recent question: Davis CoPilot Chat Dot single value

  • Most recent question: Quick Analysis Dot single value

Performance

  • Topics triggering guardrails in Davis CoPilot Chat Dot table

    Davis CoPilot automatically generates a high-level topic for each prompt. This table provides an overview of the top 50 topics that are asked organically, from embedded app prompts, and via the Davis CoPilot workflow action.

Chat feedback

  • NL2DQL Feedback Rate Dot single value
  • Chat Feedback Rate Dot single value
  • Invocations with feedback Dot single value
  • Feedback distribution Dot categorical chart
  • Invocations with feedback Dot single value

DQL2NL feedback

  • Invocations with feedback Dot single value

  • DQL2NL Feedback Rate Dot single value

  • Total NL2DQL invocations Dot single value

  • Total DQL2NL invocations Dot single value

  • Embedded chat prompts Dot categorical chart

    Overview of usage of embedded conversation starters: copilot-conv-starters. This is a sub-section of "DAVIS COPILOT" usage in the charts to the left of this one.

  • Negative feedback breakdown Dot categorical chart

Prompt details

  • Usage by app Dot line chart

    Breakdown of all invocations by the primary app in which the skill is integrated. "Davis CoPilot" refers to the chat app.

  • Usage by skill Dot line chart

    Breakdown of all invocations by skill across all apps.

  • Usage by app Dot donut chart

    Breakdown of all invocations by the primary app in which the skill is integrated. "Davis CoPilot" refers to the chat app.

  • Usage by skill Dot donut chart

    Breakdown of all invocations by skill across all apps.

Experience Vitals

Explore ready-made dashboards owned by Experience Vitals.

Digital Experience retain and query usage

Track Digital Experience data retention volumes and query usage. View daily query volume by app, retained data across buckets, and total query counts by timeframe.

The Digital Experience retain and query usage dashboard contains the following sections and tiles:

  • Daily query volume by app Dot bar chart
  • Total retained DEM data volume (across all buckets) Dot single value

Digital Experience retain and query usage details

  • DEM query count by timeframe Dot bar chart
  • Average daily query volume Dot single value
  • Retained DEM data volume by bucket Dot donut chart
  • Retained user events by event type (last 5 min) Dot donut chart
  • Query volume % by bucket Dot donut chart
  • Daily query volume by dashboard & notebook Dot bar chart
  • DEM query volume by timeframe Dot bar chart
  • Query volume % by app Dot donut chart
  • Average daily query count Dot single value
  • Average daily retained data volume by bucket Dot donut chart
  • Average daily query users Dot single value

Frontend resource analysis

Investigate loaded frontend resources by performance and size. View decoded, download, and encoded size by resource asset, and identify resources contributing most to page load time.

The Frontend resource analysis dashboard contains the following sections and tiles:

  • Number of Resource Assets by Page/View - p$Percentile Dot single value

Resource Asset

  • Resources by Page Grouped by Page/View Dot table
  • Decoded Size by Resource Asset - p$Percentile Dot pie chart
  • Download Size by Resource Asset - p$Percentile Dot pie chart
  • Encoded Size by Resource Asset - p$Percentile Dot pie chart
  • Resource Asset Size - p$Percentile Dot single value
  • Largest Contentful Paint - p$Percentile Dot single value
  • Decoded Size - p$Percentile Dot single value
  • % Compression - p$Percentile Dot single value
  • % Cached - p$Percentile Dot single value
  • Duration - p$Percentile Dot single value
  • % Render Blocking - p$Percentile Dot single value

Resource Performance

  • Resource Timings - p$Percentile Dot categorical chart
  • Duration - p$Percentile Dot single value
  • Performance Grouped by Page/View Dot table
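Many tiles in this dashboard are parameterized by the Percentile variable (shown as p$Percentile in the tile names). As a reminder of what a percentile value means, here is a small nearest-rank sketch. Nearest-rank is only one of several percentile definitions and is not necessarily the method the dashboard queries use; the sample durations are made up.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # Rank of the smallest value that has at least p% of the data at or below it.
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical resource durations in milliseconds.
durations_ms = [120, 80, 300, 95, 150]
print(percentile_nearest_rank(durations_ms, 50))  # 120
print(percentile_nearest_rank(durations_ms, 75))  # 150
```

A higher percentile surfaces the slow tail of resources, while p50 describes the typical case.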

Mobile app start health

Investigate mobile app start performance by version and geography. View cold and warm start trends for the slowest versions and identify individual sessions with the longest startup times.

The Mobile app start health dashboard contains the following sections and tiles:

App start health

  • App startup performance across different geographical locations Dot choropleth map
  • Slowest versions (Top 10): Cold start trends Dot line chart
  • Slowest versions (Top 10): Warm start trends Dot line chart
  • Top 10 iOS sessions with the longest app starts Dot table
  • Top 10 Android sessions with the longest app starts Dot table
  • Average app start duration Dot single value
  • App start duration Dot line chart
  • App start counts Dot bar chart
  • App starts counts Dot single value

Mobile troubleshooting

Investigate crashes, ANRs, and errors across mobile frontends. Track top errors, crash and ANR trends, and request error counts by release version to diagnose regressions.

The Mobile troubleshooting dashboard contains the following sections and tiles:

Error and view diagnostics

  • Top 10 errors Dot table
  • Crashes trend Dot line chart
  • Crashes Dot single value
  • ANRs Dot single value
  • Request errors Dot single value
  • ANR trend Dot line chart
  • Request error trend Dot line chart

Release and version quality

  • Top 10 crashing versions Dot table
  • Top 10 crashing versions trend Dot line chart
  • Top versions by ANR Count Dot table
  • Top 10 versions by ANR count trend Dot line chart
  • Error geo distribution ($error_type) Dot choropleth map
  • Top versions by request errors count Dot table
  • Top 10 versions by request error count trend (last 7 days) Dot line chart
  • Errors Dot pie chart

Page performance & errors

Investigate web frontend navigation performance and JavaScript errors. Track page load time, Core Web Vitals (LCP, CLS, FID), and the top slowest navigations.

The Page performance & errors dashboard contains the following sections and tiles:

Page Performance

  • Page Load Time Dot line chart
  • Largest Contentful paint Dot line chart
  • Cumulative Layout Shift Dot line chart
  • First Input Delay Dot line chart

Errors

  • Top navigations (Top 20) Dot table
  • Errors by type Dot line chart
  • HTTP errors Dot bar chart
  • JS errors by browsers Dot bar chart
  • Processing time Dot line chart
  • Page size/weight Dot table
  • DNS time Dot line chart
  • Connection time for new connections Dot line chart

Page performance & errors

  • LCP - p75 Dot single value
  • CLS - p75 Dot single value
  • INP - p75 Dot single value
  • Page load time - median Dot single value
  • Error count Dot single value
  • Navigations Dot single value

XHR performance

Investigate XHR and fetch request performance trends. View request duration, time to first byte, and the most frequent, slowest, and most frequently failing XHR calls.

The XHR performance dashboard contains the following sections and tiles:

  • Request duration Dot line chart
  • Time to first byte Dot line chart
  • Most frequent XHRs (Top 20) Dot table
  • Slowest XHRs (Top 20) Dot table
  • Top Failed XHRs (Top 20) Dot table

XHR & fetch performance

  • Avg Request duration by country (Top 20) Dot categorical chart
  • Request duration - p90 Dot single value
  • XHR & fetch failure rate Dot single value
  • Time to first byte - median Dot single value

Extensions Extensions

Explore ready-made dashboards owned by Extensions Extensions.

Extension data consumption

View data consumption by extension and configuration. Identify the top 20 extensions and IP addresses by datapoints ingested to spot unexpected or excessive data producers.

The Extension data consumption dashboard contains the following sections and tiles:

  • Top 20 Dot categorical chart
  • Datapoints by extension Dot line chart
  • Datapoints by source Dot line chart
  • Top 20 IP addresses Dot categorical chart

Infrastructure & Operations Infrastructure & Operations

Explore ready-made dashboards owned by Infrastructure & Operations Infrastructure & Operations.

Infrastructure Observability Dashboard

Overview of host health, resource hotspots, and log activity across your environment. View host states, average CPU and memory usage, and the most problematic hosts.

The Infrastructure Observability Dashboard dashboard contains the following sections and tiles:

Impacted hosts

  • Hosts states Dot pie chart

  • Average resources usage Dot line chart

  • Top $TopLimit hosts by problems Dot honeycomb chart

    Click a honeycomb cell to see the host name in a tooltip, and then click "Open with…" to view host details in the Infrastructure & Operations app. This chart is affected by the "TopLimit" variable.

  • Total hosts Dot single value

Resource hotspots

  • Logs across all hosts Dot bar chart

Logs

  • Hosts availability Dot single value

    Average availability for all hosts. Only active hosts are counted. Availability is reported every couple of minutes, so the timeframe should cover at least 5 minutes.

  • Top $TopLimit processes by highest CPU Dot line chart

    This chart is affected by the "TopLimit" variable.

Technologies and processes

  • Top $TopLimit hosts by lowest availability Dot honeycomb chart

    Click a honeycomb cell to see the host name in a tooltip, and then click "Open with…" to view host details in the Infrastructure & Operations app. This chart is affected by the "TopLimit" variable.

  • Top $TopLimit hosts by highest CPU load Dot line chart

    This chart is affected by the "TopLimit" variable.

  • Top $TopLimit hosts by highest memory consumption Dot line chart

    This chart is affected by the "TopLimit" variable.

  • Top $TopLimit hosts by highest disk usage Dot line chart

    This chart is affected by the "TopLimit" variable.

  • Total traffic Dot single value

    Calculated as sum of inbound and outbound traffic for all hosts within a dashboard timeframe.

  • Top $TopLimit hosts by highest network traffic Dot line chart

    This chart is affected by the "TopLimit" variable.

  • Hosts cloud types Dot pie chart

  • Average CPU for Cloud types Dot table

  • Hosts Hypervisor type Dot pie chart

  • Average CPU for Hypervisor types Dot table

  • Hosts with problems Dot single value

    Counts hosts with problems that were reported within the 6 hours before the end of the timeframe.

  • Hosts network traffic Dot line chart

  • Average CPU and memory usage across all processes Dot line chart

  • Hosts monitoring modes Dot pie chart

  • Hosts reaching resource saturation (CPU, memory or disk) Dot table

    Click on a specific id field in the table below and select "Open with…" to view host details in the Infrastructure & Operations app.

  • Events distribution by type Dot pie chart

  • 15 hosts with highest utilization (5 highest CPU, 5 highest memory, 5 highest disk usages) Dot table

    Click on a specific id field in the table below and select "Open with…" to view host details in the Infrastructure & Operations app.
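The "15 hosts with highest utilization" tile combines three top-5 lists (CPU, memory, disk). The following is a minimal Python sketch of that selection logic, not the dashboard's actual query. The field names (`id`, `cpu`, `memory`, `disk`), the sample hosts, and the function names are hypothetical, and the sketch assumes a host that ranks in more than one list appears once per list; the source does not say whether the tile deduplicates.

```python
from typing import Dict, List

RESOURCES = ("cpu", "memory", "disk")


def top_n(hosts: List[Dict], resource: str, n: int) -> List[Dict]:
    """Return the n hosts with the highest usage for one resource."""
    return sorted(hosts, key=lambda h: h[resource], reverse=True)[:n]


def highest_utilization(hosts: List[Dict], n: int = 5) -> List[Dict]:
    """Concatenate the top-n lists for CPU, memory, and disk (no deduplication)."""
    rows = []
    for resource in RESOURCES:
        for host in top_n(hosts, resource, n):
            rows.append({"id": host["id"], "resource": resource, "usage": host[resource]})
    return rows


# Hypothetical usage percentages for three hosts.
sample_hosts = [
    {"id": "HOST-A", "cpu": 90, "memory": 40, "disk": 10},
    {"id": "HOST-B", "cpu": 20, "memory": 85, "disk": 55},
    {"id": "HOST-C", "cpu": 60, "memory": 30, "disk": 95},
]

# n=2 keeps the example small; the tile title implies n=5 per resource.
rows = highest_utilization(sample_hosts, n=2)
for row in rows:
    print(row["id"], row["resource"], row["usage"])
```

With `n=5` and at least five hosts, the same function yields the 15 rows the tile title describes.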

Network analytics

Analyze AWS VPC network flow logs. View top source and destination address pairs, inter-VPC traffic, port distributions, and transit gateway flows.

The Network analytics dashboard contains the following sections and tiles:

AWS Network Flow Analytics

  • Inter VPC traffic Dot pie chart
  • Top 100 endpoint pairs Dot honeycomb chart

VPC - source/destination Ports

  • Top 5 origin VPC Dot line chart
  • Top 100 origin VPC Dot honeycomb chart
  • Top 5 endpoint pairs Dot line chart
  • Top 5 log group sources Dot line chart

VPC network flows matrix

  • Top 100 destination addresses and ports Dot table
  • Top 10 source port, address Dot bar chart
  • Inter region traffic Dot pie chart
  • Top 10 TGW traffic with largest packet loss Dot heatmap chart
  • Outbound HTTP(S) endpoints your workloads contacted (egress) Dot single value
  • Inbound clients hitting your HTTP(S) services (ingress) Dot single value
  • TGW traffic Dot donut chart
  • Top 10 destination ports Dot donut chart

Logs overview

  • Top 100 log group sources Dot honeycomb chart
  • Total log count Dot single value
  • NODATA and SKIPDATA log sources Dot line chart
  • Egress/Ingress log distribution Dot pie chart
  • Top 10 endpoint pairs Dot donut chart
  • Inter availability zone traffic Dot pie chart

Network devices

Analyze network device and interface performance. View interface health states, top interfaces by inbound and outbound load, and those with the highest discards and errors.

The Network devices dashboard contains the following sections and tiles:

Network devices performance

  • Interfaces in Up/Down state Dot table

    The list of network device interfaces that are administratively up but operationally down. To open an interface in another application, open the cell action menu and select "Open with…".

  • Top $TopLimit interfaces by inbound load Dot line chart

    Load is calculated as current interface traffic per second divided by interface maximum speed.

  • Top $TopLimit interfaces by outbound load Dot line chart

    Load is calculated as current interface traffic per second divided by interface maximum speed.

  • Top $TopLimit interfaces by discards and errors Dot table

    A sorted list of the network interfaces with the highest inbound and outbound error rates and inbound and outbound discard rates.

  • Top $TopLimit interfaces by inbound traffic Dot line chart

    Load is calculated as current interface traffic.

  • Top $TopLimit interfaces by outbound traffic Dot line chart

    Load is calculated as current interface traffic.
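The inbound and outbound load tiles above describe load as current interface traffic per second divided by the interface's maximum speed. A minimal Python sketch of that ratio follows. The function name, the units (bits per second), and the sample values are assumptions for illustration, not taken from the dashboard's query.

```python
def interface_load(traffic_bits_per_sec: float, max_speed_bits_per_sec: float) -> float:
    """Return load as a fraction of the interface's maximum speed."""
    if max_speed_bits_per_sec <= 0:
        # A missing or zero speed would make the ratio meaningless.
        raise ValueError("maximum speed must be positive")
    return traffic_bits_per_sec / max_speed_bits_per_sec


# Hypothetical sample: 250 Mbit/s of inbound traffic on a 1 Gbit/s interface.
inbound_load = interface_load(250_000_000, 1_000_000_000)
print(f"{inbound_load:.0%}")  # prints "25%"
```

Because load is relative to interface speed, it makes interfaces of different capacities comparable; the traffic tiles show absolute traffic instead.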

Network interfaces performance

  • Total network devices Dot single value

    The number of network devices monitored in the environment. Unaffected by dashboard variables, displays total value for this tenant.

  • Devices with problems Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display value only for selected devices.

  • Top $TopLimit devices by lowest reachability Dot honeycomb chart

    Reachability for configured devices. Can be filtered by "NetworkDevices" variable. More info on the topic: synthetic-monitoring. To see details, click on chart sparkline, hover over "dt.entity.network:device", select "Open with…" option to open device in other applications.

  • Total problems Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display value only for selected devices.

  • Top $TopLimit devices by memory usage Dot line chart

    Can be filtered by "TopLimit", "NetworkDevices" variables. To see details, click on chart sparkline, hover over "dt.entity.network:device", select "Open with…" option to open device in other applications.

  • Top $TopLimit devices by CPU usage Dot line chart

    Can be filtered by "TopLimit", "NetworkDevices" variables. To see details, click on chart sparkline, hover over "dt.entity.network:device", select "Open with…" option to open device in other applications.

  • Top $TopLimit devices by network traffic Dot categorical chart

    Can be filtered by "TopLimit", "NetworkDevices" variables. Traffic is counted for last 3 minutes of a timeframe. To see details, click on chart sparkline, hover over "dt.entity.network:device", select "Open with…" option to open device in other applications.

  • Top $TopLimit devices by interfaces saturation Dot categorical chart

    If the chart is empty, there are no saturated devices at the moment. Can be filtered by "TopLimit", "NetworkDevices" variables.

  • Saturated devices Dot single value

    Can be filtered by "TopLimit", "NetworkDevices" variables.

  • Saturated interfaces Dot single value

    Can be filtered by "TopLimit", "NetworkDevices" variables.

  • Total traffic Dot single value

    Displays total in/out traffic for all devices. Can be filtered by "NetworkDevices" variable. Value is calculated as average of inbound and outbound traffic for latest 10 minutes.

  • Errors outbound Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display metric value only for selected devices.

  • Discards outbound Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display metric value only for selected devices.

  • Discards inbound Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display metric value only for selected devices.

  • Top $TopLimit devices by Up/Down interfaces Dot categorical chart

    Can be filtered by "TopLimit", "NetworkDevices" variables.

  • Errors inbound Dot single value

    Can be filtered by "NetworkDevices" variable. If filtered, will display metric value only for selected devices.

Network performance

View an environment-level summary of network interface health. Track total traffic, inbound and outbound discards and errors, and nodes with interfaces in a down state.

The Network performance dashboard contains the following sections and tiles:

  • Node interfaces up/down Dot single value

    The number of network device interfaces in an administratively up and operationally down state.

  • Inbound discards Dot single value

    The inbound discard rate for all interfaces of all devices in the environment.

  • Total traffic Dot single value

    The sum of the input and output traffic for all interfaces of all devices in the environment.

  • Outbound discards Dot single value

    The outbound discard rate for all interfaces of all devices in the environment.

  • Outbound errors Dot single value

    The outbound error rate for all interfaces of all devices in the environment.

  • Inbound errors Dot single value

    The inbound error rate for all interfaces of all devices in the environment.

  • Monitored devices Dot single value

    The number of network devices monitored in the environment.

  • Open device problems Dot single value

    The number of open Problems affecting network devices.

Network performance

  • Interfaces in Up/Down state Dot table

    The list of network device interfaces in administratively up and operationally down state.

  • Inbound Dot line chart

    Load is calculated as current interface traffic per second divided by interface maximum speed.

  • Outbound Dot line chart

    Load is calculated as current interface traffic per second divided by interface maximum speed.

  • Top 50 interfaces by discards and errors Dot table

    A sorted list of the network interfaces with the highest inbound and outbound error rates and inbound and outbound discard rates.
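The "Interfaces in Up/Down state" tile lists interfaces that are administratively up and operationally down, that is, interfaces configured to be up that are not actually passing traffic. A minimal Python sketch of that filter follows. The `Interface` class, its field names, and the sample data are hypothetical; the two-status model mirrors the administrative and operational status found in standard SNMP interface data, which is an assumption about how the tile's data is sourced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interface:
    name: str
    admin_status: str  # configured (intended) state: "up" or "down"
    oper_status: str   # observed state: "up" or "down"


def up_down_interfaces(interfaces: List[Interface]) -> List[Interface]:
    """Keep interfaces that should be up but are currently down."""
    return [
        i for i in interfaces
        if i.admin_status == "up" and i.oper_status == "down"
    ]


# Hypothetical sample data.
sample = [
    Interface("eth0", "up", "up"),      # healthy
    Interface("eth1", "up", "down"),    # reported: configured up, observed down
    Interface("eth2", "down", "down"),  # intentionally disabled, not reported
]

flagged = up_down_interfaces(sample)
print([i.name for i in flagged])  # prints "['eth1']"
```

Interfaces that are administratively down are excluded because they were disabled on purpose and do not indicate a fault.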

Kubernetes Kubernetes

Explore ready-made dashboards owned by Kubernetes Kubernetes.

Kubernetes cluster

View resource utilization and scale for a Kubernetes cluster. Track CPU, memory, and pod utilization alongside requests commitment to understand capacity headroom.

The Kubernetes cluster dashboard contains the following sections and tiles:

  • CPU utilization Dot single value
  • Memory utilization Dot single value
  • Pod utilization Dot single value
  • CPU requests commitment Dot single value
  • Memory requests commitment Dot single value
  • CPU limits commitment Dot single value
  • Memory limits commitment Dot single value
  • CPU usage per namespace Dot area chart
  • CPU quota Dot table

Memory

  • Memory usage per namespace Dot area chart
  • Memory quota Dot table
  • Receive bandwidth Dot area chart
  • Transmit bandwidth Dot area chart
  • Rate of received packets dropped Dot area chart
  • Rate of transmitted packets dropped Dot area chart
  • Rate of received errors Dot area chart
  • Rate of transmitted errors Dot area chart
  • Average pod bandwidth by namespace: received Dot area chart
  • Average pod bandwidth by namespace: transmitted Dot area chart

Network

  • Network usage Dot table

Kubernetes monitoring statistics

Troubleshoot Dynatrace Kubernetes platform monitoring and Prometheus integration. Identify failing queries, high-latency endpoints, and error patterns across the monitoring stack.

The Kubernetes monitoring statistics dashboard contains the following sections and tiles:

Access type

  • Top endpoints average queries per minute Dot categorical chart

    Lists the endpoints of monitored Kubernetes clusters with the highest average number of API requests per minute.

  • Failing queries per minute Dot line chart

    Shows the number of failed API requests per minute to endpoints of monitored Kubernetes clusters.

  • Successful queries per minute Dot line chart

    Shows the number of successful API requests per minute to endpoints of monitored Kubernetes clusters in the last 2 hours.

  • Average latency successful queries Dot line chart

    Shows the average latency of successful API requests to endpoints of monitored Kubernetes clusters.

  • Failing queries Dot table

    Lists the endpoints of monitored Kubernetes clusters with the highest number of failed API requests.

  • Availability of in-cluster ActiveGates Dot table

    Lists the availability of ActiveGate workloads in monitored Kubernetes clusters.

Kubernetes namespace - pods

Analyze resource allocation of all pods within a Kubernetes namespace. View pod counts, CPU and memory utilization, and namespace contribution to overall cluster capacity.

The Kubernetes namespace - pods dashboard contains the following sections and tiles:

  • Cluster CPU utilization contribution Dot single value
  • Cluster memory utilization contribution Dot single value
  • Pods Dot single value
  • CPU requests utilization Dot single value
  • Memory requests utilization Dot single value
  • CPU limits utilization Dot single value
  • Memory limits utilization Dot single value
  • CPU usage per pod Dot area chart
  • CPU quota Dot table

Memory

  • Memory usage per pod Dot area chart
  • Memory quota Dot table
  • Receive bandwidth Dot area chart
  • Transmit bandwidth Dot area chart
  • Rate of received packets dropped Dot area chart
  • Rate of transmitted packets dropped Dot area chart
  • Rate of received errors Dot area chart
  • Rate of transmitted errors Dot area chart

Network

  • Network usage Dot table

Kubernetes namespace - workloads

Track CPU and memory usage distribution across workloads in a Kubernetes namespace. View resource quotas, usage per workload, and overall namespace resource distribution.

The Kubernetes namespace - workloads dashboard contains the following sections and tiles:

  • CPU usage per workload Dot area chart
  • CPU quota Dot table

Memory

  • Memory usage per workload Dot area chart
  • Memory quota Dot table
  • Usage overview Dot table
  • CPU usage Dot pie chart
  • Memory usage Dot pie chart
  • Receive bandwidth Dot area chart
  • Transmit bandwidth Dot area chart
  • Rate of received packets dropped Dot area chart
  • Rate of transmitted packets dropped Dot area chart
  • Rate of received errors Dot area chart
  • Rate of transmitted errors Dot area chart

Network

  • Network usage Dot table

Kubernetes node - pods

Understand how pods consume resources on a specific Kubernetes node. View CPU, memory, and pod utilization alongside requests-based utilization percentages.

The Kubernetes node - pods dashboard contains the following sections and tiles:

  • CPU utilization Dot single value
  • Memory utilization Dot single value
  • Pods utilization Dot single value
  • CPU utilization (requests) Dot single value
  • Memory utilization (requests) Dot single value
  • CPU utilization (limits) Dot single value
  • Memory utilization (limits) Dot single value
  • CPU usage per pod Dot area chart
  • CPU quota Dot table

Memory

  • Memory usage per pod Dot area chart
  • Memory quota Dot table

Kubernetes persistent volumes

Inspect utilization and capacity of persistent volume claims in your cluster. Track volume usage trends, usage changes over time, and storage distribution across namespaces.

The Kubernetes persistent volumes dashboard contains the following sections and tiles:

  • Volume usage (%) Dot line chart
  • Volume usage change Dot line chart
  • Volume usage change top Dot categorical chart
  • Volumes Dot table
  • Usage by namespace Dot pie chart
  • Capacity by namespace Dot pie chart

Logs Logs app

Explore ready-made dashboards owned by Logs Logs app.

Log ingest overview

Monitor log ingest volume, pipeline health, and storage statistics. Identify top log producers, ingest errors, and non-persisted records to keep your logging pipeline healthy.

The Log ingest overview dashboard contains the following sections and tiles:

  • Log volume per bucket Dot pie chart
  • Grail storage (Bytes) Dot single value

📈 Log ingest health

  • Ingest persistence errors count Dot line chart
  • Non persisted records before ingest pipeline Dot line chart
  • Log API - errors Dot line chart
  • Logs API - rejected records Dot line chart
  • Extension Rejected records Dot line chart
  • Rejected records count Dot line chart
  • Classic Log Processing Pipeline status Dot line chart
  • Classic Log Processing Pipeline Execution Errors Dot line chart
  • Classic Log Processing Pipeline executions Dot line chart
  • Records filtered out in Classic Log Processing Pipeline Dot line chart
  • Logs API - records count Dot line chart

Log ingest API - statistics and health

  • Log retention time Dot table

Extensions

  • Log ingest volume (Server) Dot line chart

Top 20 log producers by entity

  • OneAgent vs Log API Dot line chart

Log query usage and costs

Monitor log query volume and associated costs as the environment admin. View daily, weekly, and monthly query counts alongside billable usage to track spending trends.

The Log query usage and costs dashboard contains the following sections and tiles:

  • Log query count Dot bar chart
  • Yesterday (costs) Dot single value

Query volume statistics, trends, and costs

  • Last 7d (total) Dot single value
  • Last 28d (billable) Dot single value
  • Yesterday (billable) Dot single value
  • Last 7d (costs) Dot single value
  • Last 28d (costs) Dot single value

Log query usage and costs

  • Yesterday (total) Dot single value
  • Last 7d (billable) Dot single value
  • Last 28d (total) Dot single value
  • Daily query volume (Last 30d) Dot line chart
  • Current volumes per logs bucket included usage data in selected timeframe Dot table
  • Median query duration Dot single value
  • Weekly active users Dot single value
  • Current volumes per logs bucket Dot pie chart
  • Queries across Grail log buckets Dot donut chart

Logs in Context

  • Weekly number of users by app Dot bar chart
  • Users Dot single value
  • Queries per user Dot single value
  • Users Dot single value
  • Weekly number of users by app Dot bar chart
  • Queries per user Dot single value
  • Users Dot single value
  • Assessment of optimization opportunity Dot single value
  • Weekly number of users by app Dot bar chart
  • Top 5 most used apps by query volume Dot donut chart

Microsoft Defender Cloud

Explore ready-made dashboards owned by Microsoft Defender Cloud.

Container Scan Events Coverage

Identify coverage gaps in container image scanning from Microsoft Defender. View scan coverage by product and see the latest 50 scan events across registries, repositories, and images.

The Container Scan Events Coverage dashboard contains the following sections and tiles:

Coverage report for container image scan events

  • Container image coverage by product Dot categorical chart
  • Registries Dot single value
  • Container repositories Dot single value
  • Container images Dot single value
  • Scanning products Dot single value

Coverage overview

  • Scan events over time by product Dot bar chart
  • Total scan events Dot single value
  • Repository coverage based on products and number of scans Dot table

Container Vulnerability Findings

Visualize Microsoft Defender container vulnerability findings by risk level. Break down critical and high findings by registry and repository to prioritize remediation.

The Container Vulnerability Findings dashboard contains the following sections and tiles:

Container vulnerability findings

  • Number of critical findings by registry Dot donut chart
  • Critical risk Dot single value
  • High risk Dot single value
  • Number of critical findings by repository Dot donut chart
  • Number of vulnerabilities by risk Dot donut chart
  • Affected registries Dot single value
  • Container repositories Dot single value
  • Container images Dot single value
  • Vulnerable components Dot single value

Vulnerabilities by risk

  • Medium risk Dot single value
  • Vulnerability findings over time by provider Dot bar chart

Top 10 affected registries by number of critical findings

  • Total ingested findings Dot single value

Runtime contextualization of container findings for alert reduction

Reduce container alert noise by correlating Microsoft Defender vulnerability findings with runtime context. View which findings affect running containers versus only repositories.

The Runtime contextualization of container findings for alert reduction dashboard contains the following sections and tiles:

Runtime contextualization of container findings for alert reduction

  • Critical risk Dot single value
  • High risk Dot single value
  • Number of vulnerabilities by risk Dot donut chart
  • Medium risk Dot single value
  • Percentage of vulnerabilities by funnel stage Dot categorical chart

Top 10 vulnerabilities

  • Critical risk Dot single value
  • High risk Dot single value
  • Medium risk Dot single value
  • Number of vulnerabilities by risk Dot donut chart
  • Critical risk Dot single value
  • Medium risk Dot single value
  • High risk Dot single value

Vulnerabilities in running containers

  • Number of vulnerabilities by risk Dot donut chart

Vulnerabilities in production containers

  • Container images in registries Dot single value
  • Container images in runtime Dot single value
  • Container images in production Dot single value

Security findings

Overview of security findings from Microsoft Defender by risk level. View affected objects and the latest 50 findings to focus remediation on the highest-risk issues.

The Security findings dashboard contains the following sections and tiles:

Security findings

  • Critical Dot single value
  • High Dot single value
  • Number of unique findings by risk Dot donut chart
  • Critical Dot single value

Findings by risk

  • Medium Dot single value
  • Findings over time by provider Dot bar chart
  • High Dot single value

Latest 50 security findings

  • Findings by type Dot categorical chart
  • Top 10 object types by risk Dot categorical chart
  • Top 10 products by risk Dot categorical chart
  • Medium Dot single value
  • Number of objects by risk Dot donut chart
  • Top 10 findings by risk and number of affected objects Dot table
  • Top 10 affected objects by number of findings Dot table

Affected runtime entities

  • Top 10 vulnerable host entities by finding criticality Dot table
  • Number of host entities by risk Dot donut chart
  • Top 10 vulnerable container workloads by finding criticality Dot table
  • Number of container workloads by risk Dot donut chart
  • Total ingested findings Dot single value
  • Number of cloud entities by risk Dot donut chart
  • Top 10 vulnerable cloud entities by finding criticality Dot table

Security product coverage

View security product coverage and scan event ingestion from Microsoft Defender for Cloud. Track reporting providers, event counts over time, and runtime coverage of hosts and container workloads.

The Security product coverage dashboard contains the following sections and tiles:

Coverage overview

  • Security events per top 10 products Dot categorical chart
  • Ingested finding events by provider over time Dot bar chart
  • Scan events Dot single value
  • Reporting providers Dot single value
  • Ingested scan events over time Dot bar chart
  • Finding events Dot single value
  • Security events by object coverage per product Dot table
  • Security events by findings number per object type Dot table

Runtime entity coverage: Hosts

  • Security events per top 10 object types Dot categorical chart

Runtime entity coverage: Container workloads

  • Container workload coverage Dot donut chart
  • Host coverage by product Dot table
  • Host coverage Dot donut chart
  • Container workload coverage by product Dot table
  • Last 10 covered hosts Dot table
  • Last 10 covered container workloads Dot table

Runtime entity coverage: Cloud entities

  • Last 10 covered cloud entities Dot table
  • Cloud entity coverage by product Dot table
  • Cloud entity coverage Dot donut chart

Vulnerability Findings

Visualize Microsoft Defender vulnerability findings by risk level. Identify top vulnerable components, affected objects, and the spread of critical and high findings by object type.

The Vulnerability Findings dashboard contains the following sections and tiles:

Vulnerability findings

  • Top 10 vulnerabilities by risk and number of affected objects Dot table
  • Critical Dot single value
  • High Dot single value
  • High & critical findings by object type Dot categorical chart
  • Number of vulnerabilities by risk Dot donut chart
  • Objects with top risk Dot single value

Affected objects and components by risk

  • Top 10 vulnerable components by finding criticality Dot table
  • Top 10 affected objects by finding criticality Dot table

Vulnerabilities by risk

  • Medium Dot single value
  • Vulnerability findings over time by provider Dot bar chart

Latest 50 security findings

  • Total ingested findings Dot single value
  • Number of components by risk Dot donut chart
  • Components with top risk Dot single value

Affected runtime entities

  • Top 10 vulnerable host entities by finding criticality Dot table
  • Top 10 vulnerable container workloads by finding criticality Dot table
  • Number of host entities by risk Dot donut chart
  • Number of container workloads by risk Dot donut chart
  • Number of affected objects by risk Dot donut chart
  • Top 10 high & critical findings by affected objects Dot table

Summary of critical and high-risk findings

  • Top 10 affected repositories by finding criticality Dot table

Top affected repositories by number of critical findings

  • Number of affected repositories by risk Dot donut chart
  • Repositories with top risk Dot single value

Microsoft Sentinel

Explore ready-made dashboards owned by Microsoft Sentinel.

Security findings

Overview of Microsoft Sentinel security findings by risk level. View affected objects and the latest 50 findings to focus remediation on the highest-risk issues.

The Security findings dashboard contains the following sections and tiles:

Security findings

  • Critical Dot single value
  • High Dot single value
  • Number of unique findings by risk Dot donut chart
  • Critical Dot single value

Findings by risk

  • Medium Dot single value
  • Findings over time by product Dot bar chart
  • High Dot single value

Latest 50 security findings

  • Findings by type Dot categorical chart
  • Top 10 object types by risk Dot categorical chart
  • Top 10 products by risk Dot categorical chart
  • Medium Dot single value
  • Number of objects by risk Dot donut chart
  • Top 10 findings by risk and number of affected objects Dot table
  • Top 10 affected objects by number of findings Dot table

Affected runtime entities

  • Top 10 vulnerable host entities by finding criticality Dot table
  • Number of host entities by risk Dot donut chart
  • Top 10 vulnerable container entities by finding criticality Dot table
  • Number of container entities by risk Dot donut chart
  • Total ingested findings Dot single value

Security product coverage

View security product coverage and scan event ingestion from Microsoft Sentinel. Track reporting products, scan event counts over time, and runtime coverage of hosts and container workloads.

The Security product coverage dashboard contains the following sections and tiles:

Coverage overview

  • Security events per top 10 products Dot categorical chart
  • Ingested finding events over time Dot bar chart
  • Scan events Dot single value
  • Reporting products Dot single value
  • Ingested scan events over time Dot bar chart
  • Finding events Dot single value
  • Security events by object coverage per product Dot table
  • Security events by findings number per object type Dot table

Runtime entity coverage: Hosts

  • Security events per top 10 object types Dot categorical chart

Runtime entity coverage: Container workloads

  • Monitored container workload coverage Dot donut chart
  • Host coverage by product Dot table
  • Monitored hosts scan coverage Dot donut chart
  • Container workload coverage by product Dot table
  • Covered monitored hosts Dot table
  • Covered monitored container workloads Dot table

OpenPipeline

Explore ready-made dashboards owned by OpenPipeline.

OpenPipeline usage overview

View data ingest volumes and pipeline activity for OpenPipeline. Compare OpenPipeline versus classic pipeline usage, track records by Grail bucket, and monitor configuration changes.

The OpenPipeline usage overview dashboard contains the following sections and tiles:

  • Incoming records Dot line chart

    Check the number of incoming records by configuration (logs, spans, metrics, …). Identify unexpected increases or decreases in incoming data.

  • Ratio of records by Grail bucket Dot area chart

    See where records are stored within Grail.

  • Logs OpenPipeline vs. classic processing pipeline Dot area chart

Data Ingest via OpenPipeline

  • Config changes Dot table

  • Stored records in % Dot line chart

    Understand if a configuration change resulted in an unexpected increase or decrease in stored records.

Ingest via OpenPipeline vs. classic pipeline

  • Business events OpenPipeline vs. classic processing pipeline Dot area chart
  • Logs Dot single value
  • Metrics Dot single value
  • Spans Dot single value
  • Total events Dot single value

Ingest analysis per configuration: $Configuration

  • Ratio of records by ingest source Dot area chart

    See through which ingest sources records come into OpenPipeline.

  • Ratio of records by route name Dot area chart

    See where records are routed to. For logs and business events, records that go to the classic pipeline use the route "default".

  • Ratio of records by pipeline Dot area chart

    See through which pipelines records come into OpenPipeline.

  • Not stored records Dot line chart

    Check the number of discarded records by configuration (logs, spans, metrics, …). Records can be discarded by intentionally dropping them, by not persisting them in storage, or because the data is invalid.

Analysis of yesterday's ingested data

  • Share of not stored records per pipeline Dot line chart

    See how many records are persisted in each pipeline during the routing phase. Note that records can also be dropped in the ingest source, which is not visible in this chart.

  • Share of not stored records by reason Dot line chart

    See the ratio of records not stored to records ingested, split by the reason for not storing. The reason can be not_persisted, intentionally_dropped, not_valid, internal_error, or buffer_overflow if the record is too large.

  • Ingested records per configuration Dot line chart

  • Total ingested records per configuration Dot area chart

  • Total events per type Dot honeycomb chart
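
The "Share of not stored records by reason" tile above describes a ratio of records not stored to records ingested, split by reason. The following is a minimal Python sketch of that arithmetic, not the dashboard's actual query. The function name `not_stored_share` and the sample counts are hypothetical; only the reason names come from the tile description.

```python
from typing import Dict


def not_stored_share(ingested: int, not_stored_by_reason: Dict[str, int]) -> Dict[str, float]:
    """Return, per reason, the share of ingested records that were not stored (0.0 to 1.0)."""
    if ingested <= 0:
        # Nothing was ingested, so report a zero share for every reason.
        return {reason: 0.0 for reason in not_stored_by_reason}
    return {reason: count / ingested for reason, count in not_stored_by_reason.items()}


# Hypothetical sample: 1,000 ingested records, 50 of them not stored.
shares = not_stored_share(1000, {"intentionally_dropped": 40, "not_valid": 10})
```

With these assumed numbers, 4% of ingested records were intentionally dropped and 1% were invalid.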

OpenTelemetry

Explore ready-made dashboards owned by OpenTelemetry.

OpenTelemetry Collector - all Collectors

View the status and throughput of all connected OpenTelemetry Collectors. Track active collectors, request counts, spans, metrics, and network traffic across the collector fleet.

The OpenTelemetry Collector - all Collectors dashboard contains the following sections and tiles:

OpenTelemetry Collector status

  • Active Collectors (24h) Dot table

    This tile lists all OpenTelemetry Collector instances that have recently sent data to Dynatrace.

Memory and CPU time per collector instance

  • Request count totals Dot line chart

    This tile shows a timeseries of the HTTP request count of the OpenTelemetry Collectors. Note: Future versions of this dashboard will not include deprecated semantic conventions such as rpc.server.duration and rpc.client.duration. Please update your Collector to a version which uses rpc.server.call.duration and rpc.client.call.duration such as the Dynatrace distribution of the OpenTelemetry Collector v0.45.0 or later.

Telemetry data passing through collectors

  • Span totals Dot table

    This tile shows how many spans have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the OpenTelemetry Collectors.

  • Active Collectors (2m) Dot single value

    This tile shows the number of OpenTelemetry Collectors that have sent data to Dynatrace within the last two minutes, and are therefore considered active.

  • Total collectors (24h) Dot single value

    This tile shows the number of OpenTelemetry Collectors that have sent data to Dynatrace within the last 24 hours.

  • Span totals Dot line chart

    This tile shows a timeseries of all spans that have passed through the OpenTelemetry Collectors.

  • Metric datapoint totals Dot line chart

    This tile shows a timeseries of all metric datapoints that have passed through the OpenTelemetry Collectors.

  • Log totals Dot line chart

    This tile shows a timeseries of all logs that have passed through the OpenTelemetry Collectors.

  • Metric datapoint totals Dot table

    This tile shows how many metric datapoints have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the OpenTelemetry Collectors.

  • Log record totals Dot table

    This tile shows how many logs have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the OpenTelemetry Collectors.

  • Top 5 collectors by resident set size (last 10m) Dot table

    This tile shows the top 5 OpenTelemetry Collectors ordered by their resident set size.

  • Top 5 collectors by otelcol_process_cpu_seconds (last 10m) Dot table

    This tile shows the top 5 OpenTelemetry Collectors ordered by their CPU time.

  • Request size average Dot line chart

    This tile shows a timeseries of the average HTTP request size of the OpenTelemetry Collectors.

  • Request duration average Dot line chart

    This tile shows a timeseries of the average HTTP request duration of the OpenTelemetry Collectors. Note: Future versions of this dashboard will not include deprecated semantic conventions such as rpc.server.duration and rpc.client.duration. Please update your Collector to a version which uses rpc.server.call.duration and rpc.client.call.duration such as the Dynatrace distribution of the OpenTelemetry Collector v0.45.0 or later.

Network traffic

  • Requests by collector instance Dot table

    This tile shows the total incoming and outgoing requests to and from each collector instance. Note: Future versions of this dashboard will not include deprecated semantic conventions such as rpc.server.duration and rpc.client.duration. Please update your Collector to a version which uses rpc.server.call.duration and rpc.client.call.duration such as the Dynatrace distribution of the OpenTelemetry Collector v0.45.0 or later.

  • HTTP requests from the collector, by status code Dot table

    This tile lists the number of HTTP requests sent by the OpenTelemetry Collectors by their status code.

  • Total physical memory (resident set size) Dot line chart

    This tile shows a timeseries of the memory consumption of each OpenTelemetry Collector.

  • Total CPU user and system time in seconds Dot line chart

    This tile shows a timeseries of the CPU user and system time of each OpenTelemetry Collector.
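
The "Active Collectors (2m)" and "Total collectors (24h)" tiles both count Collectors that sent data within a time window. As a rough illustration of that idea, the Python sketch below counts instances by their last-seen timestamp. The function `count_seen_within`, the collector names, and the timestamps are hypothetical and are not taken from the dashboard's query.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def count_seen_within(last_seen: Dict[str, datetime], now: datetime, window: timedelta) -> int:
    """Count collectors whose most recent data point falls inside the window ending at `now`."""
    return sum(1 for ts in last_seen.values() if timedelta(0) <= now - ts <= window)


# Hypothetical last-seen timestamps for three collector instances.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "collector-a": now - timedelta(seconds=30),
    "collector-b": now - timedelta(hours=3),
    "collector-c": now - timedelta(days=2),
}

active_2m = count_seen_within(last_seen, now, timedelta(minutes=2))
total_24h = count_seen_within(last_seen, now, timedelta(hours=24))
```

In this sample, only `collector-a` counts as active, while `collector-a` and `collector-b` both fall inside the 24-hour window.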

OpenTelemetry Collector - single Collector

Drill into the performance of a single OpenTelemetry Collector. Monitor request counts, span and metric datapoint throughput, log totals, HTTP traffic, and queue size.

The OpenTelemetry Collector - single Collector dashboard contains the following sections and tiles:

Memory and CPU time

  • Request count Dot line chart

    This tile shows a timeseries of the incoming HTTP request count of the OpenTelemetry Collector.

Telemetry data passing through the collector

  • Span totals Dot table

    This tile shows how many spans have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the selected OpenTelemetry Collector instance.

  • Span totals Dot line chart

    This tile shows a timeseries of all spans that have passed through the selected OpenTelemetry Collector instance.

  • Metric datapoint totals Dot line chart

    This tile shows a timeseries of all metric datapoints that have passed through the selected OpenTelemetry Collector instance.

  • Log totals Dot line chart

    This tile shows a timeseries of all logs that have passed through the selected OpenTelemetry Collector instance.

  • Metric datapoint totals Dot table

    This tile shows how many metric datapoints have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the selected OpenTelemetry Collector instance.

  • Log record totals Dot table

    This tile shows how many logs have been accepted/refused by the receivers, and how many have been sent/failed by the exporters of the selected OpenTelemetry Collector instance.

  • Request size Dot line chart

    This tile shows a timeseries of the average incoming HTTP request size of the OpenTelemetry Collector.

  • Request duration Dot line chart

    This tile shows a timeseries of the average incoming HTTP request duration of the OpenTelemetry Collector.

HTTP incoming

  • Total physical memory (resident set size) Dot line chart

    This tile shows a timeseries of the memory consumption of the OpenTelemetry Collector.

  • Total CPU user and system time Dot line chart

    This tile shows a timeseries of the CPU user and system time of the OpenTelemetry Collector.

Queue size metrics

  • Exporter current queue size Dot line chart

    This tile shows a timeseries of the current exporter queue size of the OpenTelemetry Collector.

  • Exporter queue capacity Dot line chart

    This tile shows a timeseries of the exporter queue capacity of the OpenTelemetry Collector.

Batch metrics

  • Batch size (items) Dot line chart

    This tile shows a timeseries of the batch size (in items) of the OpenTelemetry Collector.

HTTP outgoing

  • Request count Dot line chart

    This tile shows a timeseries of the outgoing HTTP request count of the OpenTelemetry Collector.

  • Request size Dot line chart

    This tile shows a timeseries of the average outgoing HTTP request size of the OpenTelemetry Collector.

  • Request duration Dot line chart

    This tile shows a timeseries of the average outgoing HTTP request duration of the OpenTelemetry Collector.

RPC incoming

  • Request count Dot line chart

    This tile shows a timeseries of the incoming RPC request count of the OpenTelemetry Collector.

  • Request duration Dot line chart

    This tile shows a timeseries of the average incoming RPC request duration of the OpenTelemetry Collector.

RPC outgoing

  • Request count Dot line chart

    This tile shows a timeseries of the outgoing RPC request count of the OpenTelemetry Collector.

  • Request duration Dot line chart

    This tile shows a timeseries of the average outgoing RPC request duration of the OpenTelemetry Collector.

  • Batch size (bytes) Dot line chart

    This tile shows a timeseries of the batch size (in bytes) of the OpenTelemetry Collector.

  • Batch processor send trigger Dot line chart

    This tile shows a timeseries of the batch processor send triggers of the OpenTelemetry Collector.

OpenTelemetry K8s Cluster

Overview of Kubernetes cluster performance based on OpenTelemetry data. Track CPU, memory, pod utilization, and requests commitment across nodes, pods, and containers.

The OpenTelemetry K8s Cluster dashboard contains the following sections and tiles:

  • CPU Utilization Dot single value

    CPU utilization on cluster

  • Memory Utilization Dot single value

    Memory utilization on cluster

  • Pod Utilization Dot single value

    Pod utilization on cluster

  • CPU Requests Commitment Dot single value

    CPU requests commitment on cluster

  • Memory Requests Commitment Dot single value

    Memory requests commitment on cluster

  • CPU Limits Commitment Dot single value

    CPU limits commitment on cluster

  • Memory Limits Commitment Dot single value

    Memory limits commitment on cluster

  • CPU Usage per Namespace Dot area chart

    CPU usage per namespace on cluster

  • CPU Quota Dot table

    CPU quota per namespace on cluster

Memory

  • Memory Usage per Namespace Dot area chart

    Memory usage per namespace on cluster

  • Memory Quota Dot table

    Memory quota per namespace on cluster

  • Receive Bandwidth Dot area chart

    Network receive bandwidth per namespace on cluster

  • Transmit Bandwidth Dot area chart

    Network transmit bandwidth per namespace on cluster

  • Rate of Received Errors Dot area chart

    Rate of received errors per namespace on cluster

  • Rate of Transmitted Errors Dot area chart

    Rate of transmitted errors per namespace on cluster

  • Average Pod Bandwidth by Namespace: Received Dot area chart

    Average pod receive bandwidth per namespace on cluster

  • Average Pod Bandwidth by Namespace: Transmitted Dot area chart

    Average pod transmit bandwidth per namespace on cluster

Network

  • Network Usage Dot table

    Network usage per namespace on cluster

Cluster: $Cluster

  • Nodes Dot single value

    Number of nodes on cluster

  • Namespaces Dot single value

    Number of namespaces on cluster

  • Pods Dot single value

    Number of pods on cluster

  • Containers Dot single value

    Number of containers on cluster

  • Workloads Dot single value

    Number of workloads on cluster

  • Warning Events Dot single value

    Number of warning events on cluster

  • Node condition Dot categorical chart

    Number of active node conditions on cluster

  • Pod phase Dot categorical chart

    Number of pod phases on cluster

OpenTelemetry K8s Namespace - Pods

View pod resource allocation in a Kubernetes namespace using OpenTelemetry data. Track CPU and memory requests utilization and namespace contribution to cluster capacity.

The OpenTelemetry K8s Namespace - Pods dashboard contains the following sections and tiles:

  • Cluster CPU Utilization Contribution Dot single value

    Percentage of how much the CPU usage of this namespace contributes to the overall CPU usage

  • Cluster Memory Utilization Contribution Dot single value

    Percentage of how much the memory usage of this namespace contributes to the overall memory usage

  • Pods Dot single value

    Number of pods in the namespace

  • CPU Requests Utilization Dot single value

    Percentage of current CPU usage compared to the CPU resource requests in the namespace

  • Memory Requests Utilization Dot single value

    Percentage of current memory usage compared to the memory resource requests in the namespace

  • CPU Limits Utilization Dot single value

    Percentage of current CPU usage compared to the CPU resource limits in the namespace

  • Memory Limits Utilization Dot single value

    Percentage of current memory usage compared to the memory resource limits in the namespace

  • CPU Usage per Pod Dot area chart

    CPU usage of every pod in the namespace

  • CPU Quota Dot table

    CPU usage of every pod in the namespace, with CPU requests and limits and their usage

Memory

  • Memory Usage per Pod Dot area chart

    Memory usage of every pod in the namespace

  • Memory Quota Dot table

    Memory usage of every pod in the namespace, with memory requests and limits and their usage

  • Receive Bandwidth Dot area chart

    Received network bandwidth per pod in the namespace

  • Transmit Bandwidth Dot area chart

    Transmitted network bandwidth per pod in the namespace

  • Rate of Received Errors Dot area chart

    Received network errors per pod in the namespace

  • Rate of Transmitted Errors Dot area chart

    Transmitted network errors per pod in the namespace

Network

  • Network Usage Dot table

    Current network bandwidth and errors per pod
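
The requests and limits utilization tiles of this dashboard express current usage as a percentage of the configured resource requests or limits. The Python sketch below shows that calculation under stated assumptions: the function `utilization_percent` and the sample values (in CPU cores) are hypothetical, and the dashboard's own query may aggregate differently.

```python
from typing import Optional


def utilization_percent(usage: float, reference: float) -> Optional[float]:
    """Return usage as a percentage of a reference value (requests or limits).

    Returns None when no positive reference is configured, because the
    percentage is undefined in that case.
    """
    if reference <= 0:
        return None
    return 100.0 * usage / reference


# Hypothetical namespace totals, in CPU cores.
cpu_usage = 0.25
cpu_requests = 0.5
cpu_limits = 1.0

requests_utilization = utilization_percent(cpu_usage, cpu_requests)
limits_utilization = utilization_percent(cpu_usage, cpu_limits)
```

With these assumed values, usage is 50% of the requests and 25% of the limits. A requests utilization above 100% would mean the namespace uses more than it requested.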

OpenTelemetry K8s Namespace - Workloads

Track workload CPU and memory usage in a Kubernetes namespace using OpenTelemetry data. View per-workload usage, resource quotas, and overall namespace resource distribution.

The OpenTelemetry K8s Namespace - Workloads dashboard contains the following sections and tiles:

  • CPU Usage per Workload Dot area chart

    CPU usage amounts per workload in the selected namespace.

  • CPU Quota Dot table

    CPU usage of every workload in the namespace, with CPU requests and limits and their usage

Memory

  • Memory Usage per Workload Dot area chart

    Memory usage amounts per workload in the selected namespace.

  • Memory Quota Dot table

    Memory usage of every workload in the namespace, with memory requests and limits and their usage

  • Usage Overview Dot table

    Overview of CPU and memory usage in the namespace, split by workload type.

  • CPU Usage Dot pie chart

    Percentage of CPU usage in the namespace per workload type.

  • Memory Usage Dot pie chart

    Percentage of memory usage in the namespace per workload type.

  • Receive Bandwidth Dot area chart

    Bytes received by each workload

  • Transmit Bandwidth Dot area chart

    Bytes transmitted by each workload

  • Rate of Received Errors Dot area chart

    Errors per second when receiving data in each workload.

  • Rate of Transmitted Errors Dot area chart

    Errors per second when transmitting data in each workload.

Network

  • Network Usage Dot table

    Overview of transmitted and received data in each workload.

OpenTelemetry K8s Node - Pods

View pod resource consumption on a specific Kubernetes node using OpenTelemetry data. Track CPU, memory, and pod utilization alongside requests-based utilization percentages.

The OpenTelemetry K8s Node - Pods dashboard contains the following sections and tiles:

  • CPU Utilization Dot single value

    Percentage of current CPU usage for the node compared to the allocatable amount of CPUs

  • Memory Utilization Dot single value

    Percentage of current memory usage for the node compared to the allocatable amount of memory

  • Pods Utilization Dot single value

    Percentage of current number of pods on the node compared to the allocatable number of pods

  • CPU Utilization (Requests) Dot single value

    Percentage of current CPU resource requests for the node compared to the allocatable amount of CPUs

  • Memory Utilization (Requests) Dot single value

    Percentage of current memory resource requests for the node compared to the allocatable amount of memory

  • CPU Utilization (Limits) Dot single value

    Percentage of current CPU resource limits for the node compared to the allocatable amount of CPUs

  • Memory Utilization (Limits) Dot single value

    Percentage of current memory resource limits for the node compared to the allocatable amount of memory

  • CPU Usage per Pod Dot area chart

    CPU usage of every pod on the node

  • CPU Quota Dot table

    CPU usage of every pod on the node, with CPU requests and limits and their usage

Memory

  • Memory Usage per Pod Dot area chart

    Memory usage of every pod on the node

  • Memory Quota Dot table

    Memory usage of every pod on the node, with memory requests and limits and their usage

OpenTelemetry K8s Persistent Volumes

Inspect persistent volume utilization in a Kubernetes cluster using OpenTelemetry data. Track volume usage trends, capacity by namespace, and changes over time.

The OpenTelemetry K8s Persistent Volumes dashboard contains the following sections and tiles:

  • Volume Usage (%) Dot line chart

    Volumes memory usage percentage of the capacity

  • Volume Usage Change Dot line chart

    Volumes memory usage change

  • Volumes Dot table

    Volumes memory usage, capacity and availability

  • Usage by Namespace Dot pie chart

    Volumes memory usage by namespace

  • Capacity by Namespace Dot pie chart

    Volumes memory capacity by namespace

xSPM Security Posture Management

Explore ready-made dashboards owned by xSPM Security Posture Management.

Security Posture overview

View compliance findings from the latest assessment across your environment. Track assessed systems, resource counts, compliance rules, and passing rates by compliance standard.

The Security Posture overview dashboard contains the following sections and tiles:

  • Systems Dot single value
  • Assessed configurations Dot single value

Top 50 compliance findings

  • System types Dot donut chart
  • Assessed resources Dot single value

Services Services app

Explore ready-made dashboards owned by Services Services app.

Endpoint Cardinality Dashboard

Find services with high numbers of distinct endpoint names, which can indicate volatile URL patterns or misconfigured endpoint detection. View the top 10 services by endpoint cardinality.

The Endpoint Cardinality Dashboard dashboard contains the following sections and tiles:

  • Maximum distinct endpoint names for one service Dot single value

About this dashboard

  • Top 10 maximum distinct endpoints per service Dot table

    Max distinct destination count for one Service (publish)
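
Endpoint cardinality here means the number of distinct endpoint names observed for a service. As a small illustration, the Python sketch below counts distinct endpoint names per service and returns the top entries. The function name, the tuple layout, and the sample data are hypothetical; the dashboard itself computes this from Dynatrace data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def top_services_by_endpoint_cardinality(
    observations: Iterable[Tuple[str, str]], limit: int = 10
) -> List[Tuple[str, int]]:
    """Rank services by their number of distinct endpoint names, highest first."""
    endpoints: Dict[str, Set[str]] = defaultdict(set)
    for service, endpoint in observations:
        endpoints[service].add(endpoint)
    # Sort by descending cardinality, then by service name for a stable order.
    ranked = sorted(
        ((service, len(names)) for service, names in endpoints.items()),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:limit]


# Hypothetical (service, endpoint name) observations.
sample = [
    ("checkout", "/cart/1"),
    ("checkout", "/cart/2"),
    ("checkout", "/cart/1"),
    ("search", "/search"),
]
ranking = top_services_by_endpoint_cardinality(sample)
```

In the sample, `checkout` has two distinct endpoint names because the ID is part of the URL, which is the kind of volatile pattern this dashboard is meant to surface.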

Messaging Destination Dashboard

Find services with high numbers of distinct messaging destinations, which can indicate volatile or temporary queue detection. View the top 10 services by destination cardinality.

The Messaging Destination Dashboard dashboard contains the following sections and tiles:

  • Top 10 maximum distinct destinations per service Dot table

  • Max distinct destinations for one service Dot single value

    Max distinct destination count for one Service (publish)

  • Max distinct destinations for one service Dot single value

    Max distinct destination count for one Service (publish)

  • Max distinct destinations for one service Dot single value

    Max distinct destination count for one Service (publish)

  • Top 10 maximum distinct destinations per service Dot table

  • Top 10 maximum distinct destinations per service Dot table

Synthetic Synthetic app

Explore ready-made dashboards owned by Synthetic Synthetic app.

Synthetic network availability monitoring

Monitor ICMP, TCP, and DNS synthetic checks. View availability, round-trip times, and the top monitors with lowest availability to detect network connectivity issues.

The Synthetic network availability monitoring dashboard contains the following sections and tiles:

  • ICMP monitor availability and performance by locations Dot table

  • ICMP monitors availability Dot single value

  • Average round-trip time trends by locations (7 days) Dot line chart

    Shows the performance trends of network targets across various synthetic locations over the past 7 days. Use this chart to identify periods of degraded performance or improvement, understand normal behavior for each location, and compare values with the availability and performance data from other sections.

  • Top $TopLimit ICMP monitors with lowest availability Dot table

  • ICMP monitor availability & round-trip time trends Dot line chart

Synthetic Network Monitors Health & Performance

  • ICMP monitor executions Dot bar chart

  • TCP monitors availability Dot single value

  • TCP monitor availability & connection time trends Dot line chart

  • TCP monitor executions Dot bar chart

  • DNS monitor availability & resolution time trends Dot line chart

  • DNS monitors availability Dot single value

  • DNS monitor executions Dot bar chart

  • Top $TopLimit TCP request targets with the lowest availability Dot table

  • TCP monitors availability and performance by locations Dot table

  • Top $TopLimit TCP monitors with lowest availability Dot table

  • Average TCP connection time for top $TopLimit request targets (7 days) Dot line chart

    This chart tracks the TCP connection time for various monitored targets across multiple locations over the past 7 days. Connection time represents the total time taken to establish a TCP connection.

DNS monitors

  • Top $TopLimit DNS request targets with the lowest availability Dot table
  • DNS monitor availability and performance by location Dot table
  • Top $TopLimit DNS monitors with the lowest availability Dot table
  • Total ICMP monitors Dot single value
  • Total targets Dot single value
  • Total locations Dot single value
  • Top $TopLimit ICMP request targets with the lowest availability Dot table

ICMP request targets

  • Average round-trip time trends for top $TopLimit request targets (7 days) Dot line chart

    This chart visualizes the Round-Trip Time (RTT) trends for key monitored targets over the past 7 days. The RTT measures the time taken for an ICMP request to travel to the target and back. Lower RTT values reflect faster response times and a healthier network connection.

  • Total TCP monitors Dot single value

  • Total DNS monitors Dot single value

TCP request targets

  • Average DNS resolution time trends for top $TopLimit request targets (7 days) Dot line chart

    This chart tracks the DNS resolution time for different request targets (domains/hostnames) across various locations over the past 7 days. DNS resolution time measures how long it takes for a DNS server to convert a domain name into its corresponding IP address, which directly affects how quickly users can access services.

DNS request targets

  • Failure status code distribution Dot bar chart
  • Status code statistics Dot donut chart

DNS status codes overview

  • Failure status code distribution Dot bar chart
  • Status code statistics Dot donut chart

TCP status codes

  • Failure status code distribution Dot bar chart

  • Status code statistics Dot donut chart

  • ICMP monitors availability Dot honeycomb chart

  • TCP monitors availability Dot honeycomb chart

  • DNS monitors availability Dot honeycomb chart

  • Average round-trip time trends (7 days) for top $TopLimit monitors Dot line chart

  • Average TCP connection time trends for top $TopLimit monitors (7 days) Dot table

  • Average TCP connection time trends by locations (7 days) Dot line chart

    Shows the performance trends of network targets across various synthetic locations over the past 7 days. Use this chart to identify periods of degraded performance or improvement, understand normal behavior for each location, and compare values with the availability and performance data from other sections.

  • Average resolution time trends for top $TopLimit monitors (7 days) Dot line chart

  • Average resolution time trends by locations (7 days) Dot line chart

    Shows the performance trends of network targets across various synthetic locations over the past 7 days. Use this chart to identify periods of degraded performance or improvement, understand normal behavior for each location, and compare values with the availability and performance data from other sections.

  • Network monitor overview Dot honeycomb chart

  • Network monitors with problems Dot single value

  • Top $TopLimit most recent problems Dot table

Availability and performance

  • Network monitor problem types Dot pie chart

Synthetic web availability and performance

Monitor HTTP and browser synthetic checks by availability and response time. View the top monitors with lowest availability and request duration trends by location.

The Synthetic web availability and performance dashboard contains the following sections and tiles:

  • HTTP monitor availability and performance by locations Dot table

  • HTTP monitor availability Dot single value

  • Browser monitor availability Dot single value

  • Average HTTP request duration by locations (7 days) Dot line chart

  • Top $TopLimit HTTP monitors with lowest availability Dot table

  • Browser monitor availability and performance by locations Dot table

    This section highlights the availability and average event duration of browser monitors from various synthetic locations across the globe. Availability represents the percentage of time that monitors from each location are operational, while event duration indicates the average response time for browser activities.

  • Top $TopLimit browser monitors with lowest availability Dot table

    This table offers a breakdown of the availability and event duration of individual browser monitors, helping to quickly identify monitors that may need attention due to extended downtime or slower response times.

  • Average browser monitor event duration by locations (7 days) Dot line chart

    This time-series graph displays performance trends of different locations over the last 7 days, specifically highlighting how event durations evolve over time. This chart allows you to detect any unusual spikes or drops in performance at various locations. Identifying these trends can assist in diagnosing intermittent issues or recent websites dis

  • Browser monitor duration & availability Dot line chart

  • HTTP monitor duration & availability trends Dot line chart

Browser monitors

  • HTTP executions Dot bar chart

    The number of monitor runs signifies the health and accuracy of the monitoring system, offering transparency into service stability.

  • Browser executions Dot bar chart

    The number of monitor runs signifies the health and accuracy of the monitoring system, offering transparency into service stability.

  • Top $TopLimit frontends with lowest availability Dot table

HTTP status code insights

  • Status code statistics Dot donut chart

    This table displays the total number of executions for each status code. It helps quantify how frequently certain HTTP responses occur.

Frontends

  • Average browser monitor event duration by frontends (7 days) for top $TopLimit monitors Dot line chart
  • Browser monitor locations Dot single value
  • Frontends Dot single value
  • Services Dot single value

Synthetic Web Monitors Health & Performance

  • Unsuccessful HTTP status code distribution Dot bar chart

    This chart shows the percentage breakdown of non-200 status codes over time. Each color represents a specific status code, helping you visualize how often errors (like 401 Unauthorized or 403 Forbidden) or redirects (like 302 Found) occur in relation to successful requests.

  • Failure distribution caused by server interactions Dot bar chart

    This section analyzes the failures of browser monitors attributed to server interactions. An increase in these failures may indicate underlying issues with the application server or the IT infrastructure. Monitoring these trends is crucial for identifying potential bottlenecks and ensuring optimal performance.

Status codes overview

  • Failure distribution caused by page interactions Dot bar chart

    This section examines failures in browser monitors related to page interactions. An increase in these failures may indicate issues with the website’s functionality, suggesting the need to adjust monitoring scripts in response to UI changes or potential problems accessing specific elements on the page. Proactively addressing these issues can enhance

  • Top $TopLimit HTTP requests availability and performance Dot table

HTTP requests

  • Average browser monitor event duration trends (7 days) for top $TopLimit monitors Dot line chart

    This performance trends graph focuses on browser monitors over the past 7 days, showing how event durations for specific monitors change over time.

  • Average HTTP request duration trends (7 days) for top $TopLimit monitors Dot line chart

HTTP monitors across synthetic locations

  • Browser monitor overview Dot honeycomb chart
  • HTTP monitor overview Dot honeycomb chart

Availability and performance

  • Top $TopLimit most recent problems Dot table
  • Browser monitors with problems Dot single value
  • HTTP monitors availability Dot honeycomb chart
  • Browser monitors availability Dot honeycomb chart
  • Total HTTP monitors Dot single value
  • Total browser monitors Dot single value
  • HTTP monitors with problems Dot single value
  • HTTP monitor problem types Dot pie chart
  • Browser monitor problem types Dot pie chart
  • HTTP monitor locations Dot single value

Users & Sessions

Explore ready-made dashboards owned by Users & Sessions.

User sessions overview

Analyze user sessions across frontends. View session and user counts, browser distribution, geographic spread, and pages per session to understand your audience.

The User sessions overview dashboard contains the following sections and tiles:

  • User Count Dot bar chart
  • Session Count Dot bar chart
  • Browser name Dot pie chart
  • Region Dot choropleth map
  • Avg pages per session Dot bar chart
  • Avg interactions per session Dot bar chart
  • Avg unique pages visited per user Dot table
  • Number of sessions per ISP Dot table

  • Web sessions with errors Dot line chart

Session Intensity

  • Sessions per user type - Real vs Synthetic vs Robot Dot pie chart
  • Sessions per frontend type Dot pie chart
  • Sessions with user interactions Dot bar chart

Session Segmentation

  • OS name Dot pie chart
  • Device Type Dot pie chart

Vulnerabilities

Explore ready-made dashboards owned by Vulnerabilities.

Vulnerability Coverage

View vulnerability scan coverage for hosts and processes. Track library vulnerability findings, scan counts over time, and identify the most affected hosts.

The Vulnerability Coverage dashboard contains the following sections and tiles:

  • Detected library vulnerabilities Dot bar chart
  • Total Host Coverage Dot donut chart
  • Process Coverage Dot donut chart
  • Performed scans for library vulnerabilities Dot line chart

Process coverage

  • Most affected hosts 🚨 Dot table
  • Most affected processes 🚨 Dot table

Host coverage

  • Not covered processes ⚠️ Dot table
  • Not covered hosts ⚠️ Dot table

Coverage and exposure

  • Total library vulnerability findings by severity Dot categorical chart

    Total number of library findings
