Troubleshooting
This page provides a comprehensive guide to help you diagnose and resolve common problems.
Initial troubleshooting steps
Before you begin with the specific troubleshooting sections, it's important to have a clear understanding of the current state of your Kubernetes cluster. The initial steps outlined below will help you gather essential information about your cluster's health and the status of its components.
- Check the status of your DynaKube by executing the kubectl get dynakubes -n dynatrace command.
- Use the troubleshoot subcommand.
- Check the status of the Dynatrace pods
  Use the kubectl -n dynatrace get pods command to check the status of the Dynatrace Operator, OneAgent, or CSI driver pods (the number of pods varies depending on the selected deployment mode).
- Inspect the logs
  Use the kubectl logs command to inspect the logs of specific pods. For example, kubectl logs <pod-name> displays the logs for a specific pod.
- Describe the resource
  The kubectl describe command provides detailed information about a specific resource. For example, kubectl describe pod <pod-name> displays detailed information about a specific pod.
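The initial checks can be chained into a single helper. The following is a minimal sketch, not part of the Dynatrace tooling: the function name initial_checks and its optional namespace argument are hypothetical, and it assumes the Operator runs as deploy/dynatrace-operator, as in the commands on this page.

```shell
#!/bin/sh
# Hypothetical helper: runs the initial checks in order.
# Usage: initial_checks [namespace]   (namespace defaults to "dynatrace")
initial_checks() {
  ns="${1:-dynatrace}"

  echo "== DynaKube status =="
  kubectl get dynakubes -n "$ns"

  echo "== Dynatrace pods =="
  kubectl -n "$ns" get pods

  echo "== Operator troubleshoot =="
  kubectl exec deploy/dynatrace-operator -n "$ns" -- dynatrace-operator troubleshoot
}
```

Calling initial_checks without arguments targets the dynatrace namespace. For pods that look unhealthy in the output, kubectl logs and kubectl describe remain the next step.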
General troubleshooting
This section provides general troubleshooting steps and guidance for common issues encountered when using Dynatrace with Kubernetes. It covers how to access debug logs, use the troubleshoot subcommand, and generate a support archive.
Troubleshoot common Dynatrace Operator setup issues using the troubleshoot subcommand
Dynatrace Operator version 0.9.0+
Run the command below to retrieve basic output on the DynaKube status, including:
- Namespace: Whether the dynatrace namespace exists (the name can be overridden via a parameter).
- DynaKube:
  - Whether the CustomResourceDefinition exists
  - Whether a CustomResource with the given name exists (the name can be overridden via a parameter)
  - Whether the API URL ends with /api
  - Whether the secret name is the same as the DynaKube name (or .spec.tokens if used)
  - Whether the secret has Dynatrace Operator and Data Ingest tokens set
  - Whether the secret for customPullSecret is defined
- Environment: Whether your environment is reachable from the Dynatrace Operator pod using the same parameters as the Dynatrace Operator binary (such as proxy and certificate).
- OneAgent and ActiveGate image: Whether the registry is accessible, and whether the image is accessible from the Dynatrace Operator pod using the registry from the environment with the (custom) pull secret.
kubectl exec deploy/dynatrace-operator -n dynatrace -- dynatrace-operator troubleshoot
If you use a different DynaKube name, add the --dynakube <your_dynakube_name> argument to the command.
Example output if there are no errors for the above-mentioned fields:
{"level":"info","ts":"2022-09-12T08:45:21.437Z","logger":"dynatrace-operator-version","msg":"dynatrace-operator","version":"<operator version>","gitCommit":"<commithash>","buildDate":"<release date>","goVersion":"<go version>","platform":"<platform>"}
[namespace ] --- checking if namespace 'dynatrace' exists ...
[namespace ] √ using namespace 'dynatrace'
[dynakube ] --- checking if 'dynatrace:dynakube' Dynakube is configured correctly
[dynakube ] CRD for Dynakube exists
[dynakube ] using 'dynatrace:dynakube' Dynakube
[dynakube ] checking if api url is valid
[dynakube ] api url is valid
[dynakube ] checking if secret is valid
[dynakube ] 'dynatrace:dynakube' secret exists
[dynakube ] secret token 'apiToken' exists
[dynakube ] customPullSecret not used
[dynakube ] pull secret 'dynatrace:dynakube-pull-secret' exists
[dynakube ] secret token '.dockerconfigjson' exists
[dynakube ] proxy secret not used
[dynakube ] √ 'dynatrace:dynakube' Dynakube is valid
[dtcluster ] --- checking if tenant is accessible ...
[dtcluster ] √ tenant is accessible
Debug logs
By default, OneAgent logs are located in /var/log/dynatrace/oneagent.
To debug Dynatrace Operator issues, inspect the logs of the Dynatrace Operator pod.
You might also want to check the logs from OneAgent pods deployed through Dynatrace Operator.
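One way to retrieve these logs is with kubectl logs. The following is a sketch under assumptions rather than the documented procedure: the helper names operator_logs and oneagent_logs are hypothetical, and the sketch assumes the Operator runs as deployment/dynatrace-operator in the dynatrace namespace, as elsewhere on this page.

```shell
#!/bin/sh
# Hypothetical helper: prints the logs of the Dynatrace Operator deployment.
# Usage: operator_logs [namespace]   (namespace defaults to "dynatrace")
operator_logs() {
  ns="${1:-dynatrace}"
  kubectl -n "$ns" logs deployment/dynatrace-operator
}

# Hypothetical helper: prints the logs of a single OneAgent pod.
# Usage: oneagent_logs <pod-name> [namespace]
oneagent_logs() {
  pod="$1"
  ns="${2:-dynatrace}"
  kubectl -n "$ns" logs "$pod"
}
```

The OneAgent pod name can be taken from the output of kubectl -n dynatrace get pods.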
Generate a support archive using the support-archive subcommand
Dynatrace Operator version 0.11.0+
Use support-archive to generate a support archive containing all the files that can be potentially useful for the RFA analysis:
- operator-version.txt: a file containing the current Operator version information
- logs: logs from all containers of the Dynatrace Operator pods in the Dynatrace Operator namespace (usually dynatrace); this also includes logs of previous containers, if available:
  - dynatrace-operator
  - dynatrace-webhook
  - dynatrace-oneagent-csi-driver
- manifests: the Kubernetes manifests for Dynatrace Operator components and deployed DynaKubes in the Dynatrace Operator namespace
- troubleshoot.txt: output of a troubleshooting command that is automatically executed by the support-archive subcommand
- supportarchive_console.log: complete output of the support-archive subcommand
Usage
To create a support archive, execute the following command.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive
The collected files are now stored in a zip file and can be downloaded from the pod using the kubectl cp command.
kubectl -n dynatrace cp <operator pod name>:/tmp/dynatrace-operator/operator-support-archive.zip ./tmp/dynatrace-operator/operator-support-archive.zip
The recommended approach is to use the --stdout command-line switch to stream the zip file directly to your disk.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive --stdout > operator-support-archive.zip
If you use the --stdout parameter, all support archive command output is written to stderr so as not to corrupt the support archive zip file.
On Windows, some shells (PowerShell, for example) modify the encoding of redirected output and corrupt the zip archive. In that case, either use the --stdout parameter in cmd.exe, or create the support archive on the pod and copy it with kubectl cp (as described above).
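A quick way to detect a corrupted download is to check that the file still begins with the zip signature, the two bytes PK. The following is a minimal sketch for POSIX shells. The function name looks_like_zip is hypothetical, and a passing check only shows that the header is intact, not that the whole archive is valid.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file starts with the zip signature "PK".
# Usage: looks_like_zip <file>
looks_like_zip() {
  # A missing file cannot be a valid archive.
  [ -f "$1" ] || return 1
  # Read only the first two bytes of the file.
  magic="$(dd if="$1" bs=1 count=2 2>/dev/null)"
  [ "$magic" = "PK" ]
}
```

For example, looks_like_zip operator-support-archive.zip returns a non-zero exit status if a shell re-encoded the stream while writing the file.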
Run support-archive in a standalone pod
Dynatrace Operator version 1.0.0+
If the operator pod is not functioning due to severe startup issues, you can run the support-archive command in a standalone pod using the following command. Keep in mind that running this command in a standalone pod is recommended only as a last resort.
kubectl run -n dynatrace support-archive --rm -i --overrides='{ "spec": { "serviceAccount": "dynatrace-operator" } }' --restart Never --image <operator-image> -- support-archive --delay 10 --stdout > support-archive.zip
- Ensure that you use the same image as the operator pod.
- The --delay 10 parameter is important because kubectl run tends to miss the first few lines of output, which could lead to corruption of the support archive.
- Specify the serviceAccount as dynatrace-operator in the command, as it allows the standalone pod to access all necessary logs and manifests required for compiling the support archive. Note that this method relies on the Dynatrace Operator resources still being installed and available on the cluster.
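To find the image that the operator pod uses, you can read it from the Deployment manifest. The following is a sketch, assuming the Operator Deployment is named dynatrace-operator and that the Operator container is the first container in its pod template. The function name operator_image is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the image of the first container in the
# dynatrace-operator Deployment (assumes the Operator container is listed first).
# Usage: operator_image [namespace]   (namespace defaults to "dynatrace")
operator_image() {
  ns="${1:-dynatrace}"
  kubectl get deployment dynatrace-operator -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

The printed value can then be used in place of <operator-image> in the kubectl run command.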
Sample output
The following is sample output from running support-archive with the --stdout parameter.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive --stdout > operator-support-archive.zip
[support-archive] dynatrace-operator {"version": "v0.11.0", "gitCommit": "...", "buildDate": "...", "goVersion": "...", "platform": "linux/amd64"}
[support-archive] Storing operator version into operator-version.txt
[support-archive] Starting log collection
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-operator-6d9fd9b9fc-sw5ll/dynatrace-operator.log
[support-archive] Successfully collected logs logs/dynatrace-webhook-7d84599455-bfkmp/webhook.log
[support-archive] Successfully collected logs logs/dynatrace-webhook-7d84599455-vhkrh/webhook.log
[support-archive] Starting K8S object collection
[support-archive] Collected manifest for manifests/injected_namespaces/Namespace-default.yaml
[support-archive] Collected manifest for manifests/dynatrace/Namespace-dynatrace.yaml
[support-archive] Collected manifest for manifests/dynatrace/Deployment-dynatrace-operator.yaml
[support-archive] Collected manifest for manifests/dynatrace/Deployment-dynatrace-webhook.yaml
[support-archive] Collected manifest for manifests/dynatrace/StatefulSet-dynakube-activegate.yaml
[support-archive] Collected manifest for manifests/dynatrace/DaemonSet-dynakube-oneagent.yaml
[support-archive] Collected manifest for manifests/dynatrace/DaemonSet-dynatrace-oneagent-csi-driver.yaml
[support-archive] Collected manifest for manifests/dynatrace/DynaKube-dynakube.yaml
Debug configuration and monitoring issues using the Kubernetes Monitoring Statistics extension
The Kubernetes Monitoring Statistics extension can help you:
- Troubleshoot your Kubernetes Monitoring setup
- Troubleshoot your Prometheus integration setup
- Get detailed insights into queries from Dynatrace to the Kubernetes API
- Receive alerts when your Kubernetes Platform Monitoring setup experiences issues
- Get alerted on slow response times of your Kubernetes API
Potential issues when changing the monitoring mode
- Changing the monitoring mode from classicFullStack to cloudNativeFullStack affects the host ID calculations for monitored hosts, leading to new IDs being assigned and no connection between old and new entities.
- If you want to change the monitoring mode from applicationMonitoring or cloudNativeFullStack to classicFullStack or hostMonitoring, you need to restart all the pods that were previously instrumented with applicationMonitoring or cloudNativeFullStack.
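Restarting previously instrumented pods can be done with a rolling restart of the workloads that own them. The following is a minimal sketch that covers only Deployments in a single application namespace. The function name restart_deployments is hypothetical, and StatefulSets, DaemonSets, or standalone pods would need to be handled separately.

```shell
#!/bin/sh
# Hypothetical helper: triggers a rolling restart of every Deployment in the
# given application namespace so that its pods are recreated.
# Usage: restart_deployments <namespace>
restart_deployments() {
  if [ -z "$1" ]; then
    echo "usage: restart_deployments <namespace>" >&2
    return 1
  fi
  kubectl rollout restart deployment -n "$1"
}
```

Run it once per namespace that contained instrumented application pods. The namespace argument is deliberately mandatory so that no workloads are restarted by accident.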