Fix probe timeouts due to OneAgent injection

This guide explains how to fix timeouts in readiness or liveness probes that occur when OneAgent injects into the probe process.

Scenario

In some scenarios, a readiness or liveness probe is configured with an exec command. With this configuration, OneAgent attempts to inject into the probe executable each time it starts. The injection adds a slight delay to the executable's startup, which can cause the probe to exceed its timeout.

Consider the following example of a readiness probe:

readinessProbe:
  exec:
    command:
    - /bin/sh
    - -ec
    - vault status -tls-skip-verify

In this example, Vault is the application we want to monitor, but the process that runs the readiness probe should be excluded from monitoring.
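
To see why a slight delay can matter, it may help to write out the probe's timing fields. The sketch below repeats the readiness probe above and adds the timing fields at their Kubernetes default values. These fields are not part of the original example and are shown only as an illustration; your probe may set different values.

```yaml
readinessProbe:
  exec:
    command:
    - /bin/sh
    - -ec
    - vault status -tls-skip-verify
  # The timing fields below are an assumption for illustration: they are
  # not in the original example and are set to the Kubernetes defaults.
  initialDelaySeconds: 0
  periodSeconds: 10
  timeoutSeconds: 1      # the probe command must finish within 1 second
  failureThreshold: 3    # consecutive failures before the container is marked unready
```

With a timeout this short, even a small amount of extra startup time in the probe executable can be enough for the probe to fail.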

Resolution

To resolve this issue, add a custom process monitoring rule that excludes the probe process from monitoring.

1. Go to Settings > Processes and containers > Custom process monitoring rules.
2. Select Add custom rule.
3. Add an exclusion from monitoring by supplying part of the command-line arguments used by the readiness probe. To resolve the timeout in our example, use the following settings:
   - Mode: Do not monitor
   - Condition target: Command line args
   - Condition operator: contains
   - Condition value: vault status
4. Save your changes. It might take up to 5 minutes for them to take effect.

Once the settings are applied to the cluster, the timeouts should be resolved.