Static thresholds for anomaly detection

A static threshold represents a hard limit that a metric should not violate. Because static thresholds don't change over time, they are an important monitoring tool for defining critical boundaries of normal operation.

Whether a static or an adaptive threshold is the better choice depends on your use case.

For example, you can use a static threshold to limit the total memory usage of a well-known process. In this case, a static threshold is superior to an adaptive threshold: if memory consumption grows slowly over time, an adaptive threshold simply rises with it and raises no event, so a memory leak can go unnoticed.

In the illustrations below, memory consumption steadily increases over 30 days. A statically defined threshold of 40 MB will catch the process's abnormal behavior, while an adaptive threshold will increase along with the metric value.

Static threshold

Adaptive threshold
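
The comparison above can be reproduced with a small simulation. This is only a sketch: the memory readings are made up, and the adaptive threshold is modeled as a simple rolling average plus a margin, which is an assumption for illustration and not the product's actual baselining algorithm. All names and constants other than the 40 MB limit and the 30-day period are hypothetical.

```python
from statistics import mean

STATIC_THRESHOLD_MB = 40.0  # fixed limit from the example above
BASELINE_DAYS = 7           # assumed look-back window for the adaptive baseline
MARGIN = 1.2                # assumed tolerance: 20% above the recent average


def simulate_memory(days=30, start_mb=30.0, growth_mb_per_day=0.5):
    """Hypothetical daily memory readings for a process with a slow leak."""
    return [start_mb + growth_mb_per_day * day for day in range(days)]


def static_violations(values, threshold=STATIC_THRESHOLD_MB):
    """Days (0-based) on which the reading exceeds the fixed limit."""
    return [day for day, value in enumerate(values) if value > threshold]


def adaptive_violations(values, baseline_days=BASELINE_DAYS, margin=MARGIN):
    """Days on which the reading exceeds margin * mean of the preceding days."""
    violations = []
    for day in range(baseline_days, len(values)):
        # The threshold is recomputed every day, so it follows the metric.
        threshold = margin * mean(values[day - baseline_days:day])
        if values[day] > threshold:
            violations.append(day)
    return violations


memory = simulate_memory()
static_hits = static_violations(memory)
adaptive_hits = adaptive_violations(memory)

print("Static threshold violated on days:", static_hits)
print("Adaptive threshold violated on days:", adaptive_hits)
```

With these made-up numbers, the static check flags days 21 through 29, while the adaptive check flags nothing because its threshold drifts upward together with the leak.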

Apart from the threshold value, you can specify how often the threshold must be violated within a sliding time window before an event is raised. The violations don't have to be consecutive. This helps you avoid alerting too aggressively on single threshold violations. The sliding window can be up to 60 minutes long.

By default, an event is raised when the threshold is violated in any 3 minutes of a 5-minute sliding window.
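
The sliding-window rule can be sketched as follows. This is an illustrative model, not the product's implementation: it assumes one sample per minute and an upper threshold (a violation means the value is above the limit), and the function name `detect_events` and its parameters are hypothetical.

```python
from collections import deque

MAX_WINDOW_MINUTES = 60  # largest sliding window mentioned above


def detect_events(samples, threshold, violating_samples=3, window_size=5):
    """Return the minutes at which an event would be raised.

    samples: one metric value per minute (assumption).
    An event is raised whenever at least `violating_samples` of the last
    `window_size` minutes are above `threshold`.
    """
    if not 1 <= violating_samples <= window_size <= MAX_WINDOW_MINUTES:
        raise ValueError("need 1 <= violating_samples <= window_size <= 60")

    window = deque(maxlen=window_size)  # oldest minute drops out automatically
    raised_at = []
    for minute, value in enumerate(samples):
        window.append(value > threshold)
        if sum(window) >= violating_samples:
            raised_at.append(minute)
    return raised_at


# Violations at minutes 1, 3, and 5 are not consecutive,
# but all three fall into the window covering minutes 1 to 5.
scattered = detect_events([35, 42, 38, 41, 39, 43, 36, 37, 38, 39], threshold=40)

# A single spike never reaches 3 violating minutes.
single_spike = detect_events([35, 50, 35, 35, 35, 35], threshold=40)

print(scattered, single_spike)
```

In the first series the event is raised at minute 5, once the third violating minute enters the window. The single spike in the second series raises no event, which is the over-alerting this setting is meant to prevent.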