This extension documentation is now deprecated and will no longer be updated. We recommend using the new RabbitMQ extension for improved functionality and support.
RabbitMQ server monitoring provides a high-level overview of all RabbitMQ components within your cluster.
With RabbitMQ message-related metrics, you’ll immediately know when something is wrong. And when problems occur, it’s easy to see which nodes are affected. It’s then simple to drill down into the metrics of individual nodes to find the root cause of problems and potential bottlenecks.
With RabbitMQ monitoring enabled globally, Dynatrace automatically collects RabbitMQ metrics whenever a new host running RabbitMQ is detected in your environment.
Port 15672 is the default port of the RabbitMQ management plugin.
The wildcard * replaces any combination of characters. For example, queue* will report any name starting with queue.
Dynatrace provides the option of enabling RabbitMQ monitoring for specific hosts rather than globally.
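The wildcard rule for queue names (where * stands for any combination of characters) can be illustrated with a small sketch. It assumes the wildcard behaves like a shell-style glob and uses Python's standard fnmatch module for the matching. The function name filter_queues and the sample queue names are hypothetical, and the extension's own matching may differ in details (fnmatch also treats ? and [] specially).

```python
from fnmatch import fnmatchcase


def filter_queues(names, pattern):
    """Return the queue names that match a *-style wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical queue names, for illustration only.
queues = ["queue_orders", "queue_payments", "audit", "my_queue"]

# "queue*" keeps only names that start with "queue".
print(filter_queues(queues, "queue*"))
```

Because the pattern is anchored at the start of the name, my_queue is not reported even though it contains the word queue.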
RabbitMQ cluster ("process group") overview pages provide an overview of RabbitMQ cluster health. From here, it’s easy to identify problematic nodes. Just select a relevant time interval for the timeline, select a node metric from the metric drop-down list, and compare the values of all nodes in the sortable table.
Further down the page, you’ll find a number of other cluster-specific charts.
| Metric | Description |
| --- | --- |
| Queued messages | RabbitMQ’s queues are most efficient when they’re empty, so the lower the Queued messages count, the better. |
| Message rates | The Message rates chart is the best indicator of RabbitMQ performance. |
| Nodes health | Shows the number of nodes in a given state. Note that this chart isn’t available for every RabbitMQ version. |
| Queues health | The Queues health chart shows more than just queue health. RabbitMQ can handle a high volume of queues, but each queue requires additional resources, so watch these queue numbers carefully. If the queues begin to pile up, you may have a queue leak. If you can’t find the leakage, consider adding a queue-ttl policy. |
| Cluster summary | The Cluster summary chart provides an overview of all RabbitMQ cluster elements. |
For more RabbitMQ performance tips, have a look at this article about avoiding high CPU and memory usage.
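As a rough illustration of the queue-ttl suggestion above, the sketch below builds (but does not send) a request that would define such a policy through the RabbitMQ management HTTP API, using PUT /api/policies/{vhost}/{name}. In RabbitMQ policies, queue TTL is expressed with the expires key, in milliseconds. The host, port, policy name, queue-name pattern, and the helper build_queue_ttl_policy are assumptions made for this example. Authentication is omitted, so a real call would also need credentials.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_queue_ttl_policy(base_url, vhost, name, pattern, expires_ms):
    """Build a PUT request that defines a queue TTL policy (the request is not sent)."""
    # The vhost and the policy name are path segments, so "/" must be encoded as %2F.
    url = f"{base_url}/api/policies/{quote(vhost, safe='')}/{quote(name, safe='')}"
    body = {
        "pattern": pattern,                     # regex matched against queue names
        "definition": {"expires": expires_ms},  # delete queues unused for this long
        "apply-to": "queues",
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values: local broker, default vhost, queues whose names start with "tmp.".
req = build_queue_ttl_policy(
    "http://localhost:15672", "/", "queue-ttl", r"^tmp\.", 30 * 60 * 1000
)
print(req.get_method(), req.full_url)
```

Restricting the pattern to a naming prefix keeps the policy away from long-lived queues, which is usually safer than applying an expiry to every queue.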
| Metric | Description |
| --- | --- |
| Messages ready | The number of messages that are ready to be delivered. This is the sum of messages in the messages_ready status. |
| Messages unacknowledged | The number of messages delivered to clients, but not yet acknowledged. This is the sum of messages in the messages_unacknowledged status. |
| Acknowledged | The rate at which messages are acknowledged by the client/consumer. |
| Deliver and Get | The rate per second of the sum of messages: (1) delivered in acknowledgment mode to consumers, (2) delivered in no-acknowledgment mode to consumers, (3) delivered in acknowledgment mode in response to basic.get, (4) delivered in no-acknowledgment mode in response to basic.get. |
| Publish | The rate at which messages are published to the RabbitMQ cluster. |
| Failed | The number of unhealthy nodes. Note that not every RabbitMQ version provides this metric. |
| Ok | The number of healthy nodes. Note that not every RabbitMQ version provides this metric. |
| Queues health chart | The number of queues in a given state. |
| Channels | The number of channels (virtual connections). If the number of channels is high, you may have a memory leak in your client code. |
| Connections | The number of TCP connections to the message broker. Frequently opened and closed connections can result in high CPU usage. Connections should be long-lived. Channels can be opened and closed more frequently. |
| Consumers | The number of consumers. |
| Exchanges | The number of exchanges. |
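To make the "sum of messages" wording above concrete, here is a minimal sketch of how cluster-level totals could be derived from per-queue statistics. It assumes a list of queue objects shaped like those returned by the management API's queue listing, with messages_ready, messages_unacknowledged, and state fields. The sample data and the function name summarize_queues are hypothetical, and this isn't necessarily how Dynatrace computes its charts.

```python
from collections import Counter


def summarize_queues(queues):
    """Aggregate per-queue statistics into cluster-level totals."""
    return {
        # Messages waiting to be delivered, summed over all queues.
        "messages_ready": sum(q.get("messages_ready", 0) for q in queues),
        # Messages delivered to consumers but not yet acknowledged.
        "messages_unacknowledged": sum(
            q.get("messages_unacknowledged", 0) for q in queues
        ),
        # Number of queues in each state, as in a queue health breakdown.
        "queues_by_state": dict(Counter(q.get("state", "unknown") for q in queues)),
    }


# Hypothetical sample data, for illustration only.
sample = [
    {"name": "orders", "messages_ready": 12, "messages_unacknowledged": 3, "state": "running"},
    {"name": "payments", "messages_ready": 0, "messages_unacknowledged": 5, "state": "running"},
    {"name": "audit", "messages_ready": 40, "messages_unacknowledged": 0, "state": "idle"},
]

summary = summarize_queues(sample)
print(summary)
```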
To access valuable RabbitMQ node metrics:
Valuable RabbitMQ node metrics are displayed on each RabbitMQ process page on the RabbitMQ metrics tab.
To return to the cluster level, expand the Properties section of the RabbitMQ Processes page and select the cluster.
More RabbitMQ monitoring metrics are available from individual Process pages. Select the Further details tab for more monitoring insights.
On the Further details tab you’ll find the following additional charts.
| Chart | Description |
| --- | --- |
| Memory usage | The percentage of available RabbitMQ memory in use. 100% means that the RabbitMQ memory limit vm_memory_high_watermark has been reached (by default, vm_memory_high_watermark is set to 40% of installed RAM). Once the RabbitMQ server has used up all available memory, all connections that publish messages are blocked. Note that this doesn’t prevent the RabbitMQ server from using more than its limit; this is merely the point at which publishers are throttled. |
| Available disk space | The percentage of available RabbitMQ disk space. Indicates how much available disk space remains before the disk_free_limit is reached. Once all available disk space is used up, RabbitMQ blocks producers and prevents memory-based messages from being paged to disk. This reduces, but doesn’t eliminate, the likelihood of a crash due to the exhaustion of disk space. |
| File descriptors usage | The percentage of available file descriptors. RabbitMQ installations running production workloads may require system limits and kernel-parameter tuning to handle a realistic number of concurrent connections and queues. RabbitMQ recommends allowing for at least 65,536 file descriptors when using RabbitMQ in production environments. A limit of 4,096 file descriptors is sufficient for most development workloads. RabbitMQ documentation suggests that you set your file descriptor limit to 1.5 times the maximum number of connections you expect. |
| Erlang processes usage | The percentage of available Erlang processes. The maximum number of processes can be changed via the RABBITMQ_SERVER_ERL_ARGS environment variable. |
| Sockets usage | The percentage of available Erlang sockets. The required number of sockets is correlated with the required number of file descriptors. For more details, see the Controlling System Limits on Linux section at www.rabbitmq.com. |
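The usage charts above can be read as simple used-to-limit ratios. The sketch below shows that arithmetic, together with the 1.5-times rule of thumb for file descriptors. It assumes each percentage is computed as used divided by limit, with field names modeled on the management API's node statistics (mem_used, mem_limit, fd_used, fd_total). The sample numbers and both helper functions are hypothetical, and Dynatrace may derive its values differently.

```python
def usage_percent(used, total):
    """Return used/total as a percentage; 0.0 when the limit is unknown or zero."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


def recommended_fd_limit(expected_connections):
    """Apply the 1.5x rule of thumb, rounded up to a whole descriptor."""
    return (3 * expected_connections + 1) // 2


# Hypothetical node statistics, for illustration only.
node = {
    "mem_used": 2_000_000_000,
    "mem_limit": 8_000_000_000,  # the limit derived from vm_memory_high_watermark
    "fd_used": 1_024,
    "fd_total": 65_536,
}

print(f"memory usage: {usage_percent(node['mem_used'], node['mem_limit']):.1f}%")
print(f"file descriptors usage: {usage_percent(node['fd_used'], node['fd_total']):.1f}%")
print("suggested fd limit for 10,000 connections:", recommended_fd_limit(10_000))
```

With these sample values the node is at a quarter of its memory limit, well below the 100% point at which publishers would be throttled.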
For more information about RabbitMQ statistics, see www.rabbitmq.com.