Cassandra monitoring
This extension documentation is now deprecated and will no longer be updated. We recommend using the new Apache Cassandra extension for improved functionality and support.
Apache Cassandra server monitoring in Dynatrace provides information about database exceptions, failed requests, performance, and more. If Cassandra is underperforming or a problem occurs, Dynatrace lets you know immediately and shows you which nodes are affected.
This is a JMX (Java Management Extensions) Dynatrace extension. JMX is ideal for monitoring applications built using Java.
Prerequisites
- Cassandra 2.xx
- Linux or Windows
Enabling Cassandra monitoring globally
With Cassandra monitoring enabled globally, Dynatrace automatically collects Cassandra metrics whenever a new host running Cassandra is detected in your environment.
- Go to Settings.
- Select Monitoring > Monitored technologies.
- In the Supported technologies list, find the Cassandra JMX row.
- Turn on the Cassandra JMX switch.
Monitoring Cassandra in Dynatrace
- Go to Technologies & Processes or Technologies & Processes Classic (latest Dynatrace).
- Select the Apache Cassandra tile.
- To view Cassandra cluster metrics, select the cluster in the Process group table under the tiles.
The chart displays the selected process group (cluster) metric over time. You can select a different metric from the list.
- In the expanded row, select Process group details to see details on the selected Cassandra cluster.
- On the Process group details page, select the Technology-specific metrics tab to identify any problematic nodes.
- To display node-specific metrics, select a node from the Process list under the chart.
- Select the Cassandra metrics tab to see valuable node-specific Cassandra metrics.
- The Exceptions and Failed requests charts show you if there’s a problem with the node. Pay particular attention to the Unavailable - Read, Unavailable - Write, and Unavailable - RangeSlice counts in Failed requests.
- The Operation count and Latency 95th percentile charts can help you monitor performance. Increased latency while the number of operations remains stable typically indicates a performance issue.
- Select the Further details tab to see charts on a variety of additional Cassandra metrics.
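The rule of thumb above (latency rising while the operation count stays stable) can be sketched as a simple check over two exported time series. This is only an illustration: the function name, the split into a baseline half and a current half, and both thresholds are assumptions, not part of Dynatrace or Cassandra.

```python
from statistics import mean


def latency_regression(ops, p95_latency, ops_tolerance=0.10, latency_increase=0.50):
    """Return True when latency rose noticeably while operation count stayed stable.

    ops and p95_latency are equally long lists of samples, oldest first.
    The first half is the baseline, the second half the current window.
    Both thresholds are illustrative assumptions, not Dynatrace defaults.
    """
    if len(ops) != len(p95_latency) or len(ops) < 4:
        raise ValueError("need two equally long series with at least 4 samples")
    half = len(ops) // 2
    base_ops, cur_ops = mean(ops[:half]), mean(ops[half:])
    base_lat, cur_lat = mean(p95_latency[:half]), mean(p95_latency[half:])
    if base_ops == 0 or base_lat == 0:
        return False  # no meaningful baseline to compare against
    # Operation count is "stable" if it moved by at most ops_tolerance.
    ops_stable = abs(cur_ops - base_ops) / base_ops <= ops_tolerance
    # Latency "increased" if it grew by at least latency_increase.
    latency_up = (cur_lat - base_lat) / base_lat >= latency_increase
    return ops_stable and latency_up
```

For example, `latency_regression([1000, 1010, 990, 1005], [2.0, 2.1, 3.5, 3.6])` flags a node whose read latency climbed while its read count barely changed. The same call with a doubled operation count in the second half does not, because the extra latency may simply reflect extra load.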
Cassandra cluster metrics
Select the Technology-specific metrics tab on the Process group details page to display aggregated Cassandra cluster metrics. Use the Show chart for list to select a different chart to display. All metrics are plotted against the number of process group instances. Hover your pointer over the chart to see an instance count and the minimum, maximum, and average for the selected metric at that time.
- Suspension
- JVM threads
- Java memory pool commits
- Java memory pool used
- GC time (garbage collection time)
- Exception count
- Files open
- RangeSlice latency
- RangeSlices
- Read latency
- Reads
- Storage load
- Write latency
- Writes
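The hover summary described above (instance count plus minimum, maximum, and average) amounts to a small aggregation across the process group instances at one point in time. The sketch below shows that arithmetic. The function name, the dictionary layout, and the node names are hypothetical and do not reflect how Dynatrace stores or computes these values.

```python
def summarize_instances(values_by_instance):
    """Aggregate one metric across process group instances at a single timestamp.

    values_by_instance maps an instance name (hypothetical, e.g. "node-1")
    to that instance's metric value.
    """
    values = list(values_by_instance.values())
    if not values:
        return {"instances": 0, "min": None, "max": None, "avg": None}
    return {
        "instances": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
    }


# Read latency (ms) reported by three nodes at the same timestamp.
summary = summarize_instances({"node-1": 2.0, "node-2": 4.0, "node-3": 9.0})
print(summary)
```

A large gap between the maximum and the average is a hint that a single node, rather than the whole cluster, is the one to inspect.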
Cassandra node metrics
Cassandra metrics tab
The Cassandra metrics tab shows key metrics for Cassandra on the node level.
Chart | Metric | Description |
Exceptions | Exception count | Number of internal Cassandra exceptions detected. Under normal conditions, this metric should be zero. |
Failed requests | Unavailable – Read | Number of |
Failed requests | Unavailable – Write | Number of |
Failed requests | Unavailable – RangeSlice | Number of |
Failed requests | Timeout – Read | Number of |
Failed requests | Timeout – Write | Number of |
Failed requests | Timeout – RangeSlice | Number of |
Failed requests | Failure – Read | Number of |
Failed requests | Failure – Write | Number of |
Failed requests | Failure – RangeSlice | Number of |
Operation count | Read | Average number of reads per second. |
Operation count | Write | Average number of writes per second. |
Operation count | RangeSlice | Average number of RangeSlices per second. |
Latency 95th percentile | Read | Average 95th percentile of transaction read latency. |
Latency 95th percentile | Write | Average 95th percentile of transaction write latency. |
Latency 95th percentile | RangeSlice | Average 95th percentile of transaction RangeSlice latency. |
Further details tab
The Further details tab shows additional metrics for Cassandra on the node level: Cache, Disk usage, Hints, Java managed memory, Load, and Pending tasks.
Chart | Metric | Description |
Cache: Hit rate | Row cache hit rate | 2m row cache hit rate. |
Cache: Hit rate | Key cache hit rate | 2m key cache hit rate. |
Disk usage: Storage load | Load | Size, in bytes, of the on-disk data the node manages. |
Disk usage: Bytes compacted | Bytes compacted | Total number of bytes compacted since server start. |
Disk usage: Compaction tasks pending | Pending tasks | Estimated number of compactions remaining to perform. |
Disk usage: Compaction tasks completed | Completed tasks | Number of completed compactions since server start. |
Disk usage: SSTable count | SSTable count | Number of SSTables on disk for this table. |
Hints | Hints | Number of hint messages written to this node since start. Includes one entry for each host to be hinted per hint. |
Java managed memory: poolname | Used memory | Java used memory. |
Java managed memory: poolname | Committed memory | Java committed memory. |
Java managed memory: poolname | Maximum memory | Java maximum memory. |
Java managed memory: poolname | Garbage collection count | Java garbage collection count. |
Java managed memory: poolname | Garbage collection time | Java garbage collection time. |
Load: Read latency | Average | Average 95th percentile of transaction read latency. |
Load: Read latency | Maximum | Maximum 95th percentile of transaction read latency. |
Load: Write latency | Average | Average 95th percentile of transaction write latency. |
Load: Write latency | Maximum | Maximum 95th percentile of transaction write latency. |
Load: RangeSlice latency | Average | Average 95th percentile of transaction RangeSlice latency. |
Load: RangeSlice latency | Maximum | Maximum 95th percentile of transaction RangeSlice latency. |
Load: Read throughput | Average | Average number of reads per second. |
Load: Read throughput | Maximum | Maximum number of reads per second. |
Load: Write throughput | Average | Average number of writes per second. |
Load: Write throughput | Maximum | Maximum number of writes per second. |
Load: RangeSlice throughput | Average | Average number of RangeSlices per second. |
Load: RangeSlice throughput | Maximum | Maximum number of RangeSlices per second. |
Pending tasks: Read pending tasks | Read pending tasks | Number of read mutation tasks. |
Pending tasks: ReadRepair pending tasks | ReadRepair pending tasks | Number of ReadRepair mutation tasks. |
Pending tasks: Mutation pending tasks | Mutation pending tasks | Number of queued mutation tasks. |
Pending tasks: Compaction pending tasks | Compaction tasks pending | Estimated number of compactions remaining to perform. |
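The cache hit rate rows in the table above are ratios of cache hits to cache requests. As a rough illustration, a hit rate for a time window can be derived from two readings of cumulative hit and request counters. The function name, its parameters, and the sample numbers are hypothetical, and this is not necessarily how Dynatrace or Cassandra derives the 2m value.

```python
def windowed_hit_rate(hits_start, requests_start, hits_end, requests_end):
    """Hit rate over a window, from cumulative counters read at its start and end.

    Returns None when the window saw no requests or a counter went backwards
    (for example after a node restart), since no ratio can be computed then.
    """
    hits = hits_end - hits_start
    requests = requests_end - requests_start
    if requests <= 0 or hits < 0:
        return None
    return hits / requests


# 500 requests arrived during the window and 450 of them were cache hits.
rate = windowed_hit_rate(900, 1000, 1350, 1500)
print(f"{rate:.0%}")
```

Using counter deltas instead of the lifetime totals keeps the ratio responsive to recent behavior, which is what a short-window hit rate is meant to show.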