Azure Managed Instance for Apache Cassandra Monitoring
From both a data and infrastructure perspective, this Prometheus Extension 2.0 allows you to monitor and analyze the activity of your Apache Cassandra clusters. It visualizes your cluster's health and shows metrics such as CPU, connectivity, request latency, suspension, and garbage collection time. Additionally, with Davis, it automatically detects performance problems and provides precise root cause analysis.
Prerequisites
- Azure Managed Instance for Apache Cassandra created and running.
- An Ubuntu virtual machine deployed inside the Azure Virtual Network where the managed instance is present.
- Prometheus server set up to scrape Cassandra nodes and with relabel config in place.
- Environment ActiveGate version 1.231+ with access to the Prometheus server.
Setup
- Create an Ubuntu virtual machine in the same virtual network as your Azure Managed Instance for Apache Cassandra.
- Ensure Docker is installed on your virtual machine.
- Create a file named prometheus.yml on your virtual machine with the contents below. Add every Cassandra node IP address and port 9443 in the static_configs section. The IP addresses can be gathered from the Data Center section of the Azure Portal for your Cassandra cluster.

    static_configs:
      - targets: ["<Node_IP_1>:9443", "<Node_IP_2>:9443", "<Node_IP_N>:9443"]

    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 15s
    alerting:
      alertmanagers:
        - static_configs:
            - targets: []
          scheme: http
          timeout: 10s
    scrape_configs:
      - job_name: prometheus
        scrape_interval: 15s
        scrape_timeout: 15s
        metrics_path: /metrics
        scheme: http
        static_configs:
          - targets:
              - localhost:9090
      - job_name: "mcac"
        scrape_interval: 15s
        scrape_timeout: 15s
        static_configs:
          - targets: ["<Node_IP_1>:9443", "<Node_IP_2>:9443", "<Node_IP_N>:9443"]
        honor_labels: true
        honor_timestamps: false
        scheme: https
        tls_config:
          insecure_skip_verify: true
        metric_relabel_configs:
          #drop metrics we can calculate from prometheus directly
          - source_labels: [__name__]
            regex: .*rate_(mean|1m|5m|15m)
            action: drop
          #save the original name for all metrics
          - source_labels: [__name__]
            regex: (collectd_mcac_.+)
            target_label: prom_name
            replacement: ${1}
          - source_labels: ["prom_name"]
            regex: .+_bucket_(\d+)
            target_label: le
            replacement: ${1}
          - source_labels: ["prom_name"]
            regex: .+_bucket_inf
            target_label: le
            replacement: +Inf
          - source_labels: ["prom_name"]
            regex: .*_histogram_p(\d+)
            target_label: quantile
            replacement: .${1}
          - source_labels: ["prom_name"]
            regex: .*_histogram_min
            target_label: quantile
            replacement: "0"
          - source_labels: ["prom_name"]
            regex: .*_histogram_max
            target_label: quantile
            replacement: "1"
          #Table Metrics *ALL* we can drop
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.table\.(\w+)
            action: drop
          #Table Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.table\.(\w+)\.(\w+)\.(\w+)
            target_label: table
            replacement: ${3}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.table\.(\w+)\.(\w+)\.(\w+)
            target_label: keyspace
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.table\.(\w+)\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_table_${1}
          #Keyspace Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.keyspace\.(\w+)\.(\w+)
            target_label: keyspace
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.keyspace\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_keyspace_${1}
          #ThreadPool Metrics (one type is repair.task so we just ignore the second part)
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.thread_pools\.(\w+)\.(\w+)\.(\w+).*
            target_label: pool_type
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.thread_pools\.(\w+)\.(\w+)\.(\w+).*
            target_label: pool_name
            replacement: ${3}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.thread_pools\.(\w+)\.(\w+)\.(\w+).*
            target_label: __name__
            replacement: mcac_thread_pools_${1}
          #ClientRequest Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client_request\.(\w+)\.(\w+)$
            target_label: request_type
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client_request\.(\w+)\.(\w+)$
            target_label: __name__
            replacement: mcac_client_request_${1}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client_request\.(\w+)\.(\w+)\.(\w+)$
            target_label: cl
            replacement: ${3}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client_request\.(\w+)\.(\w+)\.(\w+)$
            target_label: request_type
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client_request\.(\w+)\.(\w+)\.(\w+)$
            target_label: __name__
            replacement: mcac_client_request_${1}_cl
          #Cache Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.cache\.(\w+)\.(\w+)
            target_label: cache_name
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.cache\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_cache_${1}
          #CQL Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.cql\.(\w+)
            target_label: __name__
            replacement: mcac_cql_${1}
          #Dropped Message Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.dropped_message\.(\w+)\.(\w+)
            target_label: message_type
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.dropped_message\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_dropped_message_${1}
          #Streaming Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.streaming\.(\w+)\.(.+)$
            target_label: peer_ip
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.streaming\.(\w+)\.(.+)$
            target_label: __name__
            replacement: mcac_streaming_${1}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.streaming\.(\w+)$
            target_label: __name__
            replacement: mcac_streaming_${1}
          #CommitLog Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.commit_log\.(\w+)
            target_label: __name__
            replacement: mcac_commit_log_${1}
          #Compaction Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.compaction\.(\w+)
            target_label: __name__
            replacement: mcac_compaction_${1}
          #Storage Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.storage\.(\w+)
            target_label: __name__
            replacement: mcac_storage_${1}
          #Batch Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.batch\.(\w+)
            target_label: __name__
            replacement: mcac_batch_${1}
          #Client Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.client\.(\w+)
            target_label: __name__
            replacement: mcac_client_${1}
          #BufferPool Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.buffer_pool\.(\w+)
            target_label: __name__
            replacement: mcac_buffer_pool_${1}
          #Index Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.index\.(\w+)
            target_label: __name__
            replacement: mcac_sstable_index_${1}
          #HintService Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.hinted_hand_off_manager\.([^\-]+)-(\w+)
            target_label: peer_ip
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.hinted_hand_off_manager\.([^\-]+)-(\w+)
            target_label: __name__
            replacement: mcac_hints_${1}
          #HintService Metrics
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.hints_service\.hints_delays\-(\w+)
            target_label: peer_ip
            replacement: ${1}
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.hints_service\.hints_delays\-(\w+)
            target_label: __name__
            replacement: mcac_hints_hints_delays
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.hints_service\.([^\-]+)
            target_label: __name__
            replacement: mcac_hints_${1}
          # Misc
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.memtable_pool\.(\w+)
            target_label: __name__
            replacement: mcac_memtable_pool_${1}
          - source_labels: ["mcac"]
            regex: com\.datastax\.bdp\.type\.performance_objects\.name\.cql_slow_log\.metrics\.queries_latency
            target_label: __name__
            replacement: mcac_cql_slow_log_query_latency
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.read_coordination\.(.*)
            target_label: read_type
            replacement: $1
          - source_labels: ["mcac"]
            regex: org\.apache\.cassandra\.metrics\.read_coordination\.(.*)
            target_label: __name__
            replacement: mcac_read_coordination_requests
          #GC Metrics
          - source_labels: ["mcac"]
            regex: jvm\.gc\.(\w+)\.(\w+)
            target_label: collector_type
            replacement: ${1}
          - source_labels: ["mcac"]
            regex: jvm\.gc\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_jvm_gc_${2}
          #JVM Metrics
          - source_labels: ["mcac"]
            regex: jvm\.memory\.(\w+)\.(\w+)
            target_label: memory_type
            replacement: ${1}
          - source_labels: ["mcac"]
            regex: jvm\.memory\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_jvm_memory_${2}
          - source_labels: ["mcac"]
            regex: jvm\.memory\.pools\.(\w+)\.(\w+)
            target_label: pool_name
            replacement: ${2}
          - source_labels: ["mcac"]
            regex: jvm\.memory\.pools\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_jvm_memory_pool_${2}
          - source_labels: ["mcac"]
            regex: jvm\.fd\.usage
            target_label: __name__
            replacement: mcac_jvm_fd_usage
          - source_labels: ["mcac"]
            regex: jvm\.buffers\.(\w+)\.(\w+)
            target_label: buffer_type
            replacement: ${1}
          - source_labels: ["mcac"]
            regex: jvm\.buffers\.(\w+)\.(\w+)
            target_label: __name__
            replacement: mcac_jvm_buffer_${2}
          #Append the prom types back to formatted names
          - source_labels: [__name__, "prom_name"]
            regex: (mcac_.*);.*(_micros_bucket|_bucket|_micros_count_total|_count_total|_total|_micros_sum|_sum|_stddev).*
            separator: ;
            target_label: __name__
            replacement: ${1}${2}
          - regex: prom_name
            action: labeldrop

- Start your Prometheus server Docker container. Be sure to change the path in the command below to point to the prometheus.yml file from above.

    docker run \
      -d \
      -p 9090:9090 \
      -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
      prom/prometheus

- If your virtual machine is not available from the internet, install a Dynatrace Environment ActiveGate on your Ubuntu VM. Recommended: Set the group property on the installation.
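
To make the effect of the table rules in metric_relabel_configs more concrete, the following Python sketch applies three of them to a single label set. It is an illustration only, not how Prometheus evaluates rules internally. The function name relabel_table_metric and the sample metric, keyspace, and table names are hypothetical, and the sketch assumes the raw Cassandra metric path arrives in the mcac label, as the rules in prometheus.yml expect. Prometheus anchors relabel regexes at both ends, which re.fullmatch mimics here.

```python
import re

# Same pattern as the three "Table Metrics" rules in prometheus.yml.
# Prometheus anchors relabel regexes at both ends, so fullmatch is used below.
TABLE_RE = re.compile(
    r"org\.apache\.cassandra\.metrics\.table\.(\w+)\.(\w+)\.(\w+)"
)


def relabel_table_metric(labels):
    """Return a copy of `labels` with the three table rules applied.

    Label sets whose `mcac` value does not match are returned unchanged.
    """
    out = dict(labels)
    match = TABLE_RE.fullmatch(labels.get("mcac", ""))
    if match is None:
        return out
    out["table"] = match.group(3)                     # replacement: ${3}
    out["keyspace"] = match.group(2)                  # replacement: ${2}
    out["__name__"] = "mcac_table_" + match.group(1)  # replacement: mcac_table_${1}
    return out


# Hypothetical input series; metric, keyspace, and table names are made up.
sample = {
    "__name__": "collectd_mcac_counter_total",
    "mcac": "org.apache.cassandra.metrics.table.read_latency.my_keyspace.my_table",
}

for name, value in sorted(relabel_table_metric(sample).items()):
    print(f"{name} = {value}")
```

A path with only one segment after table, such as org.apache.cassandra.metrics.table.read_latency, does not match this pattern. In the configuration, such aggregate series are removed by the preceding drop rule.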
Enable and configure extension
- In Dynatrace Hub, select Azure Managed Instance for Apache Cassandra.
- Enable the extension.
- Verify that the Prometheus endpoint publishes the Cassandra metrics. Use either of these queries:

    {__name__=~"mcac.*"}
    http://<Prometheus Server URL>:9090/api/v1/query?query=%7B__name__%3D%7E%22mcac.*%22%7D

- Add the endpoint of your Prometheus server to the Extension Monitoring Configuration:

    http://<Prometheus Server URL>:9090/api/v1

  The <Prometheus Server URL> does not need to be public. If you install your ActiveGate on the same VM or same VNet as the Prometheus server, localhost or a private IP can be used.
- Select the ActiveGate group on which to enable this extension.
- Add a Monitoring Configuration description and select the Feature Sets of the metrics you'd like to collect.
- A dashboard named Azure Managed Instance for Apache Cassandra Overview is provided with the extension.
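
The URL form of the verification query is the PromQL selector percent-encoded into the query parameter of the Prometheus HTTP API. A minimal Python sketch of that encoding follows. The helper build_query_url is hypothetical, and http://localhost:9090 is a placeholder for your Prometheus server address.

```python
from urllib.parse import quote, unquote

# PromQL selector from the verification step: every series whose name starts with "mcac".
PROMQL = '{__name__=~"mcac.*"}'


def build_query_url(base_url, promql=PROMQL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    encoded = quote(promql, safe="")  # percent-encode reserved characters
    return f"{base_url.rstrip('/')}/api/v1/query?query={encoded}"


# Placeholder host; substitute the address of your Prometheus server.
url = build_query_url("http://localhost:9090")
print(url)

# Decoding the query parameter gives back the original selector.
print(unquote(url.split("query=", 1)[1]))
```

Python may encode individual characters differently from the URL shown in the verification step (for example, it leaves ~ unescaped), but both forms decode to the same selector.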
Metrics
Available metrics are listed below.
- Metric metadata and dimensions are available using Data Explorer after the extension is enabled.
- See Apache Cassandra Monitoring Documentation for more information about collected metrics.