Remotely monitor your Confluent Cloud Kafka Clusters and other resources.





Monitor your Confluent Cloud Kafka Clusters, Connectors, Schema Registries, and KSQL DB Applications. Every minute, the extension ingests performance data for your Confluent resources via the Confluent-provided API.
The Kafka Lag Partition Metrics and Kafka Lag Consumer Group Metrics feature sets are not provided by the Confluent API. To obtain these metrics, you need Kafka Lag Exporter. The exporter isn't supported by Dynatrace and needs to be set up and run independently of this extension. Currently, it's the only exporter supported by this extension.
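If you run Kafka Lag Exporter yourself, you can confirm that its Prometheus endpoint exposes the lag metrics listed in the tables below before pointing a monitoring configuration at it. A minimal sketch, assuming Python 3 and a deployment-specific exporter address (the host, port, and path below are placeholders):

```python
# Minimal sketch: confirm that a self-hosted Kafka Lag Exporter instance
# exposes the consumer-group and partition lag metrics this extension expects.
# EXPORTER_URL is a placeholder; use your own host, port, and metrics path.
import urllib.request

EXPORTER_URL = "http://my-lag-exporter:8000/metrics"  # placeholder address

EXPECTED = {
    "kafka_consumergroup_group_lag",
    "kafka_consumergroup_group_max_lag",
    "kafka_partition_latest_offset",
}

with urllib.request.urlopen(EXPORTER_URL, timeout=10) as resp:
    body = resp.read().decode("utf-8")

# Prometheus text format: "name{labels} value [timestamp]" or "name value"
exposed = {
    line.split("{")[0].split(" ")[0]
    for line in body.splitlines()
    if line and not line.startswith("#")
}

missing = EXPECTED - exposed
print("missing metrics:", ", ".join(sorted(missing)) or "none")
```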
Find the extension in Dynatrace Hub and add it to your environment.
Create a new monitoring configuration. For more information, see Manage Prometheus extensions.
In the Dynatrace monitoring configuration, the Confluent Cloud API key and API secret are used as the Basic Auth user (API key) / password (API secret) combination.
First, you need to create either a Cloud or a Cluster API key and secret. This can be done via the Confluent UI or via the Confluent CLI. The MetricsViewer role is required to access the Confluent API. We suggest granting this role at the Organization scope so the key continues to work as clusters are created or destroyed.
The endpoint for the extension is a URL with your resource types and IDs appended, similar to the example below. A single URL supports multiple resources, but we recommend including between 5 and 10 resources per URL (see the sketch after the table below).
https://api.telemetry.confluent.cloud/v2/metrics/cloud/export?resource.kafka.id=lkc-XXXXX&resource.connector.id=lcc-XXXX1&resource.connector.id=lcc-XXXX2
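Before creating the monitoring configuration, you may want to confirm that the key, secret, and resource IDs work. A minimal sketch, assuming Python 3 and placeholder credentials and resource IDs, that calls the export endpoint with the same Basic Auth scheme the extension uses:

```python
# Minimal sketch: call the Metrics API export endpoint with HTTP Basic Auth,
# the same scheme the extension's monitoring configuration uses.
# The API key, secret, and resource IDs are placeholders.
import base64
import urllib.request

API_KEY = "YOUR_API_KEY"        # Basic Auth user
API_SECRET = "YOUR_API_SECRET"  # Basic Auth password
URL = (
    "https://api.telemetry.confluent.cloud/v2/metrics/cloud/export"
    "?resource.kafka.id=lkc-XXXXX&resource.connector.id=lcc-XXXX1"
)

token = base64.b64encode(f"{API_KEY}:{API_SECRET}".encode()).decode()
request = urllib.request.Request(URL, headers={"Authorization": f"Basic {token}"})

with urllib.request.urlopen(request, timeout=30) as resp:
    print("HTTP status:", resp.status)        # expect 200 with a MetricsViewer-scoped key
    print(resp.read().decode("utf-8")[:500])  # first lines of the Prometheus text output
```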
Base URL
https://api.telemetry.confluent.cloud/v2/metrics/cloud/export?

| Resource | URL parameter |
|---|---|
| Confluent Kafka Cluster | resource.kafka.id=lkc-XXXXX |
| Confluent Kafka Schema Registry | resource.schema_registry.id=lsrc-XXXXX |
| Confluent Kafka Connector | resource.connector.id=lcc-XXXXX |
| Confluent Kafka KSQL DB Application | resource.ksql.id=lksqlc-XXXXX |
| Confluent Kafka Compute Pool | resource.compute_pool.id=lfcp-XXXXX |
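As a reference, here is a minimal sketch (in Python, with placeholder resource IDs) of how export URLs can be assembled from the parameters in the table above while keeping at most 10 resources per URL, as recommended earlier:

```python
# Minimal sketch: assemble export URLs from a list of resources, keeping at
# most 10 resources per URL. Parameter names come from the table above; the
# resource IDs are placeholders.
from urllib.parse import urlencode

BASE = "https://api.telemetry.confluent.cloud/v2/metrics/cloud/export"
MAX_PER_URL = 10

resources = [
    ("resource.kafka.id", "lkc-XXXXX"),
    ("resource.connector.id", "lcc-XXXX1"),
    ("resource.connector.id", "lcc-XXXX2"),
    ("resource.schema_registry.id", "lsrc-XXXXX"),
    ("resource.ksql.id", "lksqlc-XXXXX"),
]

urls = [
    f"{BASE}?{urlencode(resources[i:i + MAX_PER_URL])}"
    for i in range(0, len(resources), MAX_PER_URL)
]
for url in urls:
    print(url)
```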
This extension uses the Confluent Metric Export API to gather metrics. This API has a fixed 5-minute offset that the extension currently does not account for, so metrics are out of sync by 5 minutes between Dynatrace and Confluent. For more information, see Timestamp offset in the Confluent Metric Export API documentation.
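If you want to observe the offset yourself, the sketch below (same placeholder credentials and URL as above) compares the newest sample timestamp in the export output with the current time. It assumes the export output appends millisecond timestamps to its sample lines; if it doesn't, the script simply reports that none were found.

```python
# Minimal sketch: check the export offset by comparing the newest sample
# timestamp against the current time. Assumption: sample lines end with a
# 13-digit epoch-millisecond timestamp. Credentials and URL are placeholders.
import base64
import time
import urllib.request

API_KEY, API_SECRET = "YOUR_API_KEY", "YOUR_API_SECRET"
URL = ("https://api.telemetry.confluent.cloud/v2/metrics/cloud/export"
       "?resource.kafka.id=lkc-XXXXX")

token = base64.b64encode(f"{API_KEY}:{API_SECRET}".encode()).decode()
request = urllib.request.Request(URL, headers={"Authorization": f"Basic {token}"})
with urllib.request.urlopen(request, timeout=30) as resp:
    lines = resp.read().decode("utf-8").splitlines()

timestamps = []
for line in lines:
    if line and not line.startswith("#"):
        parts = line.rsplit(" ", 2)
        # "name{labels} value timestamp_ms" -> a 13-digit epoch-millisecond field
        if len(parts) == 3 and parts[2].isdigit() and len(parts[2]) == 13:
            timestamps.append(int(parts[2]) / 1000.0)

if timestamps:
    lag_min = (time.time() - max(timestamps)) / 60
    print(f"newest sample is ~{lag_min:.1f} minutes behind wall-clock time")
else:
    print("no per-sample timestamps found in the export output")
```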
When activating your extension using a monitoring configuration, you can limit monitoring to one of the feature sets. To work properly, the extension has to collect at least one metric after activation.
In highly segmented networks, feature sets can reflect the segments of your environment. Then, when you create a monitoring configuration, you can select a feature set and a corresponding ActiveGate group that can connect to this particular segment.
All metrics that aren't categorized into any feature set are considered to be the default and are always reported.
A metric inherits the feature set of a subgroup, which in turn inherits the feature set of a group. Also, the feature set defined on the metric level overrides the feature set defined on the subgroup level, which in turn overrides the feature set defined on the group level.
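The precedence rule can be summarized as "most specific wins". A small illustrative sketch (the function name is made up for illustration, not part of the extension):

```python
# Illustrative sketch of the precedence rule above: the feature set defined
# closest to the metric wins, and metrics without any feature set fall into
# the default set, which is always reported.
def effective_feature_set(metric_fs=None, subgroup_fs=None, group_fs=None):
    return metric_fs or subgroup_fs or group_fs or "default"

print(effective_feature_set(group_fs="kafka_cluster"))                             # kafka_cluster
print(effective_feature_set(metric_fs="kafka_connect", group_fs="kafka_cluster"))  # kafka_connect
print(effective_feature_set())                                                     # default
```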
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Connect Sent Records | confluent_kafka_connect_sent_records | The delta count of the total number of records sent from the transformations and written to Kafka for the source connector. Each sample is the number of records sent since the previous data point. |
| Kafka Connect Received Records | confluent_kafka_connect_received_records | The delta count of the total number of records received by the sink connector. Each sample is the number of records received since the previous data point. |
| Kafka Connect Sent Bytes | confluent_kafka_connect_sent_bytes | The delta count of total bytes sent from the transformations and written to Kafka for the source connector. Each sample is the number of bytes sent since the previous data point. |
| Kafka Connect Received Bytes | confluent_kafka_connect_received_bytes | The delta count of total bytes received by the sink connector. Each sample is the number of bytes received since the previous data point. |
| Kafka Connect Dead Letter Queue Records | confluent_kafka_connect_dead_letter_queue_records | The delta count of dead letter queue records written to Kafka for the sink connector. |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Consumer Group Group Topic Sum Lag | kafka_consumergroup_group_topic_sum_lag | Sum of group offset lag across topic partitions |
| Kafka Consumer Group Poll Time (ms) | kafka_consumergroup_poll_time_ms | Group poll time |
| Kafka Consumer Group Group Offset | kafka_consumergroup_group_offset | Last group consumed offset of a partition |
| Kafka Consumer Group Group Sum Lag | kafka_consumergroup_group_sum_lag | Sum of group offset lag |
| Kafka Consumer Group Group Lag | kafka_consumergroup_group_lag | Group offset lag of a partition |
| Kafka Consumer Group Group Lag Seconds | kafka_consumergroup_group_lag_seconds | Group time lag of a partition |
| Kafka Consumer Group Group Max Lag | kafka_consumergroup_group_max_lag | Max group offset lag |
| Kafka Consumer Group Group Max Lag Seconds | kafka_consumergroup_group_max_lag_seconds | Max group time lag |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Server Cluster Link Destination Response Bytes | confluent_kafka_server_cluster_link_destination_response_bytes | The delta count of cluster linking response bytes from all request types. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds. |
| Kafka Server Cluster Link Source Response Bytes | confluent_kafka_server_cluster_link_source_response_bytes | The delta count of cluster linking source response bytes from all request types. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds. |
| Kafka Server Cluster Link Count | confluent_kafka_server_cluster_link_count.gauge | The current count of cluster links. The count is sampled every 60 seconds. The implicit time aggregation for this metric is MAX. |
| Kafka Server Cluster Link Mirror Topic Count | confluent_kafka_server_cluster_link_mirror_topic_count.gauge | The cluster linking mirror topic count for a link. The count is sampled every 60 seconds. |
| Kafka Server Cluster Link Mirror Topic Offset Lag | confluent_kafka_server_cluster_link_mirror_topic_offset_lag | The cluster linking mirror topic offset lag maximum across all partitions. The lag is sampled every 60 seconds. |
| Kafka Server Cluster Link Mirror Topic Bytes | confluent_kafka_server_cluster_link_mirror_topic_bytes | The delta count of cluster linking mirror topic bytes. The count is sampled every 60 seconds. |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Cluster Request Bytes | confluent_kafka_server_request_bytes | The delta count of total request bytes from the specified request types sent over the network. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds. |
| Kafka Cluster Response Bytes | confluent_kafka_server_response_bytes | The delta count of total response bytes from the specified response types sent over the network. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds. |
| Kafka Cluster Active Connection Count | confluent_kafka_server_active_connection_count.gauge | The count of active authenticated connections. |
| Kafka Cluster Request Count | confluent_kafka_server_request_count.gauge | The number of requests received over the network. |
| Kafka Cluster Successful Authentication Count | confluent_kafka_server_successful_authentication_count.gauge | The number of successful authentications. |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Cluster Received Bytes | confluent_kafka_server_received_bytes | The number of bytes of the customer's data received from the network. |
| Kafka Cluster Sent Bytes | confluent_kafka_server_sent_bytes | The number of bytes of the customer's data sent over the network. |
| Kafka Cluster Received Records | confluent_kafka_server_received_records | The number of records received. |
| Kafka Cluster Sent Records | confluent_kafka_server_sent_records | The number of records sent. |
| Kafka Cluster Retained Bytes | confluent_kafka_server_retained_bytes | The current number of bytes retained by the cluster. |
| Kafka Cluster Partition Count | confluent_kafka_server_partition_count.gauge | The number of partitions. |
| Kafka Cluster Load Raw | confluent_kafka_server_cluster_load_percent | A measure of the utilization of the cluster. The value is between 0.0 and 1.0. |
| Metric name | Metric key | Description |
|---|---|---|
| Confluent Flink Num Records In | confluent_flink_num_records_in | Total number of records this statement has received. |
| Confluent Flink Num Records Out | confluent_flink_num_records_out | Total number of records this statement has emitted. |
| Confluent Flink Pending Records | confluent_flink_pending_records | Total number of available records after the consumer offset in a Kafka partition, across all operators. |
| Confluent Flink Current Input Watermark Milliseconds | confluent_flink_current_input_watermark_milliseconds | The last watermark this statement has received (in milliseconds) for the given table. |
| Confluent Flink Current Output Watermark Milliseconds | confluent_flink_current_output_watermark_milliseconds | The last watermark this statement has produced (in milliseconds) to the given table. |
| Metric name | Metric key | Description |
|---|---|---|
| Confluent Flink Compute Pool Utilization Current CFUs | confluent_flink_compute_pool_utilization_current_cfus | The absolute number of CFUs at a given moment |
| Confluent Flink Compute Pool Utilization CFU Minutes Consumed | confluent_flink_compute_pool_utilization_cfu_minutes_consumed | The number of CFUs consumed since the last measurement |
| Confluent Flink Compute Pool Utilization CFU Limit | confluent_flink_compute_pool_utilization_cfu_limit | The maximum possible number of CFUs for the pool |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Server Consumer Lag Offsets | confluent_kafka_server_consumer_lag_offsets | The lag between a group member's committed offset and the partition's high watermark |
| Metric name | Metric key | Description |
|---|---|---|
| Confluent Flink Statement Status | confluent_flink_statement_status | This metric monitors the status of a statement within the system. Its value is always set to 1, signifying the statement's presence. The statement's current operational state is identified through the metric.status tag. |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Partition Earliest Offset | kafka_partition_earliest_offset | Earliest offset of a partition |
| Kafka Partition Latest Offset | kafka_partition_latest_offset | Latest offset of a partition |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Ksql Streaming Unit Count | confluent_kafka_ksql_streaming_unit_count.gauge | The count of Confluent Streaming Units (CSUs) for this KSQL instance. The implicit time aggregation for this metric is MAX. |
| Kafka Ksql Query Saturation | confluent_kafka_ksql_query_saturation | The maximum saturation for a given ksqlDB query across all nodes. Returns a value between 0 and 1; a value close to 1 indicates that ksqlDB query processing is bottlenecked on available resources. |
| Kafka Ksql Task Stored Bytes | confluent_kafka_ksql_task_stored_bytes | The size of a given task's state stores in bytes. |
| Kafka Ksql Storage Utilization | confluent_kafka_ksql_storage_utilization | The total storage utilization for a given ksqlDB application. |
| Metric name | Metric key | Description |
|---|---|---|
| Kafka Schema Registry Schema Count | confluent_kafka_schema_registry_schema_count.gauge | The number of registered schemas. |
| Kafka Schema Registry Request Count | confluent_kafka_schema_registry_request_count.gauge | The delta count of requests received by the schema registry server. Each sample is the number of requests received since the previous data point. The count is sampled every 60 seconds. |