Apache Spark extension

  • Latest Dynatrace
  • Extension
  • Published Oct 27, 2025

Enhanced insights for Spark Components


Get started

Overview

The Apache Spark extension collects JMX metrics to provide insights into Spark performance.

JMX metrics provide insights into resource usage, job and application status, and the performance of your Spark components.

Apache Spark metrics are presented alongside other infrastructure measurements, enabling in-depth cluster performance analysis of both current and historical data.

Use cases

The extension enables insights into the overall health of Spark component instances.

Requirements

Dynatrace: Activate this extension in your Dynatrace environment from the in-product Hub, then select the OneAgents on which to enable it.

Spark: You must configure the required component metrics to be reported to the JMX sink.

  • The metrics system is configured via a configuration file that Spark expects to be present at $SPARK_HOME/conf/metrics.properties. For example, you can enable metric collection to the JmxSink for the master, worker, driver, and executor components with a command such as:
cat << EOF > $SPARK_HOME/conf/metrics.properties
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
EOF

Refer to the Spark documentation for more details.
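After writing the file, it can be worth confirming that the JMX sink line is actually present and not commented out. The following is a minimal sketch and not part of the extension: the function name check_jmx_sink is hypothetical, and the check only inspects the properties file, not the running Spark process.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Spark or Dynatrace).
# Succeeds (exit 0) if the given metrics.properties file contains an
# uncommented sink line that points at Spark's JmxSink class.
# Returns 2 if the file does not exist, 1 if no such line is found.
check_jmx_sink() {
    props_file="$1"
    [ -f "$props_file" ] || return 2
    grep -Eq '^[[:space:]]*[^#[:space:]]+\.sink\.jmx\.class[[:space:]]*=[[:space:]]*org\.apache\.spark\.metrics\.sink\.JmxSink[[:space:]]*$' "$props_file"
}
```

A typical call would be check_jmx_sink "$SPARK_HOME/conf/metrics.properties" && echo "JmxSink enabled". Spark components read this file at startup, so they generally need a restart before a change takes effect.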

Compatibility information

  • Dynatrace OneAgent 1.275+
  • Linux or Windows OS
  • Spark version 3.x
  • Spark JMX metrics are enabled

Details

This extension can query and collect almost all component instance and namespace metrics as defined in Spark Metric Providers:

  • Driver
  • Executor
  • applicationMaster (YARN)
  • mesos_cluster
  • master
  • ApplicationSource (Spark standalone)
  • worker
  • shuffleservice

Feature sets

When activating your extension using a monitoring configuration, you can limit monitoring to one of the feature sets. To work properly, the extension has to collect at least one metric after activation.

In highly segmented networks, feature sets can reflect the segments of your environment. Then, when you create a monitoring configuration, you can select a feature set and a corresponding ActiveGate group that can connect to this particular segment.

All metrics that aren't categorized into any feature set are considered to be the default and are always reported.

A metric inherits the feature set of a subgroup, which in turn inherits the feature set of a group. Also, the feature set defined on the metric level overrides the feature set defined on the subgroup level, which in turn overrides the feature set defined on the group level.
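The inheritance and override order described above can be sketched as a small lookup. This is an illustration only, not how the extension is implemented: the function name resolve_feature_set and the feature set names in the example are hypothetical, and an empty argument stands for "no feature set defined at this level".

```shell
#!/bin/sh
# Hypothetical helper illustrating the precedence rule:
# a metric-level feature set wins over the subgroup level, which wins
# over the group level. If none is defined, the metric belongs to the
# default set, which is always reported.
resolve_feature_set() {
    metric_fs="$1"
    subgroup_fs="$2"
    group_fs="$3"
    if [ -n "$metric_fs" ]; then
        printf '%s\n' "$metric_fs"
    elif [ -n "$subgroup_fs" ]; then
        printf '%s\n' "$subgroup_fs"
    elif [ -n "$group_fs" ]; then
        printf '%s\n' "$group_fs"
    else
        printf '%s\n' "default"
    fi
}
```

For example, resolve_feature_set "" "executor" "driver" yields the subgroup value, because no metric-level feature set is defined and the subgroup overrides the group.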

Metric name | Metric key | Description
master.workers | spark.master.workers |
master.aliveWorkers | spark.master.aliveWorkers |
master.apps | spark.master.apps |
master.waitingApps | spark.master.waitingApps |

Metric name | Metric key | Description
worker.executors | spark.worker.executors |
spark.worker.coresUsed | spark.worker.coresUsed |
spark.worker.memUsed_MB | spark.worker.memUsed_MB |
spark.worker.coresFree | spark.worker.coresFree |
spark.worker.memFree_MB | spark.worker.memFree_MB |

Metric name | Metric key | Description
mesos_cluster.waitingDrivers | spark.mesos_cluster.waitingDrivers |
mesos_cluster.launchedDrivers | spark.mesos_cluster.launchedDrivers |
mesos_cluster.retryDrivers | spark.mesos_cluster.retryDrivers |

Metric name | Metric key | Description
executor.bytesRead.count | spark.executor.bytesRead.count |
executor.bytesWritten.count | spark.executor.bytesWritten.count |
executor.cpuTime.count | spark.executor.cpuTime.count |
executor.filesystem.file.largeRead_ops | spark.executor.filesystem.file.largeRead_ops |
spark.executor.filesystem.file.read_bytes | spark.executor.filesystem.file.read_bytes |
spark.executor.filesystem.file.read_ops | spark.executor.filesystem.file.read_ops |
spark.executor.filesystem.file.write_bytes | spark.executor.filesystem.file.write_bytes |
executor.filesystem.file.write_ops | spark.executor.filesystem.file.write_ops |
executor.recordsRead.count | spark.executor.recordsRead.count |
executor.recordsWritten.count | spark.executor.recordsWritten.count |
executor.succeededTasks.count | spark.executor.succeededTasks.count |

Metric name | Metric key | Description
applicationMaster.numContainersPendingAllocate | spark.applicationMaster.numContainersPendingAllocate |
spark.applicationMaster.numExecutorsFailed | spark.applicationMaster.numExecutorsFailed |
applicationMaster.numExecutorsRunning | spark.applicationMaster.numExecutorsRunning |
applicationMaster.numLocalityAwareTasks | spark.applicationMaster.numLocalityAwareTasks |
applicationMaster.numReleasedContainers | spark.applicationMaster.numReleasedContainers |

Metric name | Metric key | Description
streaming.inputRate-total | spark.streaming.inputRate-total |
streaming.latency | spark.streaming.latency |
streaming.processingRate-total | spark.streaming.processingRate-total |
streaming.states-rowsTotal | spark.streaming.states-rowsTotal |
streaming.states-usedBytes | spark.streaming.states-usedBytes |

Metric name | Metric key | Description
shuffleService.numActiveConnections.count | spark.shuffleService.numActiveConnections.count |
shuffleService.numRegisteredConnections.count | spark.shuffleService.numRegisteredConnections.count |
shuffleService.numCaughtExceptions.count | spark.shuffleService.numCaughtExceptions.count |
shuffleService.registeredExecutorsSize | spark.shuffleService.registeredExecutorsSize |

Metric name | Metric key | Description
appStatus.stages.failedStages.count | spark.appStatus.stages.failedStages.count |
appStatus.stages.skippedStages.count | spark.appStatus.stages.skippedStages.count |
appStatus.stages.completedStages.count | spark.appStatus.stages.completedStages.count |
appStatus.tasks.excludedExecutors.count | spark.appStatus.tasks.excludedExecutors.count |
appStatus.tasks.completedTasks.count | spark.appStatus.tasks.completedTasks.count |
appStatus.tasks.failedTasks.count | spark.appStatus.tasks.failedTasks.count |
appStatus.tasks.killedTasks.count | spark.appStatus.tasks.killedTasks.count |
appStatus.tasks.skippedTasks.count | spark.appStatus.tasks.skippedTasks.count |
spark.appStatus.tasks.unexcludedExecutors.count | spark.appStatus.tasks.unexcludedExecutors.count |
appStatus.jobs.succeededJobs.count | spark.appStatus.jobs.succeededJobs.count |
appStatus.jobs.failedJobs.count | spark.appStatus.jobs.failedJobs.count |
appStatus.jobs.jobDuration | spark.appStatus.jobs.jobDuration |

Metric name | Metric key | Description
BlockManager.disk.diskSpaceUsed_MB | spark.BlockManager.disk.diskSpaceUsed_MB |
BlockManager.memory.maxMem_MB | spark.BlockManager.memory.maxMem_MB |
BlockManager.memory.maxOffHeapMem_MB | spark.BlockManager.memory.maxOffHeapMem_MB |
BlockManager.memory.maxOnHeapMem_MB | spark.BlockManager.memory.maxOnHeapMem_MB |
BlockManager.memory.memUsed_MB | spark.BlockManager.memory.memUsed_MB |
BlockManager.memory.offHeapMemUsed_MB | spark.BlockManager.memory.offHeapMemUsed_MB |
BlockManager.memory.onHeapMemUsed_MB | spark.BlockManager.memory.onHeapMemUsed_MB |
BlockManager.memory.remainingMem_MB | spark.BlockManager.memory.remainingMem_MB |
BlockManager.memory.remainingOffHeapMem_MB | spark.BlockManager.memory.remainingOffHeapMem_MB |
BlockManager.memory.remainingOnHeapMem_MB | spark.BlockManager.memory.remainingOnHeapMem_MB |

Metric name | Metric key | Description
ApplicationSource.status | spark.ApplicationSource.status |
ApplicationSource.runtime_ms | spark.ApplicationSource.runtime_ms |
ApplicationSource.cores | spark.ApplicationSource.cores |

Metric name | Metric key | Description
HiveExternalCatalog.fileCacheHits.count | spark.HiveExternalCatalog.fileCacheHits.count |
HiveExternalCatalog.filesDiscovered.count | spark.HiveExternalCatalog.filesDiscovered.count |
HiveExternalCatalog.hiveClientCalls.count | spark.HiveExternalCatalog.hiveClientCalls.count |
HiveExternalCatalog.parallelListingJobCount.count | spark.HiveExternalCatalog.parallelListingJobCount.count |
HiveExternalCatalog.partitionsFetched.count | spark.HiveExternalCatalog.partitionsFetched.count |

Metric name | Metric key | Description
LiveListenerBus.numEventsPosted.count | spark.LiveListenerBus.numEventsPosted.count |
LiveListenerBus.queue.appStatus.numDroppedEvents.count | spark.LiveListenerBus.queue.appStatus.numDroppedEvents.count |
LiveListenerBus.queue.appStatus.size | spark.LiveListenerBus.queue.appStatus.size |
LiveListenerBus.queue.eventLog.numDroppedEvents.count | spark.LiveListenerBus.queue.eventLog.numDroppedEvents.count |
LiveListenerBus.queue.eventLog.size | spark.LiveListenerBus.queue.eventLog.size |

Metric name | Metric key | Description
DAGScheduler.job.activeJobs | spark.DAGScheduler.job.activeJobs |
DAGScheduler.job.allJobs | spark.DAGScheduler.job.allJobs |
DAGScheduler.messageProcessingTime.count | spark.DAGScheduler.messageProcessingTime.count |
DAGScheduler.messageProcessingTime.oneminuterate | spark.DAGScheduler.messageProcessingTime.oneminuterate |
DAGScheduler.messageProcessingTime.mean | spark.DAGScheduler.messageProcessingTime.mean |
DAGScheduler.stage.failedStages | spark.DAGScheduler.stage.failedStages |
spark.DAGScheduler.stage.runningStages | spark.DAGScheduler.stage.runningStages |
DAGScheduler.stage.waitingStages | spark.DAGScheduler.stage.waitingStages |

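In the tables above, every metric key is the metric name with a spark. prefix, and a few names are already listed with that prefix. When scripting against these keys, a small normalizer along the following lines may help. It is a sketch based only on the pattern visible in the tables; the function name metric_key is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: derive the metric key from a metric name as
# listed in the tables. Names that already start with "spark." are
# returned unchanged; all others get the "spark." prefix.
metric_key() {
    case "$1" in
        spark.*) printf '%s\n' "$1" ;;
        *)       printf '%s\n' "spark.$1" ;;
    esac
}
```

For instance, metric_key master.workers and metric_key spark.worker.coresUsed both return keys in the form shown in the Metric key column.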
Related tags
Analytics, JMX, Data Processing/Analytics, Apache, Infrastructure Observability