Apache Spark extension

  • Latest Dynatrace
  • Extension
  • Published Oct 27, 2025

Monitor Apache Spark cluster performance with JMX metrics for jobs, applications, executors, and resource usage on standalone or YARN clusters.

Overview

The Apache Spark extension collects JMX metrics to provide insights into Spark performance.

These metrics cover resource usage, job and application status, and the performance of your Spark components.

Apache Spark metrics are presented alongside other infrastructure measurements, enabling in-depth cluster performance analysis of both current and historical data.

Use cases

The extension enables insights into the overall health of Spark component instances.

Requirements

Dynatrace: Activate this extension in your Dynatrace environment from the in-product Hub, then select the OneAgents on which to enable it.

Spark: You must configure the required component metrics to be reported to the JMX sink.

  • The metrics system is configured via a configuration file that Spark expects to be present at $SPARK_HOME/conf/metrics.properties. For example, you can enable metric collection to the JmxSink for the master, worker, driver, and executor components with a command such as:
cat << 'EOF' > "$SPARK_HOME/conf/metrics.properties"
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
EOF

Refer to the Spark documentation for more details.
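Before activating the extension, it can help to confirm that the file actually contains the expected entries. The following is a minimal sketch, not part of Spark or Dynatrace: check_metrics_conf is a hypothetical helper name, and the demo runs against a throwaway copy of the file so that no real Spark installation is modified.

```shell
#!/usr/bin/env bash
# check_metrics_conf is a hypothetical helper (an assumption for illustration).
# It reports whether a metrics.properties file enables the JMX sink and which
# of the master, worker, driver, and executor components have the JVM source.
check_metrics_conf() {
  local conf_file="$1"
  local component

  # The sink line starts with a literal asterisk, so it is escaped in the pattern.
  if grep -q '^\*\.sink\.jmx\.class=org\.apache\.spark\.metrics\.sink\.JmxSink' "$conf_file"; then
    echo "jmx sink: enabled"
  else
    echo "jmx sink: missing"
  fi

  for component in master worker driver executor; do
    if grep -q "^${component}\.source\.jvm\.class=" "$conf_file"; then
      echo "${component} jvm source: enabled"
    else
      echo "${component} jvm source: missing"
    fi
  done
}

# Demo against a temporary copy of the configuration shown above.
demo_dir="$(mktemp -d)"
cat << 'EOF' > "$demo_dir/metrics.properties"
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
EOF
check_metrics_conf "$demo_dir/metrics.properties"
rm -rf "$demo_dir"
```

To check a real installation, pass "$SPARK_HOME/conf/metrics.properties" to the function instead of the temporary file.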

Compatibility information

  • Dynatrace OneAgent 1.275+
  • Linux or Windows OS
  • Spark version 3.x
  • Spark JMX metrics are enabled

Details

This extension can query and collect almost all component instance and namespace metrics as defined in Spark Metric Providers:

  • driver
  • executor
  • applicationMaster (YARN)
  • mesos_cluster
  • master
  • ApplicationSource (Spark standalone)
  • worker
  • shuffleService

Feature sets

When activating your extension using a monitoring configuration, you can limit monitoring to one of the feature sets. To work properly, the extension has to collect at least one metric after activation.

In highly segmented networks, feature sets can reflect the segments of your environment. Then, when you create a monitoring configuration, you can select a feature set and a corresponding ActiveGate group that can connect to this particular segment.

All metrics that aren't categorized into any feature set are considered to be the default and are always reported.

A metric inherits the feature set of a subgroup, which in turn inherits the feature set of a group. Also, the feature set defined on the metric level overrides the feature set defined on the subgroup level, which in turn overrides the feature set defined on the group level.
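The precedence described above can be sketched as a small lookup: the most specific level that defines a feature set wins, and a metric with no feature set at any level falls back to the default. This is only an illustration of the rule, not extension code; resolve_feature_set is a hypothetical helper, and the feature set names passed to it are examples.

```shell
#!/usr/bin/env bash
# resolve_feature_set is a hypothetical helper (an assumption for illustration).
# Arguments: the feature set defined at the metric, subgroup, and group level.
# An empty string means that no feature set is defined at that level.
resolve_feature_set() {
  local metric_level="$1"
  local subgroup_level="$2"
  local group_level="$3"

  if [ -n "$metric_level" ]; then
    echo "$metric_level"      # metric level overrides subgroup and group
  elif [ -n "$subgroup_level" ]; then
    echo "$subgroup_level"    # subgroup level overrides group
  elif [ -n "$group_level" ]; then
    echo "$group_level"       # inherited from the group
  else
    echo "default"            # uncategorized metrics are always reported
  fi
}

resolve_feature_set "" "" "executor"                    # prints: executor
resolve_feature_set "" "streaming" "executor"           # prints: streaming
resolve_feature_set "appStatus" "streaming" "executor"  # prints: appStatus
resolve_feature_set "" "" ""                            # prints: default
```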

ApplicationSource

Metric name | Metric key | Description
ApplicationSource.status | spark.ApplicationSource.status | -
ApplicationSource.runtime_ms | spark.ApplicationSource.runtime_ms | -
ApplicationSource.cores | spark.ApplicationSource.cores | -

HiveExternalCatalog

Metric name | Metric key | Description
HiveExternalCatalog.fileCacheHits.count | spark.HiveExternalCatalog.fileCacheHits.count | -
HiveExternalCatalog.filesDiscovered.count | spark.HiveExternalCatalog.filesDiscovered.count | -
HiveExternalCatalog.hiveClientCalls.count | spark.HiveExternalCatalog.hiveClientCalls.count | -
HiveExternalCatalog.parallelListingJobCount.count | spark.HiveExternalCatalog.parallelListingJobCount.count | -
HiveExternalCatalog.partitionsFetched.count | spark.HiveExternalCatalog.partitionsFetched.count | -

LiveListenerBus

Metric name | Metric key | Description
LiveListenerBus.numEventsPosted.count | spark.LiveListenerBus.numEventsPosted.count | -
LiveListenerBus.queue.appStatus.numDroppedEvents.count | spark.LiveListenerBus.queue.appStatus.numDroppedEvents.count | -
LiveListenerBus.queue.appStatus.size | spark.LiveListenerBus.queue.appStatus.size | -
LiveListenerBus.queue.eventLog.numDroppedEvents.count | spark.LiveListenerBus.queue.eventLog.numDroppedEvents.count | -
LiveListenerBus.queue.eventLog.size | spark.LiveListenerBus.queue.eventLog.size | -

DAGScheduler

Metric name | Metric key | Description
DAGScheduler.job.activeJobs | spark.DAGScheduler.job.activeJobs | -
DAGScheduler.job.allJobs | spark.DAGScheduler.job.allJobs | -
DAGScheduler.messageProcessingTime.count | spark.DAGScheduler.messageProcessingTime.count | -
DAGScheduler.messageProcessingTime.oneminuterate | spark.DAGScheduler.messageProcessingTime.oneminuterate | -
DAGScheduler.messageProcessingTime.mean | spark.DAGScheduler.messageProcessingTime.mean | -
DAGScheduler.stage.failedStages | spark.DAGScheduler.stage.failedStages | -
spark.DAGScheduler.stage.runningStages | spark.DAGScheduler.stage.runningStages | -
DAGScheduler.stage.waitingStages | spark.DAGScheduler.stage.waitingStages | -

master

Metric name | Metric key | Description
master.workers | spark.master.workers | -
master.aliveWorkers | spark.master.aliveWorkers | -
master.apps | spark.master.apps | -
master.waitingApps | spark.master.waitingApps | -

worker

Metric name | Metric key | Description
worker.executors | spark.worker.executors | -
spark.worker.coresUsed | spark.worker.coresUsed | -
spark.worker.memUsed_MB | spark.worker.memUsed_MB | -
spark.worker.coresFree | spark.worker.coresFree | -
spark.worker.memFree_MB | spark.worker.memFree_MB | -

mesosCluster

Metric name | Metric key | Description
mesos_cluster.waitingDrivers | spark.mesos_cluster.waitingDrivers | -
mesos_cluster.launchedDrivers | spark.mesos_cluster.launchedDrivers | -
mesos_cluster.retryDrivers | spark.mesos_cluster.retryDrivers | -

executor

Metric name | Metric key | Description
executor.bytesRead.count | spark.executor.bytesRead.count | -
executor.bytesWritten.count | spark.executor.bytesWritten.count | -
executor.cpuTime.count | spark.executor.cpuTime.count | -
executor.filesystem.file.largeRead_ops | spark.executor.filesystem.file.largeRead_ops | -
spark.executor.filesystem.file.read_bytes | spark.executor.filesystem.file.read_bytes | -
spark.executor.filesystem.file.read_ops | spark.executor.filesystem.file.read_ops | -
spark.executor.filesystem.file.write_bytes | spark.executor.filesystem.file.write_bytes | -
executor.filesystem.file.write_ops | spark.executor.filesystem.file.write_ops | -
executor.recordsRead.count | spark.executor.recordsRead.count | -
executor.recordsWritten.count | spark.executor.recordsWritten.count | -
executor.succeededTasks.count | spark.executor.succeededTasks.count | -

applicationMaster

Metric name | Metric key | Description
applicationMaster.numContainersPendingAllocate | spark.applicationMaster.numContainersPendingAllocate | -
spark.applicationMaster.numExecutorsFailed | spark.applicationMaster.numExecutorsFailed | -
applicationMaster.numExecutorsRunning | spark.applicationMaster.numExecutorsRunning | -
applicationMaster.numLocalityAwareTasks | spark.applicationMaster.numLocalityAwareTasks | -
applicationMaster.numReleasedContainers | spark.applicationMaster.numReleasedContainers | -

spark.streaming

Metric name | Metric key | Description
streaming.inputRate-total | spark.streaming.inputRate-total | -
streaming.latency | spark.streaming.latency | -
streaming.processingRate-total | spark.streaming.processingRate-total | -
streaming.states-rowsTotal | spark.streaming.states-rowsTotal | -
streaming.states-usedBytes | spark.streaming.states-usedBytes | -

shuffleService

Metric name | Metric key | Description
shuffleService.numActiveConnections.count | spark.shuffleService.numActiveConnections.count | -
shuffleService.numRegisteredConnections.count | spark.shuffleService.numRegisteredConnections.count | -
shuffleService.numCaughtExceptions.count | spark.shuffleService.numCaughtExceptions.count | -
shuffleService.registeredExecutorsSize | spark.shuffleService.registeredExecutorsSize | -

appStatus

Metric name | Metric key | Description
appStatus.stages.failedStages.count | spark.appStatus.stages.failedStages.count | -
appStatus.stages.skippedStages.count | spark.appStatus.stages.skippedStages.count | -
appStatus.stages.completedStages.count | spark.appStatus.stages.completedStages.count | -
appStatus.tasks.excludedExecutors.count | spark.appStatus.tasks.excludedExecutors.count | -
appStatus.tasks.completedTasks.count | spark.appStatus.tasks.completedTasks.count | -
appStatus.tasks.failedTasks.count | spark.appStatus.tasks.failedTasks.count | -
appStatus.tasks.killedTasks.count | spark.appStatus.tasks.killedTasks.count | -
appStatus.tasks.skippedTasks.count | spark.appStatus.tasks.skippedTasks.count | -
spark.appStatus.tasks.unexcludedExecutors.count | spark.appStatus.tasks.unexcludedExecutors.count | -
appStatus.jobs.succeededJobs.count | spark.appStatus.jobs.succeededJobs.count | -
appStatus.jobs.failedJobs.count | spark.appStatus.jobs.failedJobs.count | -
appStatus.jobs.jobDuration | spark.appStatus.jobs.jobDuration | -

BlockManager

Metric name | Metric key | Description
BlockManager.disk.diskSpaceUsed_MB | spark.BlockManager.disk.diskSpaceUsed_MB | -
BlockManager.memory.maxMem_MB | spark.BlockManager.memory.maxMem_MB | -
BlockManager.memory.maxOffHeapMem_MB | spark.BlockManager.memory.maxOffHeapMem_MB | -
BlockManager.memory.maxOnHeapMem_MB | spark.BlockManager.memory.maxOnHeapMem_MB | -
BlockManager.memory.memUsed_MB | spark.BlockManager.memory.memUsed_MB | -
BlockManager.memory.offHeapMemUsed_MB | spark.BlockManager.memory.offHeapMemUsed_MB | -
BlockManager.memory.onHeapMemUsed_MB | spark.BlockManager.memory.onHeapMemUsed_MB | -
BlockManager.memory.remainingMem_MB | spark.BlockManager.memory.remainingMem_MB | -
BlockManager.memory.remainingOffHeapMem_MB | spark.BlockManager.memory.remainingOffHeapMem_MB | -
BlockManager.memory.remainingOnHeapMem_MB | spark.BlockManager.memory.remainingOnHeapMem_MB | -

Related tags

  • Analytics
  • JMX
  • Data Processing/Analytics
  • Apache
  • Infrastructure Observability