Elasticsearch monitoring
This extension documentation is now deprecated and will no longer be updated. We recommend using the new Elasticsearch extension for improved functionality and support.
Dynatrace Elasticsearch monitoring provides a high-level overview of all Elasticsearch components within each monitored cluster in your environment.
Elasticsearch health metrics tell you everything you need to know about the health of your monitored clusters. When a problem occurs, it’s easy to see which nodes are affected. And it’s easy to drill down into the metrics of individual nodes to find the root cause of problems and potential bottlenecks.
Prerequisites
Elasticsearch monitoring in Dynatrace requires:
- Elasticsearch 2.3+
- Linux OS or Windows
- OneAgent installed on all Elasticsearch nodes
Docker support
Dynatrace supports Elasticsearch running inside a Docker container with OneAgent version 1.157+.
Image
You need to have an Elasticsearch Docker image, version 6.0.1+.
Container configuration
- All instances must use the same port for the REST API (default: 9200). You can change this via the http.port variable.
- The REST port must be exposed to the host.
Example configuration:
docker run -p 45709:1200 -e "discovery.type=single-node" -e "http.port=1200" docker.elastic.co/elasticsearch/elasticsearch:6.0.1
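To make the port requirement concrete, here is a minimal Python sketch that derives the host-side REST URL from the -p mapping in the example configuration and checks that the container side of the mapping matches http.port. The function name host_rest_url and the localhost default are assumptions made for illustration; they are not part of Dynatrace or Docker. Only the simple HOST:CONTAINER form of the mapping is handled.

```python
def host_rest_url(port_mapping: str, http_port: int = 9200, host: str = "localhost") -> str:
    """Return the URL at which the Elasticsearch REST API is reachable from the host.

    port_mapping is the value passed to `docker run -p`, in HOST:CONTAINER form.
    http_port is the value of the http.port setting inside the container.
    """
    # Any other form (for example IP:HOST:CONTAINER) fails here with ValueError.
    host_port_text, container_port_text = port_mapping.split(":")
    host_port, container_port = int(host_port_text), int(container_port_text)

    # The container side of the mapping must be the port Elasticsearch listens on,
    # otherwise the REST API is not actually exposed to the host.
    if container_port != http_port:
        raise ValueError(
            f"mapping exposes container port {container_port}, "
            f"but http.port is {http_port}"
        )
    return f"http://{host}:{host_port}"


# Values taken from the example configuration above.
print(host_rest_url("45709:1200", http_port=1200))
```

With the example configuration, the REST API listens on port 1200 inside the container and is reached from the host on port 45709.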
Enabling Elasticsearch monitoring globally
With Elasticsearch monitoring enabled globally, Dynatrace automatically collects Elasticsearch metrics whenever a new host running Elasticsearch is detected in your environment.
- Go to Settings.
- Select Monitoring > Monitored technologies.
- Find the Elasticsearch entry and expand it for editing.
- User and Password are user credentials that have to work for all Elasticsearch hosts that you want to monitor. Leave them empty if no authentication is set up.
- URL is the Elasticsearch URL.
To configure Elasticsearch per host rather than globally, set the global Elasticsearch switch here to the Off position and click the host settings link to begin configuring Elasticsearch at the host level. See Enabling Elasticsearch monitoring for individual hosts below for details.
- Select Save to save any changes.
- Turn on the Elasticsearch switch to enable Elasticsearch monitoring globally.
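Before entering the User, Password, and URL values, it can help to confirm that they work against an Elasticsearch node. The following Python sketch builds a request for the Elasticsearch cluster health endpoint (_cluster/health) and attaches credentials only when a user is given, mirroring the "leave them empty" rule above. It assumes the cluster uses HTTP Basic authentication, which the settings page does not state; the function name build_health_request and the sample values are hypothetical.

```python
import base64
import urllib.request


def build_health_request(url: str, user: str = "", password: str = "") -> urllib.request.Request:
    """Build a GET request for the cluster health endpoint at the configured URL."""
    request = urllib.request.Request(url.rstrip("/") + "/_cluster/health")
    if user:
        # Assumption: the cluster accepts HTTP Basic authentication.
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical URL and credentials; use the values you plan to enter in Settings.
request = build_health_request("http://localhost:9200", "monitor", "secret")
print(request.full_url)
# Sending it is one more call, left commented out because it needs a reachable node:
# urllib.request.urlopen(request, timeout=5)
```

If sending the request returns an HTTP 401 response, the credentials are not accepted by that node and would not work for monitoring either.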
Enabling Elasticsearch monitoring for individual hosts
Dynatrace also offers the option of enabling Elasticsearch monitoring for specific hosts rather than globally.
- If Elasticsearch monitoring is currently switched on, switch it off: go to Settings > Monitoring > Monitored technologies and set the Elasticsearch switch to the Off position.
- Go to Settings.
- Select Monitoring > Monitoring overview.
- Select the Hosts tab.
- Find the host on which you want to enable Elasticsearch monitoring and select Edit.
- The Monitored technologies section shows:
- A list of technologies that are currently being monitored globally. This list should not include Elasticsearch if you are configuring Elasticsearch monitoring for individual hosts.
- A table of technologies that you can enable on this host: Technology, Type, Monitoring, Edit.
- Find Elasticsearch in the Technology column and click in the Edit column to display Elasticsearch host-level configuration settings.
- User and Password are user credentials for Elasticsearch on this host. Leave them empty if no authentication is set up.
- URL is the Elasticsearch URL.
- Select Save to save any changes.
- Turn on the Elasticsearch switch to enable Elasticsearch monitoring for the selected host.
Viewing Elasticsearch monitoring insights
- Go to Technologies & Processes or Technologies & Processes Classic (latest Dynatrace).
- In the Technology overview section, select the Elasticsearch tile.
Individual Elasticsearch clusters are represented as process groups. All detected Elasticsearch process groups are listed in the table at the bottom of the page.
- To view metrics for a specific cluster, locate it in the table and select in the Details column to expand that row.
A chart shows the number of process group instances over the selected time range.
- To see details, select the Process group details button.
- On the Process group details page, in addition to system performance and networking metrics, you can select the Technology-specific metrics tab to display Elasticsearch cluster charts and metrics.
- Change the Show chart for selection to chart a different Elasticsearch cluster metric.
- All processes in the selected process group are listed at the bottom.
- Select a process to display the details page for the selected process. In addition to general process status information, it has two Elasticsearch-specific tabs: Elasticsearch metrics and Further details.
- Select the Elasticsearch metrics tab to display charts for Elasticsearch key metrics:
- Indexing (indexing total over the selected time range) shows the effectiveness of all indexing operations.
- Search (number of queries, fetches, and scrolls over the same time range) is an indicator of how efficient your search operations are. More operations in a shorter time interval indicate better performance.
- Select the Further details tab to display the Process metrics page, which charts essential Elasticsearch metrics. You can filter these charts by cluster and node.
- Breakers
Elasticsearch circuit breakers are thresholds used to prevent operations from causing OutOfMemoryError errors. Each breaker specifies a limit for how much memory it can use. If the estimated query size is larger than the limit, the circuit breaker is tripped, the query is aborted, and an exception is returned. This happens before data is loaded, which means that an OutOfMemoryException is avoided.
  - Limit size
  - Estimated size
  - Overhead
  - Tripped
- Indices
Shows additional in-depth information about Elasticsearch indices. Of particular interest is the Translog chart, which shows whether Elasticsearch is keeping up with the data coming in by flushing it out to the indices on disk.
- Merge
Can show the root cause of problems when a system is under too much load and merging can’t keep up.
- Search
Shows additional in-depth information around Elasticsearch search operations, with performance charts for queries, fetches, and scrolls.
- Thread pools
Shows details about how much load the system is currently processing. Enables you to see if you can increase the rate of queries or the amount of writes. Also enables you to see if there’s a bottleneck in one of the thread pools.
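The circuit breaker behavior described under Breakers can be illustrated with a small Python model. This is a simplified sketch, not Elasticsearch's implementation: the class and method names are hypothetical, and the fields are named after the four charted values (limit size, estimated size, overhead, tripped). It assumes that each estimate is multiplied by the overhead factor before it is compared with the limit.

```python
from dataclasses import dataclass


class CircuitBreakingError(Exception):
    """Raised when a request's estimated memory use would exceed the breaker limit."""


@dataclass
class Breaker:
    limit_size_in_bytes: int
    overhead: float = 1.0
    estimated_size_in_bytes: int = 0
    tripped: int = 0

    def reserve(self, request_bytes: int) -> None:
        """Account for a request before any data is loaded, or trip the breaker."""
        new_estimate = self.estimated_size_in_bytes + int(request_bytes * self.overhead)
        if new_estimate > self.limit_size_in_bytes:
            # The request is rejected up front, so no memory is allocated for it.
            self.tripped += 1
            raise CircuitBreakingError(
                f"estimated {new_estimate} bytes exceeds limit {self.limit_size_in_bytes}"
            )
        self.estimated_size_in_bytes = new_estimate

    def release(self, request_bytes: int) -> None:
        """Give back the memory accounted for by a finished request."""
        self.estimated_size_in_bytes -= int(request_bytes * self.overhead)


breaker = Breaker(limit_size_in_bytes=1000, overhead=1.5)
breaker.reserve(400)          # accepted: the estimate becomes 600 bytes
try:
    breaker.reserve(400)      # rejected: 1200 bytes would exceed the limit
except CircuitBreakingError as error:
    print("tripped:", error)
```

In this model, a rising Tripped count together with an Estimated size close to Limit size suggests that requests are regularly being aborted to protect the heap.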
Supported metrics
These tables list all supported Elasticsearch metrics. A full description of all Elasticsearch statistics is available at www.elastic.co. Most Elasticsearch metrics are taken directly from Elasticsearch statistics and presented as is, with no additional computation.
Process group metrics
| Process group metric | Description |
| --- | --- |
| status-green | Status green |
| status-yellow | Status yellow |
| status-red | Status red |
| status-unknown | Status unknown |
| number_of_nodes | Number of nodes |
| number_of_data_nodes | Number of data nodes |
| active_primary_shards | Active primary shards |
| active_shards | Active shards |
| relocating_shards | Relocating shards |
| initializing_shards | Initializing shards |
| unassigned_shards | Unassigned shards |
| delayed_unassigned_shards | Delayed unassigned shards |
| indices.count | Indices count |
| indices.shards.replication | Replica shards |
| indices.docs.count | Documents count |
| indices.docs.deleted | Deleted documents |
| indices.fielddata.memory_size_in_bytes | Field data size |
| indices.fielddata.evictions | Field data evictions |
| indices.query_cache.cache_size | Query cache size |
| indices.query_cache.cache_count | Query cache count |
| indices.query_cache.evictions | Query cache evictions |
| indices.segments.count | Segment count |
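Several of the process group metrics share their names with fields of the Elasticsearch cluster health response (_cluster/health). The Python sketch below shows one way such a response could be mapped onto those metric names. How Dynatrace actually derives the four status metrics is not documented here; treating them as a one-hot encoding of the status field is an assumption, and the function name and sample values are hypothetical.

```python
HEALTH_FIELDS = (
    "number_of_nodes",
    "number_of_data_nodes",
    "active_primary_shards",
    "active_shards",
    "relocating_shards",
    "initializing_shards",
    "unassigned_shards",
    "delayed_unassigned_shards",
)


def health_to_metrics(health: dict) -> dict:
    """Map a cluster health response onto process group metric names."""
    status = health.get("status", "unknown")
    if status not in ("green", "yellow", "red"):
        status = "unknown"

    # Assumption: exactly one of the four status metrics is 1 at any time.
    metrics = {
        f"status-{name}": int(name == status)
        for name in ("green", "yellow", "red", "unknown")
    }
    # Copy the numeric fields that appear under the same name in the table.
    for field in HEALTH_FIELDS:
        if field in health:
            metrics[field] = health[field]
    return metrics


# Made-up sample response for illustration.
sample = {"status": "yellow", "number_of_nodes": 3, "unassigned_shards": 2}
print(health_to_metrics(sample))
```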
Instance metrics
| Instance metric | Description |
| --- | --- |
| node.indices.store.size_in_bytes | Store size |
| node.indices.store.throttle_time_in_millis | Store throttle time |
| node.indices.indexing.throttle_time_in_millis | Indexing throttle time |
| node.indices.indexing.index_time_in_millis | Indexing time |
| node.indices.indexing.index_total | Indexing total |
| node.indices.indexing.delete_total | Indexing delete |
| node.indices.indexing.index_failed | Indexing failed |
| node.indices.indexing.noop_update_total | Indexing noop update total |
| node.indices.search.query_time_in_millis | Query time |
| node.indices.search.query_total | Number of queries |
| node.indices.search.fetch_total | Number of fetches |
| node.indices.search.fetch_time_in_millis | Fetch time |
| node.indices.search.scroll_time_in_millis | Scroll time |
| node.indices.search.scroll_total | Number of scrolls |
| node.indices.search.local_total_time_in_millis | Total search time |
| node.indices.merges.total | Merge total |
| node.indices.merges.total_time_in_millis | Merge total time |
| node.indices.merges.total_docs | Merge total documents |
| node.indices.merges.total_size_in_bytes | Merge total size |
| node.indices.merges.total_stopped_time_in_millis | Merge stopped time |
| node.indices.merges.total_throttled_time_in_millis | Merge throttled time |
| node.indices.merges.total_auto_throttle_in_bytes | Merge auto throttle size |
| node.indices.refresh.total | Indices refresh total |
| node.indices.refresh.total_time_in_millis | Indices refresh time |
| node.indices.flush.total | Indices flush total |
| node.indices.flush.total_time_in_millis | Indices flush time |
| node.indices.warmer.total | Indices warmer total |
| node.indices.warmer.total_time_in_millis | Indices warmer time |
| node.indices.translog.operations | Indices translog operations |
| node.indices.translog.size_in_bytes | Indices translog size |
| node.indices.suggest.total | Indices suggest total |
| node.indices.suggest.time_in_millis | Indices suggest time |
| node.indices.request_cache.memory_size_in_bytes | Indices request cache size |
| node.indices.request_cache.evictions | Indices request cache evictions |
| node.indices.request_cache.hit_count | Indices request cache hit count |
| node.indices.request_cache.miss_count | Indices request cache miss count |
| node.indices.recovery.current_as_source | Indices recovery current as source |
| node.indices.recovery.current_as_target | Indices recovery current as target |
| node.indices.recovery.throttle_time_in_millis | Indices recovery throttle time |
| node.breakers.request.limit_size_in_bytes | Breakers request limit size |
| node.breakers.request.estimated_size_in_bytes | Breakers request estimated size |
| node.breakers.request.overhead | Breakers request overhead |
| node.breakers.request.tripped | Breakers request tripped |
| node.breakers.fielddata.limit_size_in_bytes | Breakers field data limit size |
| node.breakers.fielddata.estimated_size_in_bytes | Breakers field data estimated size |
| node.breakers.fielddata.overhead | Breakers field data overhead |
| node.breakers.fielddata.tripped | Breakers field data tripped |
| node.breakers.parent.limit_size_in_bytes | Breakers parent data limit size |
| node.breakers.parent.estimated_size_in_bytes | Breakers parent data estimated size |
| node.breakers.parent.overhead | Breakers parent data overhead |
| node.breakers.parent.tripped | Breakers parent data tripped |
| node.thread_pool.percolate.queue | Thread pools percolate queue |
| node.thread_pool.percolate.completed | Thread pools percolate completed |
| node.thread_pool.percolate.threads | Thread pools percolate threads |
| node.thread_pool.percolate.rejected | Thread pools percolate rejected |
| node.thread_pool.listener.queue | Thread pools listener queue |
| node.thread_pool.listener.completed | Thread pools listener completed |
| node.thread_pool.listener.threads | Thread pools listener threads |
| node.thread_pool.listener.rejected | Thread pools listener rejected |
| node.thread_pool.search.queue | Thread pools search queue |
| node.thread_pool.search.completed | Thread pools search completed |
| node.thread_pool.search.threads | Thread pools search threads |
| node.thread_pool.search.rejected | Thread pools search rejected |
| node.thread_pool.get.queue | Thread pools get queue |
| node.thread_pool.get.completed | Thread pools get completed |
| node.thread_pool.get.threads | Thread pools get threads |
| node.thread_pool.get.rejected | Thread pools get rejected |
| node.thread_pool.bulk.queue | Thread pools bulk queue |
| node.thread_pool.bulk.completed | Thread pools bulk completed |
| node.thread_pool.bulk.threads | Thread pools bulk threads |
| node.thread_pool.bulk.rejected | Thread pools bulk rejected |
| node.thread_pool.index.queue | Thread pools index queue |
| node.thread_pool.index.completed | Thread pools index completed |
| node.thread_pool.index.threads | Thread pools index threads |
| node.thread_pool.index.rejected | Thread pools index rejected |
| node.thread_pool.force_merge.queue | Thread pools force merge queue |
| node.thread_pool.force_merge.completed | Thread pools force merge completed |
| node.thread_pool.force_merge.threads | Thread pools force merge threads |
| node.thread_pool.force_merge.rejected | Thread pools force merge rejected |
| node.thread_pool.analyze.queue | Thread pools analyze queue |
| node.thread_pool.analyze.completed | Thread pools analyze completed |
| node.thread_pool.analyze.threads | Thread pools analyze threads |
| node.thread_pool.analyze.rejected | Thread pools analyze rejected |
| node.thread_pool.refresh.queue | Thread pools refresh queue |
| node.thread_pool.refresh.completed | Thread pools refresh completed |
| node.thread_pool.refresh.threads | Thread pools refresh threads |
| node.thread_pool.refresh.rejected | Thread pools refresh rejected |
| node.thread_pool.generic.queue | Thread pools generic queue |
| node.thread_pool.generic.completed | Thread pools generic completed |
| node.thread_pool.generic.threads | Thread pools generic threads |
| node.thread_pool.generic.rejected | Thread pools generic rejected |
| node.thread_pool.flush.queue | Thread pools flush queue |
| node.thread_pool.flush.completed | Thread pools flush completed |
| node.thread_pool.flush.threads | Thread pools flush threads |
| node.thread_pool.flush.rejected | Thread pools flush rejected |
| node.thread_pool.write.queue | Thread pools write queue |
| node.thread_pool.write.completed | Thread pools write completed |
| node.thread_pool.write.threads | Thread pools write threads |
| node.thread_pool.write.rejected | Thread pools write rejected |
| node.thread_pool.snapshot.queue | Thread pools snapshot queue |
| node.thread_pool.snapshot.completed | Thread pools snapshot completed |
| node.thread_pool.snapshot.threads | Thread pools snapshot threads |
| node.thread_pool.snapshot.rejected | Thread pools snapshot rejected |
| node.thread_pool.ccr.queue | Thread pools ccr queue |
| node.thread_pool.ccr.completed | Thread pools ccr completed |
| node.thread_pool.ccr.threads | Thread pools ccr threads |
| node.thread_pool.ccr.rejected | Thread pools ccr rejected |
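Many of the instance metrics, such as query_total and query_time_in_millis, are reported by Elasticsearch node statistics as running totals since the node started, so a single reading says little on its own. The Python sketch below shows one common way to turn two successive readings into an average latency per operation. The function name and the sample values are hypothetical, and this is not necessarily how Dynatrace charts these metrics.

```python
from typing import Optional


def average_latency_ms(previous: dict, current: dict,
                       count_key: str, time_key: str) -> Optional[float]:
    """Average time per operation between two samples of cumulative counters.

    Returns None when no operations completed between the samples, or when the
    counters went backwards (for example, after a node restart).
    """
    operations = current[count_key] - previous[count_key]
    if operations <= 0:
        return None
    return (current[time_key] - previous[time_key]) / operations


# Made-up readings of the search section of node statistics, taken a minute apart.
earlier = {"query_total": 100, "query_time_in_millis": 500}
later = {"query_total": 150, "query_time_in_millis": 800}
print(average_latency_ms(earlier, later, "query_total", "query_time_in_millis"))
```

The same calculation applies to the other total and time pairs in the table, such as fetch_total with fetch_time_in_millis or index_total with index_time_in_millis.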