Cisco SD-WAN extension

Monitor and analyze your Cisco SD-WAN (formerly Viptela) infrastructure performance with real-time insights into devices, sites, and network health.

Get started

Overview

The Cisco SD-WAN extension uses the Cisco Catalyst SD-WAN Manager API to collect performance metrics for a Cisco SD-WAN fabric managed by one or multiple Cisco SD-WAN Managers (also known as vManage).

Use cases

  • Monitor your entire network infrastructure with Dynatrace
  • Gain insights into your Cisco Catalyst SD-WAN infrastructure
  • Extend Dynatrace visibility into Cisco SD-WAN fabric performance

Requirements

  • You need access to the Cisco SD-WAN Manager (vManage) API.
  • You need network connectivity between the ActiveGate where you deploy the extension and the vManage systems (vManage virtual IP (VIP) or individual vManage nodes).

Compatibility information

Cisco vManage API version 20.15 and above (tested on 20.15; Cisco documented no breaking changes in the used APIs up to version 20.18).

This integration was implemented and tested on vManage version 20.15. You can expect it to be compatible with version 20.15 and later. Compatibility with earlier versions is not guaranteed, although the extension may function with some earlier versions.

Activation and setup

Cisco SD-WAN Manager configuration

For the Dynatrace integration, we recommend creating a dedicated vManage user with read-only privileges.

  1. (Optional) Create a dedicated role (for example, dynatrace-readonly) with read-only privileges.
  2. Create a dedicated user (for example, dynatrace-api) and assign it the dynatrace-readonly role.

Extension activation

To activate remote monitoring of your SD-WAN fabric:

  1. Activate the extension in the Dynatrace Hub.

    Go to Hub, find the Cisco SD-WAN Extension, and select Add to environment.

  2. Add a new monitoring configuration.

    You can configure the extension to use the vManage virtual IP (VIP) or individual vManage node endpoints that are part of the same logical Cisco SD-WAN Manager (vManage) cluster and manage the same SD-WAN fabric.

Details

The extension obtains metrics from the vManage Statistics and Device State APIs. It authenticates with vManage using session-based authentication.
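As an illustration of session-based authentication, the following minimal sketch builds the two requests of the session flow that Cisco documents for the SD-WAN Manager API: a form-encoded login to `/j_security_check`, followed by a token request to `/dataservice/client/token`. This is not the extension's own code. The helper names, host name, and credentials are hypothetical, and the sketch only constructs the requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(base_url: str, username: str, password: str) -> Request:
    """Form-encoded login POST; a successful login sets a JSESSIONID cookie."""
    body = urlencode({"j_username": username, "j_password": password}).encode()
    return Request(
        f"{base_url.rstrip('/')}/j_security_check",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_token_request(base_url: str, jsessionid: str) -> Request:
    """GET request for the XSRF token that accompanies later API calls."""
    return Request(
        f"{base_url.rstrip('/')}/dataservice/client/token",
        headers={"Cookie": f"JSESSIONID={jsessionid}"},
        method="GET",
    )


def api_headers(jsessionid: str, xsrf_token: str) -> dict:
    """Headers for subsequent API calls within the authenticated session."""
    return {
        "Cookie": f"JSESSIONID={jsessionid}",
        "X-XSRF-TOKEN": xsrf_token,
        "Content-Type": "application/json",
    }


# Hypothetical host and credentials, for illustration only.
login = build_login_request("https://vmanage.example.com", "dynatrace-api", "secret")
print(login.get_method(), login.full_url)
```

In a real client, you would send these requests with a cookie-aware HTTP client and verify the server certificate.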

Extension package contents

  • Extension code (Python) that connects to the vManage API endpoints to collect device state and statistics data
  • Dashboards offering high-level Cisco SD-WAN infrastructure overview
  • Unified Analysis pages for Cisco SD-WAN Infrastructure, sites, devices, device interfaces, and transport locators
  • Analysis screens integrated with the Infrastructure & Operations app available on the Dynatrace platform

Summary of the entities monitored

  • SD-WAN Infrastructure
  • SD-WAN Sites
  • Devices – based on device state (health, QoE) and statistics data
  • Device Interfaces – based on the Interface statistics
  • Transport Locators (TLOC) – based on the Application-Aware Routing statistics aggregated by Local System IP, Color and Encapsulation Protocol

Extension metrics grouping (Feature sets)

The following feature sets categorize various metrics collected by the Cisco SD-WAN extension:

| Feature set | Description |
| --- | --- |
| default | Provides generic network interface and device-specific metrics used in Infrastructure & Operations. |
| advanced_interface | (Optional) Provides additional generic network interface metrics. |
| fabric_infra | Covers Infrastructure, Sites, and Devices health summary and overall metrics and Inventory state metrics. These metrics correspond to data visualized in the vManage Overview Dashboard. |
| device_health | Captures device health and basic performance metrics (health, QoE, CPU load, memory utilization, reachability) for individual devices. |
| device_statistics | Covers detailed device CPU, memory, processes, and disk metrics (typically reported by vManage API with 1-minute aggregation). |
| interface_statistics | Captures interface metrics (typically reported by vManage API with 5-minute aggregation). |
| approute_statistics | Captures TLOC metrics (typically reported by vManage API with 10-minute aggregation). |
| self-monitoring | (Optional) Metrics for fine-tuning and diagnostics of the extension itself. |

Extension metrics naming conventions

Generic network interface and device metrics (prefixed with com.dynatrace.extension.network_device) follow naming conventions used in Infrastructure & Operations.

Extension-specific metric names (prefixed with cisco.sdwan) and their dimension names follow naming conventions used by the vManage API. If you're familiar with the vManage API model, you'll recognize the property names used for these metrics and dimensions.

Licensing and cost

There is no charge for obtaining the extension, only for the data that the extension ingests. The details of license consumption depend on which licensing model you are using: either Dynatrace classic licensing or the Dynatrace Platform Subscription (DPS) model.

Metrics

License consumption is based on the number of metric data points ingested.

The following formula approximates the number of data points ingested per year, assuming that all feature sets are enabled and that the default data collection interval and default statistics aggregation times are used.

(
5 (featureSet: fabric_infra)
+ 4 * <Number of Sites> (featureSet: fabric_infra)
+ 5 * <Number of Devices> (featureSet: device_health)
+ 24 // Optional - Self monitoring metrics (featureSet: self-monitoring, ignoring error counters not reported on normal operation)
) * 60 minutes * 24 hours * 365 days / <collection-interval-default-5min> data points per year
+
(
11 * <Number of Devices> (featureSet: device_statistics)
) * 60 minutes * 24 hours * 365 days / <statistics-aggregation-default-1min> data points per year
+
(
16 * <Number of Interfaces> (featureSet: interface_statistics)
) * 60 minutes * 24 hours * 365 days / <statistics-aggregation-default-5min> data points per year
+
(
7 * <Number of TLOCs> (featureSet: approute_statistics)
) * 60 minutes * 24 hours * 365 days / <statistics-aggregation-default-10min> data points per year
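The formula above can be sketched as a small Python function. The per-entity metric counts and default intervals come from the formula; the function name, parameter names, and the example fabric size are illustrative assumptions.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600


def annual_data_points(sites, devices, interfaces, tlocs,
                       self_monitoring=True,
                       collection_interval_min=5,
                       device_stats_aggregation_min=1,
                       interface_stats_aggregation_min=5,
                       approute_stats_aggregation_min=10):
    """Approximate annual metric data points with all feature sets enabled."""
    # fabric_infra + device_health (+ optional self-monitoring), once per collection cycle
    per_cycle = 5 + 4 * sites + 5 * devices + (24 if self_monitoring else 0)
    total = per_cycle * MINUTES_PER_YEAR / collection_interval_min
    # device_statistics: 11 metrics per device
    total += 11 * devices * MINUTES_PER_YEAR / device_stats_aggregation_min
    # interface_statistics: 16 metrics per interface
    total += 16 * interfaces * MINUTES_PER_YEAR / interface_stats_aggregation_min
    # approute_statistics: 7 metrics per TLOC
    total += 7 * tlocs * MINUTES_PER_YEAR / approute_stats_aggregation_min
    return total


# Hypothetical fabric: 10 sites, 20 devices, 100 interfaces, 40 TLOCs.
example = annual_data_points(sites=10, devices=20, interfaces=100, tlocs=40)
print(f"{example:,.0f}")  # 316,306,080
```

In this example, device_statistics and interface_statistics dominate the total, so disabling feature sets you don't need has the largest effect on ingestion.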

Classic licensing

In the Dynatrace classic licensing model, metric ingestion consumes Davis Data Units (DDUs) at the rate of 0.001 DDUs per metric data point. Multiply the annual data points from the formula above by 0.001 to estimate annual DDU usage by metrics.
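As a quick worked example of this conversion, the sketch below applies the 0.001 rate to a data point count. The function name and the input of 100 million data points are illustrative assumptions.

```python
DDU_PER_DATA_POINT = 0.001  # classic licensing rate for metric data points


def annual_ddus(annual_data_points: float) -> float:
    """Estimate annual DDU consumption from annual metric data points."""
    return annual_data_points * DDU_PER_DATA_POINT


# 100 million data points per year correspond to roughly 100,000 DDUs.
print(round(annual_ddus(100_000_000)))
```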

FAQ

Do I need to enable all feature sets?

No, not all feature sets are required.

  • self-monitoring is optional and is only needed for fine-tuning and diagnostics of the extension itself.
  • If you want to view PLATFORM screens (generic network device and interface metrics) within Infrastructure & Operations, only the default feature set is needed.
  • If you want to view both CLASSIC screens and all PLATFORM screens (generic network device and interface metrics and extension-specific metrics), enable the fabric_infra, device_health, device_statistics, interface_statistics, and approute_statistics feature sets, depending on your needs.
  • advanced_interface is optional (prepared for future use); its metrics are not currently available in Infrastructure & Operations or in the Unified Analysis pages.

How is site, device, tunnel, and application health represented?

The vManage API returns health status for sites, devices, tunnels, and applications as textual values.

  • Site, site devices, tunnels, and applications health: good, fair, poor
  • Device health: green, yellow, red
  • Device reachability: reachable, unreachable

The extension reports health as numeric metric values representing the status returned by the vManage API, using the following mappings.

Site, Site Devices, Site Tunnels, and Applications Health

| Health Status | Metric Value |
| --- | --- |
| good | 1 |
| fair | 3 |
| poor | 5 |

Applied to:

  • cisco.sdwan.site.site_health
  • cisco.sdwan.site.devices_health
  • cisco.sdwan.site.tunnels_health
  • cisco.sdwan.site.apps_health

Device Health

| Health Status | Metric Value |
| --- | --- |
| green | 1 |
| yellow | 3 |
| red | 5 |

Applied to:

  • cisco.sdwan.device.health

Device Reachability

| Reachability Status | Metric Value |
| --- | --- |
| reachable | 1 |
| unreachable | 0 |

Applied to:

  • cisco.sdwan.device.reachability
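
The mappings above can be expressed as simple lookup tables. The following is a minimal sketch, not the extension's actual code: the constant and function names are hypothetical, and returning None for an unrecognized status is an assumption, because the handling of unknown values is not described here.

```python
# Textual status returned by the vManage API -> numeric metric value
SITE_HEALTH_CODES = {"good": 1, "fair": 3, "poor": 5}
DEVICE_HEALTH_CODES = {"green": 1, "yellow": 3, "red": 5}
REACHABILITY_CODES = {"reachable": 1, "unreachable": 0}


def to_metric_value(status, mapping):
    """Map a textual status to its numeric code.

    Returns None for a missing or unrecognized status (an assumption).
    """
    if status is None:
        return None
    return mapping.get(status.strip().lower())


print(to_metric_value("fair", SITE_HEALTH_CODES))           # 3
print(to_metric_value("red", DEVICE_HEALTH_CODES))          # 5
print(to_metric_value("unreachable", REACHABILITY_CODES))   # 0
```

Because higher values indicate worse health for the site and device health metrics, a threshold such as "value greater than 1" flags anything that is not fully healthy. Reachability is the exception: 0 means unreachable.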

How are Transport Locator (TLOC) metrics calculated?

The extension reports TLOC metrics by aggregating data retrieved from the vManage API. Average latency, loss, and jitter values are calculated using the vManage API query language and are aggregated by the following TLOC attributes:

  • local system IP (local_system_ip)
  • local color (local_color)
  • encapsulation (proto)

The metrics are queried from Application-Aware Routing (App-Route) statistics. Aggregation is performed across all tunnels associated with the same local TLOC.

The number of tunnels for a TLOC is determined by evaluating the cardinality of remote system IPs associated with the aggregated App-Route statistics.
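To illustrate the grouping logic, here is a client-side sketch of an equivalent aggregation. Note that the extension performs this aggregation through the vManage API query language, not in Python. The row layout and the remote_system_ip, latency, loss_percentage, and jitter field names are assumptions made for this example, as are the sample values.

```python
from collections import defaultdict
from statistics import mean


def aggregate_tlocs(tunnel_rows):
    """Group per-tunnel App-Route rows by local TLOC and aggregate them."""
    groups = defaultdict(list)
    for row in tunnel_rows:
        key = (row["local_system_ip"], row["local_color"], row["proto"])
        groups[key].append(row)

    result = {}
    for key, rows in groups.items():
        result[key] = {
            "latency": mean(r["latency"] for r in rows),
            "loss_percentage": mean(r["loss_percentage"] for r in rows),
            "jitter": mean(r["jitter"] for r in rows),
            # Tunnel count = number of distinct remote system IPs
            "tunnel_count": len({r["remote_system_ip"] for r in rows}),
        }
    return result


# Hypothetical sample rows: two tunnels on one TLOC, one tunnel on another.
rows = [
    {"local_system_ip": "10.0.0.1", "local_color": "mpls", "proto": "ipsec",
     "remote_system_ip": "10.0.0.2", "latency": 10.0, "loss_percentage": 0.0, "jitter": 1.0},
    {"local_system_ip": "10.0.0.1", "local_color": "mpls", "proto": "ipsec",
     "remote_system_ip": "10.0.0.3", "latency": 20.0, "loss_percentage": 1.0, "jitter": 3.0},
    {"local_system_ip": "10.0.0.1", "local_color": "biz-internet", "proto": "ipsec",
     "remote_system_ip": "10.0.0.2", "latency": 30.0, "loss_percentage": 0.0, "jitter": 2.0},
]
tlocs = aggregate_tlocs(rows)
print(tlocs[("10.0.0.1", "mpls", "ipsec")])
```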

How does the extension handle API rate limiting on vManage?

  • The extension does not use the Real-Time Monitoring APIs, as these are highly CPU-intensive and subject to strict internal throttling.
  • The extension does not use the Bulk Statistics APIs, which are intended for large dataset retrieval and are limited to 48 bulk requests per minute and 2 concurrent bulk statistics requests.

The extension primarily uses Cisco SD-WAN Statistics / Monitoring APIs (non-bulk) to collect operational data. According to Cisco SD-WAN documentation, these APIs are subject to a general rate limit of 100 requests per second.

To ensure stable and predictable operation, the extension applies its own internal API rate limiting, which you can configure by setting API rate limit per minute per one SD-WAN Manager Connection in the extension settings. The default rate limit is 16 requests per minute, with a maximum supported value of 48 requests per minute.

The maximum of 48 requests per minute is deliberately conservative, so that the extension does not overload your monitored SD-WAN infrastructure with REST API calls.

Data is periodically collected from the Cisco SD-WAN Manager at a configurable Metrics Collection Frequency, which is 5 minutes by default. During each collection cycle, API requests are evenly distributed over time to stay within the configured rate limit.
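One possible way to pace requests within a collection cycle is sketched below. It is an interpretation of the behavior described above, not the extension's actual scheduler: requests are spread evenly over the cycle, but never closer together than the configured rate limit allows. The function name and the lower bound of 1 request per minute are assumptions; the defaults (16 requests per minute, 5-minute cycle) and the maximum of 48 come from the text.

```python
def plan_cycle(request_count, rate_limit_per_minute=16, cycle_minutes=5):
    """Return start offsets (seconds from cycle start) for each API request."""
    if not 1 <= rate_limit_per_minute <= 48:
        raise ValueError("rate limit must be between 1 and 48 requests per minute")
    if request_count <= 0:
        return []
    min_spacing = 60.0 / rate_limit_per_minute           # 3.75 s at the default rate
    even_spacing = cycle_minutes * 60.0 / request_count  # spread over the whole cycle
    spacing = max(min_spacing, even_spacing)             # never exceed the rate limit
    return [i * spacing for i in range(request_count)]


offsets = plan_cycle(request_count=20)
print(offsets[:3])  # [0.0, 15.0, 30.0]
```

With the defaults, a cycle has room for 16 × 5 = 80 requests. In this sketch, a plan with more requests than that keeps the 3.75-second spacing and runs past the end of the cycle.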

For a single Cisco SD-WAN deployment, API rate limits apply to the SD-WAN fabric as a whole, not to individual vManage nodes. As a result, you should treat API rate limits as fabric-wide limits.

You can configure the extension to use the vManage virtual IP (VIP) or load-balanced endpoint, which is the recommended access method for high-availability deployments. This approach relies on vManage's internal load-balancing and failover mechanisms.

Alternatively, you can configure multiple vManage node endpoints, provided they belong to the same vManage cluster and manage the same SD-WAN fabric. In this configuration, all endpoints return identical fabric-wide data, and the extension automatically distributes API requests evenly across the configured nodes.

The internal API rate limit configured in the extension applies globally across all configured vManage endpoints. For example, if you configure three vManage node endpoints and set the rate limit to 16 requests per minute, the extension distributes API calls across the nodes, resulting in approximately 5 requests per minute per node while maintaining the overall fabric-wide rate limit.

If the extension detects that one of the configured vManage nodes is not reachable, it automatically stops using that endpoint and continues data collection using the remaining nodes without increasing the per-node request rate. Once the unavailable node becomes reachable again, the extension automatically resumes using all configured nodes for data collection.
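The distribution and failover behavior described above can be sketched as follows. This is one reading of the description, not the extension's code: each configured node gets an equal share of the global limit, and when a node is unreachable the remaining nodes keep their original share, so the overall request rate drops. The function names and node names are hypothetical.

```python
def node_budgets(total_rate_per_minute, configured_nodes, unreachable=()):
    """Per-node request budget (requests per minute) for reachable nodes.

    The share is computed from ALL configured nodes, so losing a node
    does not raise the rate on the remaining ones.
    """
    share = total_rate_per_minute / len(configured_nodes)
    return {node: share for node in configured_nodes if node not in unreachable}


def assign_requests(paths, nodes):
    """Distribute API request paths across nodes in round-robin order."""
    return [(nodes[i % len(nodes)], path) for i, path in enumerate(paths)]


nodes = ["vmanage-1", "vmanage-2", "vmanage-3"]

healthy = node_budgets(16, nodes)
print({n: round(r, 2) for n, r in healthy.items()})   # about 5.33 per node

degraded = node_budgets(16, nodes, unreachable={"vmanage-2"})
print(round(sum(degraded.values()), 2))                # about 10.67 overall

print(assign_requests(["a", "b", "c", "d"], list(degraded)))
```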

Feature sets

When activating your extension using monitoring configuration, you can limit monitoring to one of the feature sets. To work properly, the extension has to collect at least one metric after the activation.

In highly segmented networks, feature sets can reflect the segments of your environment. Then, when you create a monitoring configuration, you can select a feature set and a corresponding ActiveGate group that can connect to this particular segment.

All metrics that aren't categorized into any feature set are considered to be the default and are always reported.

A metric inherits the feature set of a subgroup, which in turn inherits the feature set of a group. Also, the feature set defined on the metric level overrides the feature set defined on the subgroup level, which in turn overrides the feature set defined on the group level.
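The inheritance and override rules can be summarized in a few lines of Python. This is a minimal sketch of the precedence only; the function and parameter names are hypothetical and do not reflect how Dynatrace evaluates extension definitions internally.

```python
def resolve_feature_set(metric_fs=None, subgroup_fs=None, group_fs=None):
    """Return the effective feature set for a metric.

    Precedence: metric level, then subgroup level, then group level.
    A metric with no feature set at any level belongs to "default".
    """
    for candidate in (metric_fs, subgroup_fs, group_fs):
        if candidate is not None:
            return candidate
    return "default"


print(resolve_feature_set(group_fs="device_health"))                  # device_health
print(resolve_feature_set(metric_fs="self-monitoring",
                          group_fs="device_health"))                  # self-monitoring
print(resolve_feature_set())                                          # default
```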

fabricInfra
| Metric name | Metric key | Description |
| --- | --- | --- |
| Sites Health | cisco.sdwan.fabric_infra.site_health_count.gauge | Number of Sites per Health Code |
| Devices Health | cisco.sdwan.fabric_infra.device_health_count.gauge | Number of Devices per Health Code |
| Hardware Status | cisco.sdwan.fabric_infra.hardware_summary_count.gauge | Number of Devices per Hardware Status Code |
| Inventory Status | cisco.sdwan.fabric_infra.inventory_summary_count.gauge | Number of Devices per Inventory Status Code |
| Tunnels Health | cisco.sdwan.fabric_infra.tunnel_health_count.gauge | Number of Tunnels per Health Code |
| Site Health | cisco.sdwan.site.site_health | Site Health Code Value (1 - Good, 3 - Fair, 5 - Poor) |
| Site's Devices Health | cisco.sdwan.site.devices_health | Site's Devices Health Code Value (1 - Good, 3 - Fair, 5 - Poor) |
| Site's Tunnels Health | cisco.sdwan.site.tunnels_health | Site's Tunnels Health Code Value (1 - Good, 3 - Fair, 5 - Poor) |
| Site's Apps Usage | cisco.sdwan.site.apps_usage | Site's Apps Usage |

deviceStatistics
| Metric name | Metric key | Description |
| --- | --- | --- |
| CPU System | cisco.sdwan.device.cpu_system | SD-WAN Device CPU System Utilization |
| CPU User | cisco.sdwan.device.cpu_user | SD-WAN Device CPU User Utilization |
| CPU Idle | cisco.sdwan.device.cpu_idle | SD-WAN Device CPU Idle |
| Memory Used | cisco.sdwan.device.mem_used | SD-WAN Device Memory Used |
| Memory Free | cisco.sdwan.device.mem_free | SD-WAN Device Memory Free |
| Memory Buffers | cisco.sdwan.device.mem_buffers | SD-WAN Device Memory Buffers |
| Memory Cached | cisco.sdwan.device.mem_cached | SD-WAN Device Memory Cached |
| Processes Total | cisco.sdwan.device.totalp | SD-WAN Device Total Processes |
| Processes Running | cisco.sdwan.device.runningp | SD-WAN Device Running Processes |
| Disk Used | cisco.sdwan.device.disk_used | SD-WAN Device Disk Used |
| Disk Available | cisco.sdwan.device.disk_avail | SD-WAN Device Disk Available |

self-monitoring
| Metric name | Metric key | Description |
| --- | --- | --- |
| Total duration of all rate-limited executed queries to all connection urls to collect data reported as metrics | sfm.cisco.sdwan.monitor.run.total.duration | |
| Total duration of all rate-limited synchronously executed queries to all connection urls to collect data reported as metrics | sfm.cisco.sdwan.monitor.run.sync.duration | |
| Metrics Data Collection Overall Asynchronous Time | sfm.cisco.sdwan.monitor.run.async.duration | Total duration of all rate-limited asynchronously executed queries to all connection urls to collect data reported as metrics |
| Metrics Data Collection Error | sfm.cisco.sdwan.monitor.run.error | Indicates whether an unhandled error occurred during execution of Metrics Data Collection. |
| get_site_health_stats Query Time | sfm.cisco.sdwan.get_site_health_stats.duration | |
| get_site_health_stats Error | sfm.cisco.sdwan.get_site_health_stats.error | Indicates whether an error occurred during get_site_health_stats Query |
| report_site_health_stats Overall Time | sfm.cisco.sdwan.report_site_health_stats.duration | |
| report_site_health_stats Error | sfm.cisco.sdwan.report_site_health_stats.error | Indicates whether an error occurred during report_site_health_stats |
| get_vedge_inventory_summary Query Time | sfm.cisco.sdwan.get_vedge_inventory_summary.duration | |
| get_vedge_inventory_summary Error | sfm.cisco.sdwan.get_vedge_inventory_summary.error | Indicates whether an error occurred during get_vedge_inventory_summary Query |
| report_vedge_inventory_summary Overall Time | sfm.cisco.sdwan.report_vedge_inventory_summary.duration | |
| report_vedge_inventory_summary Error | sfm.cisco.sdwan.report_vedge_inventory_summary.error | Indicates whether an error occurred during report_vedge_inventory_summary |
| get_device_health_overview_stats Query Time | sfm.cisco.sdwan.get_device_health_overview_stats.duration | |
| get_device_health_overview_stats Error | sfm.cisco.sdwan.get_device_health_overview_stats.error | Indicates whether an error occurred during get_device_health_overview_stats Query |
| report_device_health_overview Overall Time | sfm.cisco.sdwan.report_device_health_overview.duration | |
| report_device_health_overview Error | sfm.cisco.sdwan.report_device_health_overview.error | Indicates whether an error occurred during report_device_health_overview |
| get_hardware_health_summary Query Time | sfm.cisco.sdwan.get_hardware_health_summary.duration | |
| get_hardware_health_summary Error | sfm.cisco.sdwan.get_hardware_health_summary.error | Indicates whether an error occurred during get_hardware_health_summary Query |
| report_hardware_health_summary Overall Time | sfm.cisco.sdwan.report_hardware_health_summary.duration | |
| report_hardware_health_summary Error | sfm.cisco.sdwan.report_hardware_health_summary.error | Indicates whether an error occurred during report_hardware_health_summary |
| get_remote_system_cardinality Query Time | sfm.cisco.sdwan.get_remote_system_cardinality.duration | |
| get_remote_system_cardinality Error | sfm.cisco.sdwan.get_remote_system_cardinality.error | Indicates whether an error occurred during get_approute_statistics_remote_system_cardinality Query |
| report_remote_system_cardinality Overall Time | sfm.cisco.sdwan.report_remote_system_cardinality.duration | |
| report_remote_system_cardinality Error | sfm.cisco.sdwan.report_remote_system_cardinality.error | Indicates whether an error occurred during report_remote_system_cardinality |
| get_tunnel_health_overview Query Time | sfm.cisco.sdwan.get_tunnel_health_overview.duration | |
| get_tunnel_health_overview Error | sfm.cisco.sdwan.get_tunnel_health_overview.error | Indicates whether an error occurred during get_tunnel_health_overview Query |
| report_tunnel_health_overview Overall Time | sfm.cisco.sdwan.report_tunnel_health_overview.duration | |
| report_tunnel_health_overview Error | sfm.cisco.sdwan.report_tunnel_health_overview.error | Indicates whether an error occurred during report_tunnel_health_overview |
| get_approute_statistics_aggregation Query Time | sfm.cisco.sdwan.get_approute_statistics_aggregation.duration | |
| get_approute_statistics_aggregation Error | sfm.cisco.sdwan.get_approute_statistics_aggregation.error | Indicates whether an error occurred during get_approute_statistics_aggregation Query |
| report_statistics_approute Overall Time | sfm.cisco.sdwan.report_statistics_approute.duration | |
| report_statistics_approute Error | sfm.cisco.sdwan.report_statistics_approute.error | Indicates whether an error occurred during report_statistics_approute |
| get_devices_health Query Time | sfm.cisco.sdwan.get_devices_health.duration | |
| get_devices_health Error | sfm.cisco.sdwan.get_devices_health.error | Indicates whether an error occurred during get_devices_health Query |
| report_devices_health Overall Time | sfm.cisco.sdwan.report_devices_health.duration | |
| report_devices_health Error | sfm.cisco.sdwan.report_devices_health.error | Indicates whether an error occurred during report_devices_health |
| get_statistics_system_aggregation Query Time | sfm.cisco.sdwan.get_statistics_system_aggregation.duration | |
| get_statistics_system_aggregation Error | sfm.cisco.sdwan.get_statistics_system_aggregation.error | Indicates whether an error occurred during get_statistics_system_aggregation Query |
| get_interface_statistics Query Time | sfm.cisco.sdwan.get_interface_statistics.duration | |
| get_interface_statistics Error | sfm.cisco.sdwan.get_interface_statistics.error | Indicates whether an error occurred during get_interface_statistics Query |
| report_statistics_system Overall Time | sfm.cisco.sdwan.report_statistics_system.duration | |
| report_statistics_system Error | sfm.cisco.sdwan.report_statistics_system.error | Indicates whether an error occurred during report_statistics_system |
| report_statistics_interface Overall Time | sfm.cisco.sdwan.report_statistics_interface.duration | |
| report_statistics_interface Error | sfm.cisco.sdwan.report_statistics_interface.error | Indicates whether an error occurred during report_statistics_interface |

advancedInterface
| Metric name | Metric key | Description |
| --- | --- | --- |
| | com.dynatrace.extension.network_device.if.in.pkts.count | |
| | com.dynatrace.extension.network_device.if.out.pkts.count | |

deviceHealth
| Metric name | Metric key | Description |
| --- | --- | --- |
| CPU Load | cisco.sdwan.device.cpu_load | SD-WAN Device CPU Load |
| Memory Utilization | cisco.sdwan.device.memory_utilization | SD-WAN Device Memory Utilization |
| Device Health | cisco.sdwan.device.health | SD-WAN Device Health Code Value (1 - green, 3 - yellow, 5 - red) |
| Device Reachability | cisco.sdwan.device.reachability | SD-WAN Device Reachability Code Value (1 - reachable, 0 - unreachable) |
| Device QoE | cisco.sdwan.device.qoe | SD-WAN Device QoE Score |

default
| Metric name | Metric key | Description |
| --- | --- | --- |
| | com.dynatrace.extension.network_device.cpu_usage | |
| | com.dynatrace.extension.network_device.memory_usage | |
| | com.dynatrace.extension.network_device.sysuptime | |
| | com.dynatrace.extension.network_device.memory_used | |
| | com.dynatrace.extension.network_device.memory_free | |
| | com.dynatrace.extension.network_device.if.status | |
| | com.dynatrace.extension.network_device.if.bytes_in.count | |
| | com.dynatrace.extension.network_device.if.bytes_out.count | |
| | com.dynatrace.extension.network_device.if.in.errors.count | |
| | com.dynatrace.extension.network_device.if.out.errors.count | |
| | com.dynatrace.extension.network_device.if.in.discards.count | |
| | com.dynatrace.extension.network_device.if.out.discards.count | |

interfaceStatistics
| Metric name | Metric key | Description |
| --- | --- | --- |
| Admin Status | cisco.sdwan.interface.admin_status | SD-WAN Interface Admin Status |
| Oper Status | cisco.sdwan.interface.oper_status | SD-WAN Interface Oper Status |
| RX Packets | cisco.sdwan.interface.rx_pkts | SD-WAN Interface Packets Received |
| RX Octets | cisco.sdwan.interface.rx_octets | SD-WAN Interface Octets Received |
| RX Errors | cisco.sdwan.interface.rx_errors | SD-WAN Interface RX Errors |
| RX Drops | cisco.sdwan.interface.rx_drops | SD-WAN Interface RX Drops |
| RX Packets Per Second | cisco.sdwan.interface.rx_pps | SD-WAN Interface Received Packets Per Second |
| RX KiloBit Per Second | cisco.sdwan.interface.rx_kbps | SD-WAN Interface Received KiloBit Per Second |
| TX Packets | cisco.sdwan.interface.tx_pkts | SD-WAN Interface Packets Transmitted |
| TX Octets | cisco.sdwan.interface.tx_octets | SD-WAN Interface Octets Transmitted |
| TX Errors | cisco.sdwan.interface.tx_errors | SD-WAN Interface TX Errors |
| TX Drops | cisco.sdwan.interface.tx_drops | SD-WAN Interface TX Drops |
| TX Packets Per Second | cisco.sdwan.interface.tx_pps | SD-WAN Interface Transmitted Packets Per Second |
| TX KiloBit Per Second | cisco.sdwan.interface.tx_kbps | SD-WAN Interface Transmitted KiloBit Per Second |
| Uplink Capacity Utilization | cisco.sdwan.interface.up_cap_perc | SD-WAN Interface Uplink Capacity Utilization |
| Downlink Capacity Utilization | cisco.sdwan.interface.down_cap_perc | SD-WAN Interface Downlink Capacity Usage |

approuteStatistics
| Metric name | Metric key | Description |
| --- | --- | --- |
| TLOC QoE | cisco.sdwan.tloc.vqoe_score | SD-WAN TLOC QoE Score aggregated as average across all mapped tunnels |
| TLOC Loss | cisco.sdwan.tloc.loss_percentage | SD-WAN TLOC Loss Percentage aggregated as average across all mapped tunnels |
| TLOC Latency | cisco.sdwan.tloc.latency | SD-WAN TLOC Latency aggregated as average across all mapped tunnels |
| TLOC Jitter | cisco.sdwan.tloc.jitter | SD-WAN TLOC Jitter aggregated as average across all mapped tunnels |
| TLOC TX Octets | cisco.sdwan.tloc.tx_octets | SD-WAN TLOC TX Octets aggregated as sum across all mapped tunnels |
| TLOC RX Octets | cisco.sdwan.tloc.rx_octets | SD-WAN TLOC RX Octets aggregated as sum across all mapped tunnels |
| TLOC Tunnels | cisco.sdwan.tloc.tunnel_count.gauge | SD-WAN TLOC Tunnels determined by the cardinality of the Remote System IPs |

Related tags

Network, Python, Network management, Cisco, Infrastructure Observability