Amazon Kinesis (Data Analytics, Data Firehose, Data Streams, Video Streams) monitoring

Dynatrace ingests metrics for multiple preselected namespaces, including Amazon Kinesis. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.

Prerequisites

To enable monitoring for this service, you need:

  • ActiveGate version 1.181+, as follows:

    • For Dynatrace SaaS deployments, you need an Environment ActiveGate or a Multi-environment ActiveGate.

    • For Dynatrace Managed deployments, you can use any kind of ActiveGate.

      For role-based access (whether in a SaaS or Managed deployment), you need an Environment ActiveGate installed on an Amazon EC2 host.

  • Dynatrace version 1.182+

  • An updated AWS monitoring policy that includes the additional AWS services.
    To update the AWS IAM policy, use the JSON below, which contains the monitoring policy (permissions) for all supported services.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "acm-pca:ListCertificateAuthorities",
        "apigateway:GET",
        "apprunner:ListServices",
        "appstream:DescribeFleets",
        "appsync:ListGraphqlApis",
        "athena:ListWorkGroups",
        "autoscaling:DescribeAutoScalingGroups",
        "cloudformation:ListStackResources",
        "cloudfront:ListDistributions",
        "cloudhsm:DescribeClusters",
        "cloudsearch:DescribeDomains",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "codebuild:ListProjects",
        "datasync:ListTasks",
        "dax:DescribeClusters",
        "directconnect:DescribeConnections",
        "dms:DescribeReplicationInstances",
        "dynamodb:ListTables",
        "dynamodb:ListTagsOfResource",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
        "ec2:DescribeNatGateways",
        "ec2:DescribeSpotFleetRequests",
        "ec2:DescribeTransitGateways",
        "ec2:DescribeVolumes",
        "ec2:DescribeVpnConnections",
        "ecs:ListClusters",
        "eks:ListClusters",
        "elasticache:DescribeCacheClusters",
        "elasticbeanstalk:DescribeEnvironmentResources",
        "elasticbeanstalk:DescribeEnvironments",
        "elasticfilesystem:DescribeFileSystems",
        "elasticloadbalancing:DescribeInstanceHealth",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeRules",
        "elasticloadbalancing:DescribeTags",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticmapreduce:ListClusters",
        "elastictranscoder:ListPipelines",
        "es:ListDomainNames",
        "events:ListEventBuses",
        "firehose:ListDeliveryStreams",
        "fsx:DescribeFileSystems",
        "gamelift:ListFleets",
        "glue:GetJobs",
        "inspector:ListAssessmentTemplates",
        "kafka:ListClusters",
        "kinesis:ListStreams",
        "kinesisanalytics:ListApplications",
        "kinesisvideo:ListStreams",
        "lambda:ListFunctions",
        "lambda:ListTags",
        "lex:GetBots",
        "logs:DescribeLogGroups",
        "mediaconnect:ListFlows",
        "mediaconvert:DescribeEndpoints",
        "mediapackage-vod:ListPackagingConfigurations",
        "mediapackage:ListChannels",
        "mediatailor:ListPlaybackConfigurations",
        "opsworks:DescribeStacks",
        "qldb:ListLedgers",
        "rds:DescribeDBClusters",
        "rds:DescribeDBInstances",
        "rds:DescribeEvents",
        "rds:ListTagsForResource",
        "redshift:DescribeClusters",
        "robomaker:ListSimulationJobs",
        "route53:ListHostedZones",
        "route53resolver:ListResolverEndpoints",
        "s3:ListAllMyBuckets",
        "sagemaker:ListEndpoints",
        "sns:ListTopics",
        "sqs:ListQueues",
        "storagegateway:ListGateways",
        "sts:GetCallerIdentity",
        "swf:ListDomains",
        "tag:GetResources",
        "tag:GetTagKeys",
        "transfer:ListServers",
        "workmail:ListOrganizations",
        "workspaces:DescribeWorkspaces"
      ],
      "Resource": "*"
    }
  ]
}

If you don't want to grant permissions for all services and prefer to select permissions for certain services only, consult the table below. It lists the permissions required for all AWS cloud services and, for each supported service, the optional permissions specific to that service.

Permissions required for AWS monitoring integration:
  • "cloudwatch:GetMetricData"
  • "cloudwatch:GetMetricStatistics"
  • "cloudwatch:ListMetrics"
  • "sts:GetCallerIdentity"
  • "tag:GetResources"
  • "tag:GetTagKeys"
  • "ec2:DescribeAvailabilityZones"
Name | Permissions
All monitored Amazon services (required) | cloudwatch:GetMetricData, cloudwatch:GetMetricStatistics, cloudwatch:ListMetrics, sts:GetCallerIdentity, tag:GetResources, tag:GetTagKeys, ec2:DescribeAvailabilityZones
AWS Certificate Manager Private Certificate Authority | acm-pca:ListCertificateAuthorities
Amazon MQ |
Amazon API Gateway | apigateway:GET
AWS App Runner | apprunner:ListServices
Amazon AppStream | appstream:DescribeFleets
AWS AppSync | appsync:ListGraphqlApis
Amazon Athena | athena:ListWorkGroups
Amazon Aurora | rds:DescribeDBClusters
Amazon EC2 Auto Scaling | autoscaling:DescribeAutoScalingGroups
Amazon EC2 Auto Scaling (built-in) | autoscaling:DescribeAutoScalingGroups
AWS Billing |
Amazon Keyspaces |
AWS Chatbot |
Amazon CloudFront | cloudfront:ListDistributions
AWS CloudHSM | cloudhsm:DescribeClusters
Amazon CloudSearch | cloudsearch:DescribeDomains
AWS CodeBuild | codebuild:ListProjects
Amazon Cognito |
Amazon Connect |
Amazon Elastic Kubernetes Service (EKS) | eks:ListClusters
AWS DataSync | datasync:ListTasks
Amazon DynamoDB Accelerator (DAX) | dax:DescribeClusters
AWS Database Migration Service (AWS DMS) | dms:DescribeReplicationInstances
Amazon DocumentDB | rds:DescribeDBClusters
AWS Direct Connect | directconnect:DescribeConnections
Amazon DynamoDB | dynamodb:ListTables
Amazon DynamoDB (built-in) | dynamodb:ListTables, dynamodb:ListTagsOfResource
Amazon EBS | ec2:DescribeVolumes
Amazon EBS (built-in) | ec2:DescribeVolumes
Amazon EC2 API |
Amazon EC2 (built-in) | ec2:DescribeInstances
Amazon EC2 Spot Fleet | ec2:DescribeSpotFleetRequests
Amazon Elastic Container Service (ECS) | ecs:ListClusters
Amazon ECS Container Insights | ecs:ListClusters
Amazon ElastiCache (EC) | elasticache:DescribeCacheClusters
AWS Elastic Beanstalk | elasticbeanstalk:DescribeEnvironments
Amazon Elastic File System (EFS) | elasticfilesystem:DescribeFileSystems
Amazon Elastic Inference |
Amazon Elastic Map Reduce (EMR) | elasticmapreduce:ListClusters
Amazon Elasticsearch Service (ES) | es:ListDomainNames
Amazon Elastic Transcoder | elastictranscoder:ListPipelines
Amazon Elastic Load Balancer (ELB) (built-in) | elasticloadbalancing:DescribeInstanceHealth, elasticloadbalancing:DescribeListeners, elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeRules, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeTargetHealth
Amazon EventBridge | events:ListEventBuses
Amazon FSx | fsx:DescribeFileSystems
Amazon GameLift | gamelift:ListFleets
AWS Glue | glue:GetJobs
Amazon Inspector | inspector:ListAssessmentTemplates
AWS Internet of Things (IoT) |
AWS IoT Analytics |
Amazon Managed Streaming for Kafka | kafka:ListClusters
Amazon Kinesis Data Analytics | kinesisanalytics:ListApplications
Amazon Data Firehose | firehose:ListDeliveryStreams
Amazon Kinesis Data Streams | kinesis:ListStreams
Amazon Kinesis Video Streams | kinesisvideo:ListStreams
AWS Lambda | lambda:ListFunctions
AWS Lambda (built-in) | lambda:ListFunctions, lambda:ListTags
Amazon Lex | lex:GetBots
Amazon Application and Network Load Balancer (built-in) | elasticloadbalancing:DescribeInstanceHealth, elasticloadbalancing:DescribeListeners, elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeRules, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeTargetHealth
Amazon CloudWatch Logs | logs:DescribeLogGroups
AWS Elemental MediaConnect | mediaconnect:ListFlows
AWS Elemental MediaConvert | mediaconvert:DescribeEndpoints
AWS Elemental MediaPackage Live | mediapackage:ListChannels
AWS Elemental MediaPackage Video on Demand | mediapackage-vod:ListPackagingConfigurations
AWS Elemental MediaTailor | mediatailor:ListPlaybackConfigurations
Amazon VPC NAT Gateways | ec2:DescribeNatGateways
Amazon Neptune | rds:DescribeDBClusters
AWS OpsWorks | opsworks:DescribeStacks
Amazon Polly |
Amazon QLDB | qldb:ListLedgers
Amazon RDS | rds:DescribeDBInstances
Amazon RDS (built-in) | rds:DescribeDBInstances, rds:DescribeEvents, rds:ListTagsForResource
Amazon Redshift | redshift:DescribeClusters
Amazon Rekognition |
AWS RoboMaker | robomaker:ListSimulationJobs
Amazon Route 53 | route53:ListHostedZones
Amazon Route 53 Resolver | route53resolver:ListResolverEndpoints
Amazon S3 | s3:ListAllMyBuckets
Amazon S3 (built-in) | s3:ListAllMyBuckets
Amazon SageMaker Batch Transform Jobs |
Amazon SageMaker Endpoint Instances | sagemaker:ListEndpoints
Amazon SageMaker Endpoints | sagemaker:ListEndpoints
Amazon SageMaker Ground Truth |
Amazon SageMaker Processing Jobs |
Amazon SageMaker Training Jobs |
AWS Service Catalog |
Amazon Simple Email Service (SES) |
Amazon Simple Notification Service (SNS) | sns:ListTopics
Amazon Simple Queue Service (SQS) | sqs:ListQueues
AWS Systems Manager - Run Command |
AWS Step Functions |
AWS Storage Gateway | storagegateway:ListGateways
Amazon SWF | swf:ListDomains
Amazon Textract |
AWS IoT Things Graph |
AWS Transfer Family | transfer:ListServers
AWS Transit Gateway | ec2:DescribeTransitGateways
Amazon Translate |
AWS Trusted Advisor |
AWS API Usage |
AWS Site-to-Site VPN | ec2:DescribeVpnConnections
AWS WAF Classic |
AWS WAF |
Amazon WorkMail | workmail:ListOrganizations
Amazon WorkSpaces | workspaces:DescribeWorkspaces

Example of a JSON policy for a single service (Amazon API Gateway):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "apigateway:GET",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "sts:GetCallerIdentity",
        "tag:GetResources",
        "tag:GetTagKeys",
        "ec2:DescribeAvailabilityZones"
      ],
      "Resource": "*"
    }
  ]
}

In this example, you need to select the following permissions from the complete list:

  • "apigateway:GET" for Amazon API Gateway
  • "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for all AWS cloud services.
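
The same selection can be sketched for the Kinesis services covered on this page. The script below is a minimal illustration, not an official Dynatrace or AWS tool: the names `REQUIRED_PERMISSIONS`, `SERVICE_PERMISSIONS`, and `build_policy` are hypothetical, and the permission strings are copied from the table above. It combines the service-specific permissions with the permissions required for all AWS cloud services and prints the resulting policy.

```python
import json

# Permissions required for all monitored AWS services (from the table above).
REQUIRED_PERMISSIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]

# Service-specific permissions (a subset of the table above; extend as needed).
SERVICE_PERMISSIONS = {
    "Amazon API Gateway": ["apigateway:GET"],
    "Amazon Kinesis Data Analytics": ["kinesisanalytics:ListApplications"],
    "Amazon Data Firehose": ["firehose:ListDeliveryStreams"],
    "Amazon Kinesis Data Streams": ["kinesis:ListStreams"],
    "Amazon Kinesis Video Streams": ["kinesisvideo:ListStreams"],
}


def build_policy(services):
    """Return a policy dict for the given service names (hypothetical helper)."""
    actions = []
    for name in services:
        actions.extend(SERVICE_PERMISSIONS[name])
    actions.extend(REQUIRED_PERMISSIONS)
    # Drop duplicates while keeping the original order.
    unique_actions = list(dict.fromkeys(actions))
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": unique_actions,
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    kinesis_services = [
        "Amazon Kinesis Data Analytics",
        "Amazon Data Firehose",
        "Amazon Kinesis Data Streams",
        "Amazon Kinesis Video Streams",
    ]
    print(json.dumps(build_policy(kinesis_services), indent=2))
```

Calling `build_policy(["Amazon API Gateway"])` yields the same action list as the single-service example above.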

Enable monitoring

To learn how to enable service monitoring, see Enable service monitoring.

View service metrics

You can view the service metrics in your Dynatrace environment either on the custom device overview page or on your Dashboards page.

View metrics on the custom device overview page

To access the custom device overview page

  1. Go to Technologies & Processes Classic.
  2. Filter by service name and select the relevant custom device group.
  3. Once you select the custom device group, you're on the custom device group overview page.
  4. The custom device group overview page lists all instances (custom devices) belonging to the group. Select an instance to view the custom device overview page.

View metrics on your dashboard

You can also view metrics in the Dynatrace web UI on dashboards. There is no preset dashboard available for this service, but you can create your own dashboard.

To check the availability of preset dashboards for each AWS service, see the list below.

AWS service | Preset dashboard
Amazon EC2 Auto Scaling (built-in) | Not applicable
AWS Lambda (built-in) | Not applicable
Amazon Application and Network Load Balancer (built-in) | Not applicable
Amazon DynamoDB (built-in) | Not applicable
Amazon EBS (built-in) | Not applicable
Amazon EC2 (built-in) | Not applicable
Amazon Elastic Load Balancer (ELB) (built-in) | Not applicable
Amazon RDS (built-in) | Not applicable
Amazon S3 (built-in) | Not applicable
AWS Certificate Manager Private Certificate Authority | Not applicable
All monitored Amazon services | Not applicable
Amazon API Gateway | Not applicable
AWS App Runner | Not applicable
Amazon AppStream | Applicable
AWS AppSync | Applicable
Amazon Athena | Applicable
Amazon Aurora | Not applicable
Amazon EC2 Auto Scaling | Applicable
AWS Billing | Applicable
Amazon Keyspaces | Applicable
AWS Chatbot | Applicable
Amazon CloudFront | Not applicable
AWS CloudHSM | Applicable
Amazon CloudSearch | Applicable
AWS CodeBuild | Applicable
Amazon Cognito | Not applicable
Amazon Connect | Applicable
AWS DataSync | Applicable
Amazon DynamoDB Accelerator (DAX) | Applicable
AWS Database Migration Service (AWS DMS) | Applicable
Amazon DocumentDB | Applicable
AWS Direct Connect | Applicable
Amazon DynamoDB | Not applicable
Amazon EBS | Not applicable
Amazon EC2 Spot Fleet | Not applicable
Amazon EC2 API | Applicable
Amazon Elastic Container Service (ECS) | Not applicable
Amazon ECS Container Insights | Applicable
Amazon Elastic File System (EFS) | Not applicable
Amazon Elastic Kubernetes Service (EKS) | Applicable
Amazon ElastiCache (EC) | Not applicable
AWS Elastic Beanstalk | Applicable
Amazon Elastic Inference | Applicable
Amazon Elastic Transcoder | Applicable
Amazon Elastic Map Reduce (EMR) | Not applicable
Amazon Elasticsearch Service (ES) | Not applicable
Amazon EventBridge | Applicable
Amazon FSx | Applicable
Amazon GameLift | Applicable
AWS Glue | Not applicable
Amazon Inspector | Applicable
AWS Internet of Things (IoT) | Not applicable
AWS IoT Things Graph | Applicable
AWS IoT Analytics | Applicable
Amazon Managed Streaming for Kafka | Applicable
Amazon Kinesis Data Analytics | Not applicable
Amazon Data Firehose | Not applicable
Amazon Kinesis Data Streams | Not applicable
Amazon Kinesis Video Streams | Not applicable
AWS Lambda | Not applicable
Amazon Lex | Applicable
Amazon CloudWatch Logs | Applicable
AWS Elemental MediaTailor | Applicable
AWS Elemental MediaConnect | Applicable
AWS Elemental MediaConvert | Applicable
AWS Elemental MediaPackage Live | Applicable
AWS Elemental MediaPackage Video on Demand | Applicable
Amazon MQ | Applicable
Amazon VPC NAT Gateways | Not applicable
Amazon Neptune | Applicable
AWS OpsWorks | Applicable
Amazon Polly | Applicable
Amazon QLDB | Applicable
Amazon RDS | Not applicable
Amazon Redshift | Not applicable
Amazon Rekognition | Applicable
AWS RoboMaker | Applicable
Amazon Route 53 | Applicable
Amazon Route 53 Resolver | Applicable
Amazon S3 | Not applicable
Amazon SageMaker Batch Transform Jobs | Not applicable
Amazon SageMaker Endpoints | Not applicable
Amazon SageMaker Endpoint Instances | Not applicable
Amazon SageMaker Ground Truth | Not applicable
Amazon SageMaker Processing Jobs | Not applicable
Amazon SageMaker Training Jobs | Not applicable
AWS Service Catalog | Applicable
Amazon Simple Email Service (SES) | Not applicable
Amazon Simple Notification Service (SNS) | Not applicable
Amazon Simple Queue Service (SQS) | Not applicable
AWS Systems Manager - Run Command | Applicable
AWS Step Functions | Applicable
AWS Storage Gateway | Applicable
Amazon SWF | Applicable
Amazon Textract | Applicable
AWS Transfer Family | Applicable
AWS Transit Gateway | Applicable
Amazon Translate | Applicable
AWS Trusted Advisor | Applicable
AWS API Usage | Applicable
AWS Site-to-Site VPN | Applicable
AWS WAF Classic | Applicable
AWS WAF | Applicable
Amazon WorkMail | Applicable
Amazon WorkSpaces | Applicable

Available metrics

Amazon Kinesis Data Analytics

Application is the main dimension.

Name | Description | Unit | Statistics | Dimensions | Recommended
Bytes | The number of bytes read (per input stream) or written (per output stream). | Bytes | Sum | Application, Flow, Id | Applicable
InputProcessing.DroppedRecords | The number of records returned by a Lambda function that were marked with Dropped status. | Count | Sum | Application, Flow, Id |
InputProcessing.Duration | The time taken for each AWS Lambda function invocation performed by Kinesis Data Analytics. | Milliseconds | Multi | Application, Flow, Id |
InputProcessing.OkBytes | The sum of bytes of the records returned by a Lambda function that were marked with OK status. | Bytes | Sum | Application, Flow, Id |
InputProcessing.OkRecords | The number of records returned by a Lambda function that were marked with OK status. | Count | Sum | Application, Flow, Id |
InputProcessing.ProcessingFailedRecords | The number of records returned by a Lambda function that were marked with ProcessingFailed status. | Count | Sum | Application, Flow, Id |
InputProcessing.Success | The number of successful Lambda invocations by Kinesis Data Analytics. | Count | Sum | Application, Flow, Id |
KPUs | The number of Kinesis Processing Units that are used to run your stream processing application. | Count | Count | Application |
KPUs |  | Count | Multi | Application |
KPUs |  | Count | Sum | Application | Applicable
LambdaDelivery.DeliveryFailedRecords | The number of records returned by a Lambda function that were marked with DeliveryFailed status. | Count | Sum | Application, Flow, Id |
LambdaDelivery.Duration | The time taken for each Lambda function invocation performed by Kinesis Data Analytics. | Milliseconds | Multi | Application, Flow, Id |
LambdaDelivery.OkRecords | The number of records returned by a Lambda function that were marked with OK status. | Count | Sum | Application, Flow, Id |
MillisBehindLatest | Indicates how far behind from the current time an application is reading from the streaming source. | Milliseconds | Multi | Application; Application, Flow, Id |
Records | The number of records read (per input stream) or written (per output stream). | Count | Sum | Application, Flow, Id | Applicable
Success | The number of successful deliveries. Every successful delivery attempt to the destination configured for your application is marked with 1. Every failed delivery attempt is marked with 0. | Count | Average | Application, Flow, Id | Applicable
backPressuredTimeMsPerSecond | The time (in milliseconds) this task or operator is back pressured per second. | Milliseconds | Count | Application |
backPressuredTimeMsPerSecond |  | Milliseconds | Multi | Application |
backPressuredTimeMsPerSecond |  | Milliseconds | Sum | Application |
busyTimeMsPerSecond | The time (in milliseconds) this task or operator is busy (neither idle nor back pressured) per second. Can be NaN, if the value could not be calculated. | Milliseconds | Count | Application |
busyTimeMsPerSecond |  | Milliseconds | Multi | Application |
busyTimeMsPerSecond |  | Milliseconds | Sum | Application |
bytes_consumed_rate | The average number of bytes consumed per second for a topic. | Bytes | Count | Application |
bytes_consumed_rate |  | Bytes | Multi | Application |
bytes_consumed_rate |  | Bytes | Sum | Application |
commitsFailed | The total number of offset commit failures to Kafka, if offset committing and checkpointing are enabled. | Count | Count | Application |
commitsFailed |  | Count | Multi | Application |
commitsFailed |  | Count | Sum | Application |
commitsSucceeded | The total number of successful offset commits to Kafka, if offset committing and checkpointing are enabled. | Count | Count | Application |
commitsSucceeded |  | Count | Multi | Application |
commitsSucceeded |  | Count | Sum | Application |
committedOffsets | The last successfully committed offsets to Kafka, for each partition. A particular partition's metric can be specified by topic name and partition id. | Count | Count | Application |
committedOffsets |  | Count | Multi | Application |
committedOffsets |  | Count | Sum | Application |
containerCPUUtilization | Overall percentage of CPU utilization across task manager containers in Flink application cluster. | Percent | Count | Application |
containerCPUUtilization |  | Percent | Multi | Application |
containerCPUUtilization |  | Percent | Sum | Application |
containerDiskUtilization | Overall percentage of disk utilization across task manager containers in Flink application cluster. | Percent | Count | Application |
containerDiskUtilization |  | Percent | Multi | Application |
containerDiskUtilization |  | Percent | Sum | Application |
containerMemoryUtilization | Overall percentage of memory utilization across task manager containers in Flink application cluster. | Percent | Count | Application |
containerMemoryUtilization |  | Percent | Multi | Application |
containerMemoryUtilization |  | Percent | Sum | Application |
cpuUtilization | Overall percentage of CPU utilization across task managers. | Percent | Count | Application |
cpuUtilization |  | Percent | Multi | Application |
cpuUtilization |  | Percent | Sum | Application |
currentInputWatermark | The last watermark this application/operator/task/thread has received. | Milliseconds | Count | Application |
currentInputWatermark |  | Milliseconds | Multi | Application |
currentInputWatermark |  | Milliseconds | Sum | Application |
currentOffsets | The consumer's current read offset, for each partition. A particular partition's metric can be specified by topic name and partition id. | Count | Count | Application |
currentOffsets |  | Count | Multi | Application |
currentOffsets |  | Count | Sum | Application |
currentOutputWatermark | The last watermark this application/operator/task/thread has emitted. | Milliseconds | Count | Application |
currentOutputWatermark |  | Milliseconds | Multi | Application |
currentOutputWatermark |  | Milliseconds | Sum | Application |
downtime | For jobs currently in a failing/recovering situation, the time elapsed during this outage. | Milliseconds | Count | Application |
downtime |  | Milliseconds | Multi | Application |
downtime |  | Milliseconds | Sum | Application |
fullRestarts | The total number of times this job has fully restarted since it was submitted. This metric does not measure fine-grained restarts. | Count | Count | Application |
fullRestarts |  | Count | Multi | Application |
fullRestarts |  | Count | Sum | Application |
heapMemoryUtilization | Overall heap memory utilization across task managers. | Percent | Count | Application |
heapMemoryUtilization |  | Percent | Multi | Application |
heapMemoryUtilization |  | Percent | Sum | Application |
idleTimeMsPerSecond | The time (in milliseconds) this task or operator is idle (has no data to process) per second. Idle time excludes back pressured time, so if the task is back pressured it is not idle. | Milliseconds | Count | Application |
idleTimeMsPerSecond |  | Milliseconds | Multi | Application |
idleTimeMsPerSecond |  | Milliseconds | Sum | Application |
lastCheckpointDuration | The time it took to complete the last checkpoint. | Milliseconds | Count | Application |
lastCheckpointDuration |  | Milliseconds | Multi | Application |
lastCheckpointDuration |  | Milliseconds | Sum | Application |
lastCheckpointSize | The total size of the last checkpoint. | Bytes | Count | Application |
lastCheckpointSize |  | Bytes | Multi | Application |
lastCheckpointSize |  | Bytes | Sum | Application |
numRecordsInPerSecond | The total number of records this application, operator or task has received per second. | Count/Second | Count | Application |
numRecordsInPerSecond |  | Count/Second | Multi | Application |
numRecordsInPerSecond |  | Count/Second | Sum | Application |
numRecordsIn | The total number of records this application, operator, or task has received. | Count | Count | Application |
numRecordsIn |  | Count | Multi | Application |
numRecordsIn |  | Count | Sum | Application |
numRecordsOutPerSecond | The total number of records this application, operator or task has emitted per second. | Count/Second | Count | Application |
numRecordsOutPerSecond |  | Count/Second | Multi | Application |
numRecordsOutPerSecond |  | Count/Second | Sum | Application |
numRecordsOut | The total number of records this application, operator or task has emitted. | Count | Count | Application |
numRecordsOut |  | Count | Multi | Application |
numRecordsOut |  | Count | Sum | Application |
numRestarts |  | Count | Count | Application |
numRestarts |  | Count | Multi | Application |
numRestarts |  | Count | Sum | Application |
numberOfFailedCheckpoints | The number of times checkpointing has failed. | Count | Count | Application |
numberOfFailedCheckpoints |  | Count | Multi | Application |
numberOfFailedCheckpoints |  | Count | Sum | Application |
oldGenerationGCCount | The total number of old garbage collection operations that have occurred across all task managers. | Count | Count | Application |
oldGenerationGCCount |  | Count | Multi | Application |
oldGenerationGCCount |  | Count | Sum | Application |
oldGenerationGCTime | The total time spent performing old garbage collection operations. | Milliseconds | Count | Application |
oldGenerationGCTime |  | Milliseconds | Multi | Application |
oldGenerationGCTime |  | Milliseconds | Sum | Application |
processElementavg |  | Count | Count | Application, Service |
processElementavg |  | Count | Multi | Application, Service |
processElementavg |  | Count | Sum | Application, Service |
readDocsavg |  | Count | Count | Application, Service |
readDocsavg |  | Count | Multi | Application, Service |
readDocsavg |  | Count | Sum | Application, Service |
records_lag_max | The maximum lag in terms of number of records for any partition in this window. | Count | Count | Application |
records_lag_max |  | Count | Multi | Application |
records_lag_max |  | Count | Sum | Application |
threadsCount |  | Count | Count | Application |
threadsCount |  | Count | Multi | Application |
threadsCount |  | Count | Sum | Application |
updatesavg |  | Count | Count | Application, Service |
updatesavg |  | Count | Multi | Application, Service |
updatesavg |  | Count | Sum | Application, Service |
uptime | The time that the job has been running without interruption. | Milliseconds | Count | Application |
uptime |  | Milliseconds | Multi | Application |
uptime |  | Milliseconds | Sum | Application |
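
The Success metric in this table is reported with the Average statistic over values of 1 (successful delivery attempt) and 0 (failed delivery attempt), so its value over a period can be read as the share of successful attempts. The sketch below only illustrates that arithmetic; the sample values are made up and `success_ratio` is a hypothetical helper, not part of any Dynatrace or AWS API.

```python
def success_ratio(attempts):
    """Average of 1/0 delivery markers, i.e. the share of successful attempts."""
    if not attempts:
        return None  # no delivery attempts were recorded in the period
    return sum(attempts) / len(attempts)


# Made-up sample: three successful delivery attempts and one failed attempt.
samples = [1, 1, 0, 1]
ratio = success_ratio(samples)
print(f"Success ratio: {ratio:.0%}")  # prints "Success ratio: 75%"
```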

Amazon Data Firehose

DeliveryStreamName is the main dimension.

Name | Description | Unit | Statistics | Dimensions | Recommended
BackupToS3.Bytes | The number of bytes delivered to Amazon S3 for backup over the specified time period. Amazon Data Firehose emits this metric when backup to Amazon S3 is enabled. | Bytes | Sum | Region |
BackupToS3.Bytes |  | Bytes | Sum | DeliveryStreamName |
BackupToS3.DataFreshness | Age (from getting into Amazon Data Firehose to now) of the oldest record in Amazon Data Firehose. Any record older than this age has been delivered to the Amazon S3 bucket for backup. Amazon Data Firehose emits this metric when backup to Amazon S3 is enabled. | Seconds | Maximum | Region |
BackupToS3.DataFreshness |  | Seconds | Maximum | DeliveryStreamName |
BackupToS3.Records | The number of records delivered to Amazon S3 for backup over the specified time period. Amazon Data Firehose emits this metric when backup to Amazon S3 is enabled. | Count | Sum | Region |
BackupToS3.Records |  | Count | Sum | DeliveryStreamName |
BackupToS3.Success | Sum of successful Amazon S3 put commands for backup over sum of all Amazon S3 backup put commands. Amazon Data Firehose emits this metric when backup to Amazon S3 is enabled. | Count | Count | Region |
BackupToS3.Success |  | Count | Count | DeliveryStreamName |
DataReadFromKinesisStream.Bytes | When the data source is a Kinesis data stream, this metric indicates the number of bytes read from that data stream. This number includes rereads due to failovers. | Bytes | Sum | Region |
DataReadFromKinesisStream.Bytes |  | Bytes | Sum | DeliveryStreamName |
DataReadFromKinesisStream.Records | When the data source is a Kinesis data stream, this metric indicates the number of records read from that data stream. This number includes rereads due to failovers. | Count | Sum | Region |
DataReadFromKinesisStream.Records |  | Count | Sum | DeliveryStreamName |
DeliveryToElasticsearch.Bytes | The number of bytes indexed to Amazon ES over the specified time period. | Bytes | Sum | Region |
DeliveryToElasticsearch.Bytes |  | Bytes | Sum | DeliveryStreamName |
DeliveryToElasticsearch.Records | The number of records indexed to Amazon ES over the specified time period. | Count | Sum | Region |
DeliveryToElasticsearch.Records |  | Count | Sum | DeliveryStreamName |
DeliveryToElasticsearch.Success | The sum of the successfully indexed records over the sum of records that were attempted. | Count | Count | Region |
DeliveryToElasticsearch.Success |  | Count | Count | DeliveryStreamName |
DeliveryToRedshift.Bytes | The number of bytes copied to Amazon Redshift over the specified time period. | Bytes | Sum | Region |
DeliveryToRedshift.Bytes |  | Bytes | Sum | DeliveryStreamName |
DeliveryToRedshift.Records | The number of records copied to Amazon Redshift over the specified time period. | Count | Sum | Region |
DeliveryToRedshift.Records |  | Count | Sum | DeliveryStreamName |
DeliveryToRedshift.Success | The sum of successful Amazon Redshift COPY commands over the sum of all Amazon Redshift COPY commands. | Count | Count | Region |
DeliveryToRedshift.Success |  | Count | Count | DeliveryStreamName |
DeliveryToS3.Bytes | The number of bytes delivered to Amazon S3 over the specified time period. | Bytes | Sum | Region |
DeliveryToS3.Bytes |  | Bytes | Sum | DeliveryStreamName |
DeliveryToS3.DataFreshness | The age (from getting into Amazon Data Firehose to now) of the oldest record in Amazon Data Firehose. Any record older than this age has been delivered to the S3 bucket. | Seconds | Maximum | Region |
DeliveryToS3.DataFreshness |  | Seconds | Maximum | DeliveryStreamName |
DeliveryToS3.Records | The number of records delivered to Amazon S3 over the specified time period. | Count | Sum | Region |
DeliveryToS3.Records |  | Count | Sum | DeliveryStreamName |
DeliveryToS3.Success | The sum of successful Amazon S3 put commands over the sum of all Amazon S3 put commands. | Count | Count | Region |
DeliveryToS3.Success |  | Count | Count | DeliveryStreamName |
DeliveryToSplunk.Bytes | The number of bytes delivered to Splunk over the specified time period. | Bytes | Sum | Region |
DeliveryToSplunk.Bytes |  | Bytes | Sum | DeliveryStreamName |
DeliveryToSplunk.DataAckLatency | The approximate duration it takes to receive an acknowledgment from Splunk after Amazon Data Firehose sends it data. | Seconds | Average | Region |
DeliveryToSplunk.DataAckLatency |  | Seconds | Average | DeliveryStreamName |
DeliveryToSplunk.DataFreshness | Age (from getting into Amazon Data Firehose to now) of the oldest record in Amazon Data Firehose. Any record older than this age has been delivered to Splunk. | Seconds | Maximum | Region |
DeliveryToSplunk.DataFreshness |  | Seconds | Maximum | DeliveryStreamName |
DeliveryToSplunk.Records | The number of records delivered to Splunk over the specified time period. | Count | Sum | Region |
DeliveryToSplunk.Records |  | Count | Sum | DeliveryStreamName |
DeliveryToSplunk.Success | The sum of the successfully indexed records over the sum of records that were attempted. | Count | Count | Region |
DeliveryToSplunk.Success |  | Count | Count | DeliveryStreamName |
DescribeDeliveryStream.Latency | The time taken per DescribeDeliveryStream operation, measured over the specified time period. | Milliseconds | Multi | Region |
DescribeDeliveryStream.Latency |  | Milliseconds | Multi | DeliveryStreamName |
DescribeDeliveryStream.Requests | The total number of DescribeDeliveryStream requests. | Count | Sum | Region |
DescribeDeliveryStream.Requests |  | Count | Sum | DeliveryStreamName |
ExecuteProcessing.Duration | The time it takes for each Lambda function invocation performed by Amazon Data Firehose. | Milliseconds | Multi | Region |
ExecuteProcessing.Duration |  | Milliseconds | Multi | DeliveryStreamName |
ExecuteProcessing.Success | The sum of the successful Lambda function invocations over the sum of the total Lambda function invocations. | Count | Count | Region |
ExecuteProcessing.Success |  | Count | Count | DeliveryStreamName |
FailedConversion.Bytes | The size of the records that could not be converted. | Bytes | Sum | Region |
FailedConversion.Bytes |  | Bytes | Sum | DeliveryStreamName |
FailedConversion.Records | The number of records that could not be converted. | Count | Sum | Region |
FailedConversion.Records |  | Count | Sum | DeliveryStreamName |
IncomingBytes | The number of bytes ingested successfully into the delivery stream over the specified time period after throttling. | Bytes | Sum | Region |
IncomingBytes |  | Bytes | Sum | DeliveryStreamName | Applicable
IncomingRecords | The number of records ingested successfully into the delivery stream over the specified time period after throttling. | Count | Sum | Region |
IncomingRecords |  | Count | Sum | DeliveryStreamName | Applicable
KinesisMillisBehindLatest | When the data source is a Kinesis data stream, this metric indicates the number of milliseconds that the last read record is behind the newest record in the Kinesis data stream. | Milliseconds | Average | Region |
KinesisMillisBehindLatest |  | Milliseconds | Average | DeliveryStreamName |
ListDeliveryStreams.Latency | The time taken per ListDeliveryStream operation, measured over the specified time period. | Milliseconds | Multi | Region |
ListDeliveryStreams.Latency |  | Milliseconds | Multi | DeliveryStreamName |
ListDeliveryStreams.Requests | The total number of ListFirehose requests. | Count | Sum | Region |
ListDeliveryStreams.Requests |  | Count | Sum | DeliveryStreamName |
PutRecordBatch.Bytes | The number of bytes put to the Amazon Data Firehose delivery stream using PutRecordBatch over the specified time period. | Bytes | Sum | Region |
PutRecordBatch.Bytes |  | Bytes | Sum | DeliveryStreamName |
PutRecordBatch.Latency | The time taken per PutRecordBatch operation, measured over the specified time period. | Milliseconds | Multi | Region |
PutRecordBatch.Latency |  | Milliseconds | Multi | DeliveryStreamName |
PutRecordBatch.Records | The total number of records from PutRecordBatch operations. | Count | Sum | Region |
PutRecordBatch.Records |  | Count | Sum | DeliveryStreamName |
PutRecordBatch.Requests | The total number of PutRecordBatch requests. | Count | Sum | Region |
PutRecordBatch.Requests |  | Count | Sum | DeliveryStreamName |
PutRecord.Bytes | The number of bytes put to the Amazon Data Firehose delivery stream using PutRecord over the specified time period. | Bytes | Sum | Region |
PutRecord.Bytes |  | Bytes | Sum | DeliveryStreamName |
PutRecord.Latency | The time taken per PutRecord operation, measured over the specified time period. | Milliseconds | Multi | Region |
PutRecord.Latency |  | Milliseconds | Multi | DeliveryStreamName |
PutRecord.Requests | The total number of PutRecord requests, which is equal to the total number of records from PutRecord operations. | Count | Sum | Region |
PutRecord.Requests |  | Count | Sum | DeliveryStreamName |
SucceedConversion.Bytes | The size of the successfully converted records. | Bytes | Sum | Region |
SucceedConversion.Bytes |  | Bytes | Sum | DeliveryStreamName |
SucceedConversion.Records | The number of successfully converted records. | Count | Sum | Region |
SucceedConversion.Records |  | Count | Sum | DeliveryStreamName |
SucceedProcessing.Bytes | The number of successfully processed bytes over the specified time period. | Bytes | Sum | Region |
SucceedProcessing.Bytes |  | Bytes | Sum | DeliveryStreamName |
SucceedProcessing.Records | The number of successfully processed records over the specified time period. | Count | Sum | Region |
SucceedProcessing.Records |  | Count | Sum | DeliveryStreamName |
ThrottledDescribeStream | The total number of times the DescribeStream operation is throttled when the data source is a Kinesis data stream. | Count | Average | Region |
ThrottledDescribeStream |  | Count | Average | DeliveryStreamName |
ThrottledGetRecords | The total number of times the GetRecords operation is throttled when the data source is a Kinesis data stream. | Count | Average | Region |
ThrottledGetRecords |  | Count | Average | DeliveryStreamName |
ThrottledGetShardIterator | The total number of times the GetShardIterator operation is throttled when the data source is a Kinesis data stream. | Count | Average | Region |
ThrottledGetShardIterator |  | Count | Average | DeliveryStreamName |
UpdateDeliveryStream.Latency | The time taken per UpdateDeliveryStream operation, measured over the specified time period. | Milliseconds | Multi | Region |
UpdateDeliveryStream.Latency |  | Milliseconds | Multi | DeliveryStreamName |
UpdateDeliveryStream.Requests | The total number of UpdateDeliveryStream requests. | Count | Sum | Region |
UpdateDeliveryStream.Requests |  | Count | Sum | DeliveryStreamName |

Amazon Kinesis Data Streams (KDS)

StreamName is the main dimension.

| Name | Description | Unit | Statistics | Dimensions | Recommended |
|---|---|---|---|---|---|
| GetRecords.Bytes | The number of bytes retrieved from the Kinesis stream, measured over the specified time period. Minimum, maximum, and average statistics represent the bytes in a single GetRecords operation for the stream in the specified time period. | Bytes | Sum | StreamName | |
| GetRecords.Bytes | | Bytes | Multi | StreamName | |
| GetRecords.Bytes | | Bytes | Count | StreamName | |
| GetRecords.IteratorAgeMilliseconds | The age of the last record in all GetRecords calls made against a Kinesis stream, measured over the specified time period. Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. The minimum and maximum statistics can be used to track the progress of Kinesis consumer applications. A value of 0 indicates that the records being read are completely caught up with the stream. | Milliseconds | Multi | StreamName | Applicable |
| GetRecords.IteratorAgeMilliseconds | | Milliseconds | Count | StreamName | |
| GetRecords.Latency | The time taken per GetRecords operation, measured over the specified time period | Milliseconds | Multi | StreamName | |
| GetRecords.Records | The number of records retrieved from the shard, measured over the specified time period. Minimum, maximum, and average statistics represent the records in a single GetRecords operation for the stream in the specified time period. | Count | Sum | StreamName | |
| GetRecords.Records | | Count | Multi | StreamName | |
| GetRecords.Records | | Count | Count | StreamName | |
| GetRecords.Success | The number of successful GetRecords operations per stream, measured over the specified time period | Count | Sum | StreamName | |
| GetRecords.Success | | Count | Average | StreamName | Applicable |
| GetRecords.Success | | Count | Count | StreamName | |
| IncomingBytes | The number of bytes successfully put to the Kinesis stream over the specified time period. This metric includes bytes from PutRecord and PutRecords operations. Minimum, maximum, and average statistics represent the bytes in a single put operation for the stream in the specified time period. | Bytes | Count | ShardId, StreamName | |
| IncomingBytes | | Bytes | Count | StreamName | |
| IncomingBytes | | Bytes | Multi | ShardId, StreamName | |
| IncomingBytes | | Bytes | Multi | StreamName | |
| IncomingBytes | | Bytes | Sum | ShardId, StreamName | |
| IncomingBytes | | Bytes | Sum | StreamName | |
| IncomingRecords | The number of records successfully put to the Kinesis stream over the specified time period. This metric includes record counts from PutRecord and PutRecords operations. Minimum, maximum, and average statistics represent the records in a single put operation for the stream in the specified time period. | Count | Count | ShardId, StreamName | |
| IncomingRecords | | Count | Count | StreamName | |
| IncomingRecords | | Count | Multi | ShardId, StreamName | |
| IncomingRecords | | Count | Multi | StreamName | |
| IncomingRecords | | Count | Sum | ShardId, StreamName | |
| IteratorAgeMilliseconds | The age of the last record in all GetRecords calls made against a shard, measured over the specified time period. Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. The minimum and maximum statistics can be used to track the progress of Kinesis consumer applications. A value of 0 indicates that the records being read are completely caught up with the stream. | Milliseconds | Multi | StreamName, ShardId | |
| IteratorAgeMilliseconds | | Milliseconds | Count | StreamName, ShardId | |
| OutgoingBytes | The number of bytes retrieved from the shard, measured over the specified time period. Minimum, maximum, and average statistics represent the bytes returned in a single GetRecords operation or published in a single SubscribeToShard event for the shard in the specified time period. | Bytes | Sum | StreamName, ShardId | |
| OutgoingBytes | | Bytes | Multi | StreamName, ShardId | |
| OutgoingBytes | | Bytes | Count | StreamName, ShardId | |
| OutgoingRecords | The number of records retrieved from the shard, measured over the specified time period. Minimum, maximum, and average statistics represent the records returned in a single GetRecords operation or published in a single SubscribeToShard event for the shard in the specified time period. | Count | Sum | StreamName, ShardId | |
| OutgoingRecords | | Count | Multi | StreamName, ShardId | |
| OutgoingRecords | | Count | Count | StreamName, ShardId | |
| PutRecord.Bytes | The number of bytes put to the Kinesis stream using the PutRecord operation over the specified time period | Bytes | Sum | StreamName | |
| PutRecord.Bytes | | Bytes | Multi | StreamName | |
| PutRecord.Bytes | | Bytes | Count | StreamName | |
| PutRecord.Latency | The time taken per PutRecord operation, measured over the specified time period | Milliseconds | Multi | StreamName | |
| PutRecord.Success | The number of successful PutRecord operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream. | Count | Sum | StreamName | |
| PutRecord.Success | | Count | Average | StreamName | Applicable |
| PutRecord.Success | | Count | Count | StreamName | |
| PutRecords.Bytes | The number of bytes put to the Kinesis stream using the PutRecords operation over the specified time period | Bytes | Sum | StreamName | |
| PutRecords.Bytes | | Bytes | Multi | StreamName | |
| PutRecords.Bytes | | Bytes | Count | StreamName | |
| PutRecords.Latency | The time taken per PutRecords operation, measured over the specified time period | Milliseconds | Multi | StreamName | |
| PutRecords.Records | The number of successful records in a PutRecords operation per Kinesis stream, measured over the specified time period | Count | Sum | StreamName | |
| PutRecords.Records | | Count | Multi | StreamName | |
| PutRecords.Records | | Count | Count | StreamName | |
| PutRecords.Success | The number of PutRecords operations where at least one record succeeded, per Kinesis stream, measured over the specified time period | Count | Sum | StreamName | |
| PutRecords.Success | | Count | Average | StreamName | |
| PutRecords.Success | | Count | Count | StreamName | |
| ReadProvisionedThroughputExceeded | The number of GetRecords calls throttled for the stream over the specified time period | Count | Sum | StreamName | |
| ReadProvisionedThroughputExceeded | | Count | Multi | StreamName | Applicable |
| ReadProvisionedThroughputExceeded | | Count | Count | StreamName | |
| SubscribeToShardEvent.Bytes | The number of bytes received from the shard, measured over the specified time period. Minimum, maximum, and average statistics represent the bytes published in a single event for the specified time period. | Bytes | Sum | StreamName, ConsumerName | |
| SubscribeToShardEvent.Bytes | | Bytes | Multi | StreamName, ConsumerName | |
| SubscribeToShardEvent.Bytes | | Bytes | Count | StreamName, ConsumerName | |
| SubscribeToShardEvent.MillisBehindLatest | The difference between the current time and when the last record of the SubscribeToShard event was written to the stream | Milliseconds | Multi | StreamName, ConsumerName | |
| SubscribeToShardEvent.MillisBehindLatest | | Milliseconds | Count | StreamName, ConsumerName | |
| SubscribeToShardEvent.Records | The number of records received from the shard, measured over the specified time period. Minimum, maximum, and average statistics represent the records in a single event for the specified time period. | Count | Sum | StreamName, ConsumerName | |
| SubscribeToShardEvent.Records | | Count | Multi | StreamName, ConsumerName | |
| SubscribeToShardEvent.Records | | Count | Count | StreamName, ConsumerName | |
| SubscribeToShardEvent.Success | This metric is emitted every time an event is published successfully. Only emitted when there's an active subscription. | Count | Sum | StreamName, ConsumerName | |
| SubscribeToShardEvent.Success | | Count | Multi | StreamName, ConsumerName | |
| SubscribeToShardEvent.Success | | Count | Count | StreamName, ConsumerName | |
| SubscribeToShard.RateExceeded | This metric is emitted when a new subscription attempt fails because there already is an active subscription by the same consumer or if you exceed the number of calls per second allowed for this operation | Count | Minimum | StreamName, ConsumerName | |
| SubscribeToShard.Success | | Count | Minimum | StreamName, ConsumerName | |
| WriteProvisionedThroughputExceeded | The number of records rejected due to throttling for the stream over the specified time period. This metric includes throttling from PutRecord and PutRecords operations. | Count | Sum | StreamName | |
| WriteProvisionedThroughputExceeded | | Count | Multi | StreamName | Applicable |
| WriteProvisionedThroughputExceeded | | Count | Count | StreamName | |
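
The iterator-age metrics in the table above are defined as the difference between the current time and the time the last record read was written to the stream, with 0 meaning the consumer is fully caught up. The following is a minimal Python sketch of that arithmetic only. The function names `iterator_age_ms` and `is_caught_up` are hypothetical, the timestamps are made-up sample values, and clamping negative ages to 0 is an assumption for handling clock skew; none of this is part of Dynatrace or AWS tooling.

```python
from datetime import datetime, timedelta, timezone


def iterator_age_ms(last_record_written_at: datetime, now: datetime) -> int:
    """Age of the last record read: current time minus its write time, in ms."""
    age = (now - last_record_written_at) // timedelta(milliseconds=1)
    # Assumption: clamp negative values (for example, from clock skew) to 0.
    return max(0, age)


def is_caught_up(age_ms: int) -> bool:
    """Per the metric description, 0 means the reader is caught up with the stream."""
    return age_ms == 0


# Illustrative timestamps, not real measurements.
written = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)

age = iterator_age_ms(written, now)
print(age, is_caught_up(age))  # prints: 5000 False
```

A consumer whose maximum iterator age keeps growing over successive periods is falling behind the producers, which is why the description suggests tracking the minimum and maximum statistics.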

Amazon Kinesis Video Streams

StreamName is the main dimension.

| Name | Description | Unit | Statistics | Dimensions | Recommended |
|---|---|---|---|---|---|
| GetHLSMasterPlaylist.Latency | Latency of the GetHLSMasterPlaylist API calls for the given stream name | Milliseconds | Multi | StreamName | |
| GetHLSMasterPlaylist.Requests | Number of GetHLSMasterPlaylist API requests for a given stream | Count | Sum | StreamName | |
| GetHLSMasterPlaylist.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | |
| GetHLSMediaPlaylist.Latency | Latency of the GetHLSMediaPlaylist API calls for the given stream name | Milliseconds | Multi | StreamName | |
| GetHLSMediaPlaylist.Requests | Number of GetHLSMediaPlaylist API requests for a given stream | Count | Sum | StreamName | |
| GetHLSMediaPlaylist.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | |
| GetHLSStreamingSessionURL.Latency | Latency of the GetHLSStreamingSessionURL API calls for the given stream name | Milliseconds | Multi | StreamName | |
| GetHLSStreamingSessionURL.Requests | Number of GetHLSStreamingSessionURL API requests for a given stream | Count | Sum | StreamName | |
| GetHLSStreamingSessionURL.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | |
| GetMP4InitFragment.Latency | Latency of the GetMP4InitFragment API calls for the given stream name | Milliseconds | Multi | StreamName | |
| GetMP4InitFragment.Requests | Number of GetMP4InitFragment API requests for a given stream | Count | Sum | StreamName | |
| GetMP4InitFragment.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | |
| GetMP4MediaFragment.Latency | Latency of the GetMP4MediaFragment API calls for the given stream name | Milliseconds | Multi | StreamName | |
| GetMP4MediaFragment.OutgoingBytes | Total number of bytes sent out from the service as part of the GetMP4MediaFragment API for a given stream | Bytes | Sum | StreamName | |
| GetMP4MediaFragment.Requests | Number of GetMP4MediaFragment API requests for a given stream | Count | Sum | StreamName | |
| GetMP4MediaFragment.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Sum | StreamName | |
| GetMedia.ConnectionErrors | The number of connections that were not successfully established | Count | Sum | StreamName | Applicable |
| GetMediaForFragmentList.OutgoingBytes | Total number of bytes sent out from the service as part of the GetMediaForFragmentList API for a given stream | Bytes | Sum | StreamName | |
| GetMediaForFragmentList.OutgoingFragments | Total number of fragments sent out from the service as part of the GetMediaForFragmentList API for a given stream | Count | Sum | StreamName | |
| GetMediaForFragmentList.OutgoingFrames | Total number of frames sent out from the service as part of the GetMediaForFragmentList API for a given stream | Count | Sum | StreamName | |
| GetMediaForFragmentList.Requests | Number of GetMediaForFragmentList API requests for a given stream | Count | Sum | StreamName | |
| GetMediaForFragmentList.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | |
| GetMedia.MillisBehindNow | Time difference between the current server timestamp and the server timestamp of the last fragment sent | Milliseconds | Multi | StreamName | |
| GetMedia.OutgoingBytes | Total number of bytes sent out from the service as part of the GetMedia API for a given stream | Bytes | Sum | StreamName | |
| GetMedia.OutgoingFragments | Number of fragments sent while doing GetMedia for the stream | Count | Sum | StreamName | |
| GetMedia.OutgoingFrames | Number of frames sent during GetMedia on the given stream | Count | Sum | StreamName | |
| GetMedia.Requests | Number of GetMedia API requests for a given stream | Count | Sum | StreamName | |
| GetMedia.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | Applicable |
| ListFragments.Latency | Latency of the ListFragments API calls for the given stream name | Milliseconds | Multi | StreamName | Applicable |
| PutMedia.ActiveConnections | The total number of connections to the service host | Count | Sum | StreamName | |
| PutMedia.BufferingAckLatency | Time difference between when the first byte of a new fragment is received by Kinesis Video Streams and when the Buffering ACK is sent for the fragment | Milliseconds | Multi | StreamName | |
| PutMedia.ConnectionErrors | Errors while establishing PutMedia connection for the stream | Count | Sum | StreamName | Applicable |
| PutMedia.ErrorAckCount | Number of Error ACKs sent while doing PutMedia for the stream | Count | Sum | StreamName | |
| PutMedia.FragmentIngestionLatency | Time difference between when the first and last bytes of a fragment are received by Kinesis Video Streams | Milliseconds | Multi | StreamName | |
| PutMedia.FragmentPersistLatency | Time taken from when the complete fragment data is received and archived | Milliseconds | Multi | StreamName | |
| PutMedia.IncomingBytes | Number of bytes received as part of PutMedia for the stream | Bytes | Sum | StreamName | |
| PutMedia.IncomingFragments | Number of complete fragments received as part of PutMedia for the stream | Count | Sum | StreamName | |
| PutMedia.IncomingFrames | Number of complete frames received as part of PutMedia for the stream | Count | Sum | StreamName | |
| PutMedia.Latency | Time difference between the request and the HTTP response from InletService while establishing the connection | Milliseconds | Multi | StreamName | |
| PutMedia.PersistedAckLatency | Time difference between when the last byte of a new fragment is received by Kinesis Video Streams and when the Persisted ACK is sent for the fragment | Milliseconds | Multi | StreamName | |
| PutMedia.ReceivedAckLatency | Time difference between when the last byte of a new fragment is received by Kinesis Video Streams and when the Received ACK is sent for the fragment | Milliseconds | Multi | StreamName | |
| PutMedia.Requests | Number of PutMedia API requests for a given stream | Count | Sum | StreamName | |
| PutMedia.Success | The rate of success, 1 being the value for every successful request, and 0 the value for every failure | Count | Average | StreamName | Applicable |
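
The `.Success` metrics above are described as a rate in which every successful request contributes 1 and every failure contributes 0, so the Average statistic over a period is the fraction of requests that succeeded. The sketch below illustrates that calculation in Python. The function name `success_rate` and the sample values are hypothetical and serve only as an illustration; they are not a Dynatrace or AWS API.

```python
from typing import Optional, Sequence


def success_rate(outcomes: Sequence[int]) -> Optional[float]:
    """Average of per-request outcomes, where 1 is a success and 0 is a failure.

    Returns None when there were no requests in the period, since an
    average over zero samples is undefined.
    """
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


# Illustrative sample: three successful requests and one failure.
samples = [1, 1, 1, 0]
print(success_rate(samples))  # prints: 0.75
```

Under this reading, a value of 0.75 means that 75% of the requests in the period succeeded.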