AWS Step Functions monitoring

  • How-to guide
  • 5-min read
  • Published Jul 06, 2020

Dynatrace ingests metrics for multiple preselected namespaces, including AWS Step Functions. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.

Prerequisites

To enable monitoring for this service, you need

  • ActiveGate version 1.197+
  • For Dynatrace SaaS deployments, you need an Environment ActiveGate or a Multi-environment ActiveGate.

    For role-based access in SaaS deployment, you need an Environment ActiveGate installed on an Amazon EC2 host.

To update the AWS IAM policy, use the JSON below, containing the monitoring policy (permissions) for all supporting services.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "acm-pca:ListCertificateAuthorities",
        "apigateway:GET",
        "apprunner:ListServices",
        "appstream:DescribeFleets",
        "appsync:ListGraphqlApis",
        "athena:ListWorkGroups",
        "autoscaling:DescribeAutoScalingGroups",
        "cloudformation:ListStackResources",
        "cloudfront:ListDistributions",
        "cloudhsm:DescribeClusters",
        "cloudsearch:DescribeDomains",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "codebuild:ListProjects",
        "datasync:ListTasks",
        "dax:DescribeClusters",
        "directconnect:DescribeConnections",
        "dms:DescribeReplicationInstances",
        "dynamodb:ListTables",
        "dynamodb:ListTagsOfResource",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
        "ec2:DescribeNatGateways",
        "ec2:DescribeSpotFleetRequests",
        "ec2:DescribeTransitGateways",
        "ec2:DescribeVolumes",
        "ec2:DescribeVpnConnections",
        "ecs:ListClusters",
        "eks:ListClusters",
        "elasticache:DescribeCacheClusters",
        "elasticbeanstalk:DescribeEnvironmentResources",
        "elasticbeanstalk:DescribeEnvironments",
        "elasticfilesystem:DescribeFileSystems",
        "elasticloadbalancing:DescribeInstanceHealth",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeRules",
        "elasticloadbalancing:DescribeTags",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticmapreduce:ListClusters",
        "elastictranscoder:ListPipelines",
        "es:ListDomainNames",
        "events:ListEventBuses",
        "firehose:ListDeliveryStreams",
        "fsx:DescribeFileSystems",
        "gamelift:ListFleets",
        "glue:GetJobs",
        "inspector:ListAssessmentTemplates",
        "kafka:ListClusters",
        "kinesis:ListStreams",
        "kinesisanalytics:ListApplications",
        "kinesisvideo:ListStreams",
        "lambda:ListFunctions",
        "lambda:ListTags",
        "lex:GetBots",
        "logs:DescribeLogGroups",
        "mediaconnect:ListFlows",
        "mediaconvert:DescribeEndpoints",
        "mediapackage-vod:ListPackagingConfigurations",
        "mediapackage:ListChannels",
        "mediatailor:ListPlaybackConfigurations",
        "opsworks:DescribeStacks",
        "qldb:ListLedgers",
        "rds:DescribeDBClusters",
        "rds:DescribeDBInstances",
        "rds:DescribeEvents",
        "rds:ListTagsForResource",
        "redshift:DescribeClusters",
        "robomaker:ListSimulationJobs",
        "route53:ListHostedZones",
        "route53resolver:ListResolverEndpoints",
        "s3:ListAllMyBuckets",
        "sagemaker:ListEndpoints",
        "sns:ListTopics",
        "sqs:ListQueues",
        "storagegateway:ListGateways",
        "sts:GetCallerIdentity",
        "swf:ListDomains",
        "tag:GetResources",
        "tag:GetTagKeys",
        "transfer:ListServers",
        "workmail:ListOrganizations",
        "workspaces:DescribeWorkspaces"
      ],
      "Resource": "*"
    }
  ]
}

If you don't want to grant permissions for all services and prefer to select permissions only for certain services, consult the table below. The table lists the permissions that are required for all AWS cloud services and, for each supporting service, the optional permissions specific to that service.

Permissions required for AWS monitoring integration:
  • "cloudwatch:GetMetricData"
  • "cloudwatch:GetMetricStatistics"
  • "cloudwatch:ListMetrics"
  • "sts:GetCallerIdentity"
  • "tag:GetResources"
  • "tag:GetTagKeys"
  • "ec2:DescribeAvailabilityZones"
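
As a rough aid, the required permissions listed above can be checked against an existing policy document before you save it. The following is a minimal Python sketch, not part of Dynatrace or AWS tooling; the function name `missing_required_actions` is hypothetical, and the check assumes actions are listed literally (wildcards such as `cloudwatch:*` are not expanded).

```python
import json

# Required actions, copied from the list above.
REQUIRED_ACTIONS = {
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
}


def missing_required_actions(policy_json: str) -> set:
    """Return the required actions that no Allow statement lists literally.

    Simplification (assumption of this sketch): actions are compared as
    exact strings, so wildcard entries are not expanded.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is valid IAM JSON
        statements = [statements]

    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action string is valid IAM JSON
            actions = [actions]
        allowed.update(actions)

    return REQUIRED_ACTIONS - allowed
```

An empty result means every required action appears in the policy; any returned names are the ones still to be added.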
| Name | Permissions |
| --- | --- |
| All monitored Amazon services (required) | cloudwatch:GetMetricData, cloudwatch:GetMetricStatistics, cloudwatch:ListMetrics, sts:GetCallerIdentity, tag:GetResources, tag:GetTagKeys, ec2:DescribeAvailabilityZones |
| AWS Certificate Manager Private Certificate Authority | acm-pca:ListCertificateAuthorities |
| Amazon MQ | |
| Amazon API Gateway | apigateway:GET |
| AWS App Runner | apprunner:ListServices |
| Amazon AppStream | appstream:DescribeFleets |
| AWS AppSync | appsync:ListGraphqlApis |
| Amazon Athena | athena:ListWorkGroups |
| Amazon Aurora | rds:DescribeDBClusters |
| Amazon EC2 Auto Scaling | autoscaling:DescribeAutoScalingGroups |
| Amazon EC2 Auto Scaling (built-in) | autoscaling:DescribeAutoScalingGroups |
| AWS Billing | |
| Amazon Keyspaces | |
| AWS Chatbot | |
| Amazon CloudFront | cloudfront:ListDistributions |
| AWS CloudHSM | cloudhsm:DescribeClusters |
| Amazon CloudSearch | cloudsearch:DescribeDomains |
| AWS CodeBuild | codebuild:ListProjects |
| Amazon Cognito | |
| Amazon Connect | |
| Amazon Elastic Kubernetes Service (EKS) | eks:ListClusters |
| AWS DataSync | datasync:ListTasks |
| Amazon DynamoDB Accelerator (DAX) | dax:DescribeClusters |
| AWS Database Migration Service (AWS DMS) | dms:DescribeReplicationInstances |
| Amazon DocumentDB | rds:DescribeDBClusters |
| AWS Direct Connect | directconnect:DescribeConnections |
| Amazon DynamoDB | dynamodb:ListTables |
| Amazon DynamoDB (built-in) | dynamodb:ListTables, dynamodb:ListTagsOfResource |
| Amazon EBS | ec2:DescribeVolumes |
| Amazon EBS (built-in) | ec2:DescribeVolumes |
| Amazon EC2 API | |
| Amazon EC2 (built-in) | ec2:DescribeInstances |
| Amazon EC2 Spot Fleet | ec2:DescribeSpotFleetRequests |
| Amazon Elastic Container Service (ECS) | ecs:ListClusters |
| Amazon ECS Container Insights | ecs:ListClusters |
| Amazon ElastiCache (EC) | elasticache:DescribeCacheClusters |
| AWS Elastic Beanstalk | elasticbeanstalk:DescribeEnvironments |
| Amazon Elastic File System (EFS) | elasticfilesystem:DescribeFileSystems |
| Amazon Elastic Inference | |
| Amazon Elastic Map Reduce (EMR) | elasticmapreduce:ListClusters |
| Amazon Elasticsearch Service (ES) | es:ListDomainNames |
| Amazon Elastic Transcoder | elastictranscoder:ListPipelines |
| Amazon Elastic Load Balancer (ELB) (built-in) | elasticloadbalancing:DescribeInstanceHealth, elasticloadbalancing:DescribeListeners, elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeRules, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeTargetHealth |
| Amazon EventBridge | events:ListEventBuses |
| Amazon FSx | fsx:DescribeFileSystems |
| Amazon GameLift | gamelift:ListFleets |
| AWS Glue | glue:GetJobs |
| Amazon Inspector | inspector:ListAssessmentTemplates |
| AWS Internet of Things (IoT) | |
| AWS IoT Analytics | |
| Amazon Managed Streaming for Kafka | kafka:ListClusters |
| Amazon Kinesis Data Analytics | kinesisanalytics:ListApplications |
| Amazon Data Firehose | firehose:ListDeliveryStreams |
| Amazon Kinesis Data Streams | kinesis:ListStreams |
| Amazon Kinesis Video Streams | kinesisvideo:ListStreams |
| AWS Lambda | lambda:ListFunctions |
| AWS Lambda (built-in) | lambda:ListFunctions, lambda:ListTags |
| Amazon Lex | lex:GetBots |
| Amazon Application and Network Load Balancer (built-in) | elasticloadbalancing:DescribeInstanceHealth, elasticloadbalancing:DescribeListeners, elasticloadbalancing:DescribeLoadBalancers, elasticloadbalancing:DescribeRules, elasticloadbalancing:DescribeTags, elasticloadbalancing:DescribeTargetHealth |
| Amazon CloudWatch Logs | logs:DescribeLogGroups |
| AWS Elemental MediaConnect | mediaconnect:ListFlows |
| AWS Elemental MediaConvert | mediaconvert:DescribeEndpoints |
| AWS Elemental MediaPackage Live | mediapackage:ListChannels |
| AWS Elemental MediaPackage Video on Demand | mediapackage-vod:ListPackagingConfigurations |
| AWS Elemental MediaTailor | mediatailor:ListPlaybackConfigurations |
| Amazon VPC NAT Gateways | ec2:DescribeNatGateways |
| Amazon Neptune | rds:DescribeDBClusters |
| AWS OpsWorks | opsworks:DescribeStacks |
| Amazon Polly | |
| Amazon QLDB | qldb:ListLedgers |
| Amazon RDS | rds:DescribeDBInstances |
| Amazon RDS (built-in) | rds:DescribeDBInstances, rds:DescribeEvents, rds:ListTagsForResource |
| Amazon Redshift | redshift:DescribeClusters |
| Amazon Rekognition | |
| AWS RoboMaker | robomaker:ListSimulationJobs |
| Amazon Route 53 | route53:ListHostedZones |
| Amazon Route 53 Resolver | route53resolver:ListResolverEndpoints |
| Amazon S3 | s3:ListAllMyBuckets |
| Amazon S3 (built-in) | s3:ListAllMyBuckets |
| Amazon SageMaker Batch Transform Jobs | |
| Amazon SageMaker Endpoint Instances | sagemaker:ListEndpoints |
| Amazon SageMaker Endpoints | sagemaker:ListEndpoints |
| Amazon SageMaker Ground Truth | |
| Amazon SageMaker Processing Jobs | |
| Amazon SageMaker Training Jobs | |
| AWS Service Catalog | |
| Amazon Simple Email Service (SES) | |
| Amazon Simple Notification Service (SNS) | sns:ListTopics |
| Amazon Simple Queue Service (SQS) | sqs:ListQueues |
| AWS Systems Manager - Run Command | |
| AWS Step Functions | |
| AWS Storage Gateway | storagegateway:ListGateways |
| Amazon SWF | swf:ListDomains |
| Amazon Textract | |
| AWS IoT Things Graph | |
| AWS Transfer Family | transfer:ListServers |
| AWS Transit Gateway | ec2:DescribeTransitGateways |
| Amazon Translate | |
| AWS Trusted Advisor | |
| AWS API Usage | |
| AWS Site-to-Site VPN | ec2:DescribeVpnConnections |
| AWS WAF Classic | |
| AWS WAF | |
| Amazon WorkMail | workmail:ListOrganizations |
| Amazon WorkSpaces | workspaces:DescribeWorkspaces |

Example of a JSON policy for a single service:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "apigateway:GET",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "sts:GetCallerIdentity",
        "tag:GetResources",
        "tag:GetTagKeys",
        "ec2:DescribeAvailabilityZones"
      ],
      "Resource": "*"
    }
  ]
}

In this example, from the complete list of permissions you need to select

  • "apigateway:GET" for Amazon API Gateway
  • "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones", which are required for all AWS cloud services.
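
The selection described above can also be scripted. The sketch below is illustrative only: it assembles a policy document from the permissions required for all services plus the service-specific permissions of the services you choose. The `SERVICE_ACTIONS` mapping is a small hand-copied excerpt of the permissions table, and the names `BASE_ACTIONS`, `SERVICE_ACTIONS`, and `build_policy` are hypothetical helpers, not a Dynatrace or AWS API.

```python
import json

# Permissions required for all AWS cloud services (from the table above).
BASE_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]

# Excerpt of the per-service permissions table; extend it as needed.
SERVICE_ACTIONS = {
    "Amazon API Gateway": ["apigateway:GET"],
    "AWS Lambda": ["lambda:ListFunctions"],
    "AWS Step Functions": [],  # the table lists no service-specific permission
}


def build_policy(services):
    """Build an IAM policy dict for the given service names."""
    actions = []
    for service in services:
        actions.extend(SERVICE_ACTIONS[service])  # KeyError for unknown names
    actions.extend(BASE_ACTIONS)
    unique_actions = list(dict.fromkeys(actions))  # de-duplicate, keep order
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": unique_actions,
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # For AWS Step Functions, only the base permissions end up in the policy.
    print(json.dumps(build_policy(["AWS Step Functions"]), indent=2))
```

Calling `build_policy(["Amazon API Gateway"])` yields the same action list as the single-service example above.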

Enable monitoring

To learn how to enable service monitoring, see Enable service monitoring.

View service metrics

You can view the service metrics in your Dynatrace environment either on the custom device overview page or on your Dashboards page.

View metrics on the custom device overview page

To access the custom device overview page

  1. Go to Technologies & Processes Classic.
  2. Filter by service name and select the relevant custom device group.
  3. Once you select the custom device group, you're on the custom device group overview page.
  4. The custom device group overview page lists all instances (custom devices) belonging to the group. Select an instance to view the custom device overview page.

View metrics on your dashboard

After you add the service to monitoring, a preset dashboard containing all recommended metrics is automatically listed on your Dashboards page. To look for specific dashboards, filter by Preset and then by Name.

For existing monitored services, you might need to resave your credentials for the preset dashboard to appear on the Dashboards page. To resave your credentials, go to Settings > Cloud and virtualization > AWS, select the desired AWS instance, and then select Save.

You can't make changes on a preset dashboard directly, but you can clone and edit it. To clone a dashboard, open the browse menu and select Clone.

To remove a dashboard from the dashboards page, you can hide it. To hide a dashboard, open the browse menu and select Hide.

Hiding a dashboard doesn't affect other users.

To check the availability of preset dashboards for each AWS service, see the list below.

| AWS service | Preset dashboard |
| --- | --- |
| Amazon EC2 Auto Scaling (built-in) | Not applicable |
| AWS Lambda (built-in) | Not applicable |
| Amazon Application and Network Load Balancer (built-in) | Not applicable |
| Amazon DynamoDB (built-in) | Not applicable |
| Amazon EBS (built-in) | Not applicable |
| Amazon EC2 (built-in) | Not applicable |
| Amazon Elastic Load Balancer (ELB) (built-in) | Not applicable |
| Amazon RDS (built-in) | Not applicable |
| Amazon S3 (built-in) | Not applicable |
| AWS Certificate Manager Private Certificate Authority | Not applicable |
| All monitored Amazon services | Not applicable |
| Amazon API Gateway | Not applicable |
| AWS App Runner | Not applicable |
| Amazon AppStream | Applicable |
| AWS AppSync | Applicable |
| Amazon Athena | Applicable |
| Amazon Aurora | Not applicable |
| Amazon EC2 Auto Scaling | Applicable |
| AWS Billing | Applicable |
| Amazon Keyspaces | Applicable |
| AWS Chatbot | Applicable |
| Amazon CloudFront | Not applicable |
| AWS CloudHSM | Applicable |
| Amazon CloudSearch | Applicable |
| AWS CodeBuild | Applicable |
| Amazon Cognito | Not applicable |
| Amazon Connect | Applicable |
| AWS DataSync | Applicable |
| Amazon DynamoDB Accelerator (DAX) | Applicable |
| AWS Database Migration Service (AWS DMS) | Applicable |
| Amazon DocumentDB | Applicable |
| AWS Direct Connect | Applicable |
| Amazon DynamoDB | Not applicable |
| Amazon EBS | Not applicable |
| Amazon EC2 Spot Fleet | Not applicable |
| Amazon EC2 API | Applicable |
| Amazon Elastic Container Service (ECS) | Not applicable |
| Amazon ECS Container Insights | Applicable |
| Amazon Elastic File System (EFS) | Not applicable |
| Amazon Elastic Kubernetes Service (EKS) | Applicable |
| Amazon ElastiCache (EC) | Not applicable |
| AWS Elastic Beanstalk | Applicable |
| Amazon Elastic Inference | Applicable |
| Amazon Elastic Transcoder | Applicable |
| Amazon Elastic Map Reduce (EMR) | Not applicable |
| Amazon Elasticsearch Service (ES) | Not applicable |
| Amazon EventBridge | Applicable |
| Amazon FSx | Applicable |
| Amazon GameLift | Applicable |
| AWS Glue | Not applicable |
| Amazon Inspector | Applicable |
| AWS Internet of Things (IoT) | Not applicable |
| AWS IoT Things Graph | Applicable |
| AWS IoT Analytics | Applicable |
| Amazon Managed Streaming for Kafka | Applicable |
| Amazon Kinesis Data Analytics | Not applicable |
| Amazon Data Firehose | Not applicable |
| Amazon Kinesis Data Streams | Not applicable |
| Amazon Kinesis Video Streams | Not applicable |
| AWS Lambda | Not applicable |
| Amazon Lex | Applicable |
| Amazon CloudWatch Logs | Applicable |
| AWS Elemental MediaTailor | Applicable |
| AWS Elemental MediaConnect | Applicable |
| AWS Elemental MediaConvert | Applicable |
| AWS Elemental MediaPackage Live | Applicable |
| AWS Elemental MediaPackage Video on Demand | Applicable |
| Amazon MQ | Applicable |
| Amazon VPC NAT Gateways | Not applicable |
| Amazon Neptune | Applicable |
| AWS OpsWorks | Applicable |
| Amazon Polly | Applicable |
| Amazon QLDB | Applicable |
| Amazon RDS | Not applicable |
| Amazon Redshift | Not applicable |
| Amazon Rekognition | Applicable |
| AWS RoboMaker | Applicable |
| Amazon Route 53 | Applicable |
| Amazon Route 53 Resolver | Applicable |
| Amazon S3 | Not applicable |
| Amazon SageMaker Batch Transform Jobs | Not applicable |
| Amazon SageMaker Endpoints | Not applicable |
| Amazon SageMaker Endpoint Instances | Not applicable |
| Amazon SageMaker Ground Truth | Not applicable |
| Amazon SageMaker Processing Jobs | Not applicable |
| Amazon SageMaker Training Jobs | Not applicable |
| AWS Service Catalog | Applicable |
| Amazon Simple Email Service (SES) | Not applicable |
| Amazon Simple Notification Service (SNS) | Not applicable |
| Amazon Simple Queue Service (SQS) | Not applicable |
| AWS Systems Manager - Run Command | Applicable |
| AWS Step Functions | Applicable |
| AWS Storage Gateway | Applicable |
| Amazon SWF | Applicable |
| Amazon Textract | Applicable |
| AWS Transfer Family | Applicable |
| AWS Transit Gateway | Applicable |
| Amazon Translate | Applicable |
| AWS Trusted Advisor | Applicable |
| AWS API Usage | Applicable |
| AWS Site-to-Site VPN | Applicable |
| AWS WAF Classic | Applicable |
| AWS WAF | Applicable |
| Amazon WorkMail | Applicable |
| Amazon WorkSpaces | Applicable |

Available metrics

| Name | Description | Unit | Statistics | Dimensions | Recommended |
| --- | --- | --- | --- | --- | --- |
| ActivitiesFailed | The number of failed activities | Count | Sum | Region, ActivityArn | Applicable |
| ActivitiesHeartbeatTimedOut | The number of activities that time out due to a heartbeat timeout | Count | Sum | Region, ActivityArn | Applicable |
| ActivitiesScheduled | The number of scheduled activities | Count | Sum | Region, ActivityArn | Applicable |
| ActivitiesStarted | The number of started activities | Count | Sum | Region, ActivityArn | |
| ActivitiesSucceeded | The number of successfully completed activities | Count | Sum | Region, ActivityArn | Applicable |
| ActivitiesTimedOut | The number of activities that time out on close | Count | Sum | Region, ActivityArn | Applicable |
| ActivityRunTime | The interval, in milliseconds, between the time the activity starts and the time it closes | Milliseconds | Multi | Region, ActivityArn | Applicable |
| ActivityScheduleTime | The interval, in milliseconds, for which the activity stays in the schedule state | Milliseconds | Multi | Region, ActivityArn | |
| ActivityTime | The interval, in milliseconds, between the time the activity is scheduled and the time it closes | Milliseconds | Multi | Region, ActivityArn | |
| ConsumedCapacity | The count of requests per second | Count | Sum | Region, ServiceMetric | Applicable |
| ConsumedCapacity | | Count | Sum | Region, APIName | Applicable |
| ExecutionThrottled | The number of StateEntered events and retries that have been throttled | Count | Sum | Region, StateMachineArn | Applicable |
| ExecutionTime | The interval, in milliseconds, between the time the execution starts and the time it closes | Milliseconds | Multi | Region, StateMachineArn | Applicable |
| ExecutionsAborted | The number of aborted or terminated executions | Count | Sum | Region, StateMachineArn | Applicable |
| ExecutionsFailed | The number of failed executions | Count | Sum | Region, StateMachineArn | Applicable |
| ExecutionsStarted | The number of started executions | Count | Sum | Region, StateMachineArn | Applicable |
| ExecutionsSucceeded | The number of successfully completed executions | Count | Sum | Region, StateMachineArn | Applicable |
| ExecutionsTimedOut | The number of executions that time out for any reason | Count | Sum | Region, StateMachineArn | Applicable |
| LambdaFunctionRunTime | The interval, in milliseconds, between the time the Lambda function starts and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | Applicable |
| LambdaFunctionScheduleTime | The interval, in milliseconds, for which the Lambda function stays in the schedule state | Milliseconds | Multi | Region, LambdaFunctionArn | |
| LambdaFunctionTime | The interval, in milliseconds, between the time the Lambda function is scheduled and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | |
| LambdaFunctionsFailed | The number of failed Lambda functions | Count | Sum | Region, LambdaFunctionArn | Applicable |
| LambdaFunctionsScheduled | The number of scheduled Lambda functions | Count | Sum | Region, LambdaFunctionArn | Applicable |
| LambdaFunctionsStarted | The number of started Lambda functions | Count | Sum | Region, LambdaFunctionArn | |
| LambdaFunctionsSucceeded | The number of successfully completed Lambda functions | Count | Sum | Region, LambdaFunctionArn | Applicable |
| LambdaFunctionsTimedOut | The number of Lambda functions that time out on close | Count | Sum | Region, LambdaFunctionArn | Applicable |
| ProvisionedBucketSize | The count of available requests per second | Count | Multi | Region, ServiceMetric | |
| ProvisionedBucketSize | | Count | Multi | Region, APIName | |
| ProvisionedRefillRate | The count of requests per second that are allowed into the bucket | Count | Multi | Region, ServiceMetric | |
| ProvisionedRefillRate | | Count | Multi | Region, APIName | |
| ServiceIntegrationRunTime | The interval, in milliseconds, between the time the service task starts and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | Applicable |
| ServiceIntegrationScheduleTime | The interval, in milliseconds, for which the service task stays in the schedule state | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationTime | The interval, in milliseconds, between the time the service task is scheduled and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationsFailed | The number of failed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | Applicable |
| ServiceIntegrationsScheduled | The number of scheduled service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | Applicable |
| ServiceIntegrationsStarted | The number of started service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationsSucceeded | The number of successfully completed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | Applicable |
| ServiceIntegrationsTimedOut | The number of service tasks that time out on close | Count | Sum | Region, ServiceIntegrationResourceArn | Applicable |
| ThrottledEvents | The count of requests that have been throttled | Count | Sum | Region, ServiceMetric | Applicable |
| ThrottledEvents | | Count | Sum | Region, APIName | Applicable |
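
To illustrate how a row of the metrics table relates to the `cloudwatch:GetMetricData` permission, the sketch below builds one query entry in the shape that the CloudWatch GetMetricData API expects. It is not how Dynatrace collects the data internally; it only shows the mapping of metric name, statistic, and dimension. Assumptions not taken from this page: the CloudWatch namespace `AWS/States` comes from AWS documentation, the helper name `metric_query` and the ARN are hypothetical, and the Region dimension is treated as the region of the CloudWatch endpoint rather than as a query dimension.

```python
from typing import Any, Dict

# CloudWatch namespace for AWS Step Functions (from AWS documentation,
# not stated on this page).
STEP_FUNCTIONS_NAMESPACE = "AWS/States"


def metric_query(query_id: str, metric_name: str, dimension_name: str,
                 dimension_value: str, stat: str = "Sum",
                 period_seconds: int = 300) -> Dict[str, Any]:
    """Build one MetricDataQueries entry for CloudWatch GetMetricData.

    query_id must start with a lowercase letter, as CloudWatch requires.
    """
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": STEP_FUNCTIONS_NAMESPACE,
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": dimension_name, "Value": dimension_value}
                ],
            },
            "Period": period_seconds,  # aggregation window in seconds
            "Stat": stat,              # for example "Sum" for count metrics
        },
        "ReturnData": True,
    }


# Hypothetical state machine ARN, used only for illustration.
EXAMPLE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:example"

failed_executions = metric_query(
    "failed", "ExecutionsFailed", "StateMachineArn", EXAMPLE_ARN
)
```

If the boto3 library is installed and credentials are configured, a list of such entries can be passed as `MetricDataQueries` to `get_metric_data` on a CloudWatch client, together with `StartTime` and `EndTime`.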

Limitations

Dynatrace gathers metrics for AWS Step Functions at the custom device group level instead of the custom device level (metrics are service-wide).
