Amazon SageMaker (Batch Transform Jobs, Endpoint Instances, Endpoints, Ground Truth, Processing Jobs, Training Jobs) monitoring

Dynatrace ingests metrics for multiple preselected namespaces, including Amazon SageMaker. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.

Prerequisites

To enable monitoring for this service, you need:

- ActiveGate version 1.181+, as follows:
  - For Dynatrace SaaS deployments, you need an Environment ActiveGate or a Multi-environment ActiveGate.
  - For Dynatrace Managed deployments, you can use any kind of ActiveGate.

  For role-based access (whether in a SaaS or Managed deployment), you need an Environment ActiveGate installed on an Amazon EC2 host.
- Dynatrace version 1.182+
- An updated AWS monitoring policy that includes the additional AWS services.

To update the AWS IAM policy, use the JSON below, which contains the monitoring policy (permissions) for all supporting services.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "acm-pca:ListCertificateAuthorities",
        "apigateway:GET",
        "apprunner:ListServices",
        "appstream:DescribeFleets",
        "appsync:ListGraphqlApis",
        "athena:ListWorkGroups",
        "autoscaling:DescribeAutoScalingGroups",
        "cloudformation:ListStackResources",
        "cloudfront:ListDistributions",
        "cloudhsm:DescribeClusters",
        "cloudsearch:DescribeDomains",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "codebuild:ListProjects",
        "datasync:ListTasks",
        "dax:DescribeClusters",
        "directconnect:DescribeConnections",
        "dms:DescribeReplicationInstances",
        "dynamodb:ListTables",
        "dynamodb:ListTagsOfResource",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
        "ec2:DescribeNatGateways",
        "ec2:DescribeSpotFleetRequests",
        "ec2:DescribeTransitGateways",
        "ec2:DescribeVolumes",
        "ec2:DescribeVpnConnections",
        "ecs:ListClusters",
        "eks:ListClusters",
        "elasticache:DescribeCacheClusters",
        "elasticbeanstalk:DescribeEnvironmentResources",
        "elasticbeanstalk:DescribeEnvironments",
        "elasticfilesystem:DescribeFileSystems",
        "elasticloadbalancing:DescribeInstanceHealth",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeRules",
        "elasticloadbalancing:DescribeTags",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticmapreduce:ListClusters",
        "elastictranscoder:ListPipelines",
        "es:ListDomainNames",
        "events:ListEventBuses",
        "firehose:ListDeliveryStreams",
        "fsx:DescribeFileSystems",
        "gamelift:ListFleets",
        "glue:GetJobs",
        "inspector:ListAssessmentTemplates",
        "kafka:ListClusters",
        "kinesis:ListStreams",
        "kinesisanalytics:ListApplications",
        "kinesisvideo:ListStreams",
        "lambda:ListFunctions",
        "lambda:ListTags",
        "lex:GetBots",
        "logs:DescribeLogGroups",
        "mediaconnect:ListFlows",
        "mediaconvert:DescribeEndpoints",
        "mediapackage-vod:ListPackagingConfigurations",
        "mediapackage:ListChannels",
        "mediatailor:ListPlaybackConfigurations",
        "opsworks:DescribeStacks",
        "qldb:ListLedgers",
        "rds:DescribeDBClusters",
        "rds:DescribeDBInstances",
        "rds:DescribeEvents",
        "rds:ListTagsForResource",
        "redshift:DescribeClusters",
        "robomaker:ListSimulationJobs",
        "route53:ListHostedZones",
        "route53resolver:ListResolverEndpoints",
        "s3:ListAllMyBuckets",
        "sagemaker:ListEndpoints",
        "sns:ListTopics",
        "sqs:ListQueues",
        "storagegateway:ListGateways",
        "sts:GetCallerIdentity",
        "swf:ListDomains",
        "tag:GetResources",
        "tag:GetTagKeys",
        "transfer:ListServers",
        "workmail:ListOrganizations",
        "workspaces:DescribeWorkspaces"
      ],
      "Resource": "*"
    }
  ]
}

If you don't want to add permissions for all services and prefer to select permissions for certain services only, consult the table below. The table contains the set of permissions required for all AWS cloud services and, for each supporting service, a list of optional permissions specific to that service.

Permissions required for AWS monitoring integration:

"cloudwatch:GetMetricData"
"cloudwatch:GetMetricStatistics"
"cloudwatch:ListMetrics"
"sts:GetCallerIdentity"
"tag:GetResources"
"tag:GetTagKeys"
"ec2:DescribeAvailabilityZones"

Name

Permissions

All monitored Amazon services required

cloudwatch:GetMetricData,
cloudwatch:GetMetricStatistics,
cloudwatch:ListMetrics,
sts:GetCallerIdentity,
tag:GetResources,
tag:GetTagKeys,
ec2:DescribeAvailabilityZones

AWS Certificate Manager Private Certificate Authority

acm-pca:ListCertificateAuthorities

Amazon MQ

Amazon API Gateway

apigateway:GET

AWS App Runner

apprunner:ListServices

Amazon AppStream

appstream:DescribeFleets

AWS AppSync

appsync:ListGraphqlApis

Amazon Athena

athena:ListWorkGroups

Amazon Aurora

rds:DescribeDBClusters

Amazon EC2 Auto Scaling

autoscaling:DescribeAutoScalingGroups

Amazon EC2 Auto Scaling (built-in)

autoscaling:DescribeAutoScalingGroups

AWS Billing

Amazon Keyspaces

AWS Chatbot

Amazon CloudFront

cloudfront:ListDistributions

AWS CloudHSM

cloudhsm:DescribeClusters

Amazon CloudSearch

cloudsearch:DescribeDomains

AWS CodeBuild

codebuild:ListProjects

Amazon Cognito

Amazon Connect

Amazon Elastic Kubernetes Service (EKS)

eks:ListClusters

AWS DataSync

datasync:ListTasks

Amazon DynamoDB Accelerator (DAX)

dax:DescribeClusters

AWS Database Migration Service (AWS DMS)

dms:DescribeReplicationInstances

Amazon DocumentDB

rds:DescribeDBClusters

AWS Direct Connect

directconnect:DescribeConnections

Amazon DynamoDB

dynamodb:ListTables

Amazon DynamoDB (built-in)

dynamodb:ListTables,
dynamodb:ListTagsOfResource

Amazon EBS

ec2:DescribeVolumes

Amazon EBS (built-in)

ec2:DescribeVolumes

Amazon EC2 API

Amazon EC2 (built-in)

ec2:DescribeInstances

Amazon EC2 Spot Fleet

ec2:DescribeSpotFleetRequests

Amazon Elastic Container Service (ECS)

ecs:ListClusters

Amazon ECS Container Insights

ecs:ListClusters

Amazon ElastiCache (EC)

elasticache:DescribeCacheClusters

AWS Elastic Beanstalk

elasticbeanstalk:DescribeEnvironments

Amazon Elastic File System (EFS)

elasticfilesystem:DescribeFileSystems

Amazon Elastic Inference

Amazon Elastic Map Reduce (EMR)

elasticmapreduce:ListClusters

Amazon Elasticsearch Service (ES)

es:ListDomainNames

Amazon Elastic Transcoder

elastictranscoder:ListPipelines

Amazon Elastic Load Balancer (ELB) (built-in)

elasticloadbalancing:DescribeInstanceHealth,
elasticloadbalancing:DescribeListeners,
elasticloadbalancing:DescribeLoadBalancers,
elasticloadbalancing:DescribeRules,
elasticloadbalancing:DescribeTags,
elasticloadbalancing:DescribeTargetHealth

Amazon EventBridge

events:ListEventBuses

Amazon FSx

fsx:DescribeFileSystems

Amazon GameLift

gamelift:ListFleets

AWS Glue

glue:GetJobs

Amazon Inspector

inspector:ListAssessmentTemplates

AWS Internet of Things (IoT)

AWS IoT Analytics

Amazon Managed Streaming for Kafka

kafka:ListClusters

Amazon Kinesis Data Analytics

kinesisanalytics:ListApplications

Amazon Data Firehose

firehose:ListDeliveryStreams

Amazon Kinesis Data Streams

kinesis:ListStreams

Amazon Kinesis Video Streams

kinesisvideo:ListStreams

AWS Lambda

lambda:ListFunctions

AWS Lambda (built-in)

lambda:ListFunctions,
lambda:ListTags

Amazon Lex

lex:GetBots

Amazon Application and Network Load Balancer (built-in)

Amazon CloudWatch Logs

logs:DescribeLogGroups

AWS Elemental MediaConnect

mediaconnect:ListFlows

AWS Elemental MediaConvert

mediaconvert:DescribeEndpoints

AWS Elemental MediaPackage Live

mediapackage:ListChannels

AWS Elemental MediaPackage Video on Demand

mediapackage-vod:ListPackagingConfigurations

AWS Elemental MediaTailor

mediatailor:ListPlaybackConfigurations

Amazon VPC NAT Gateways

ec2:DescribeNatGateways

Amazon Neptune

rds:DescribeDBClusters

AWS OpsWorks

opsworks:DescribeStacks

Amazon Polly

Amazon QLDB

qldb:ListLedgers

Amazon RDS

rds:DescribeDBInstances

Amazon RDS (built-in)

rds:DescribeDBInstances,
rds:DescribeEvents,
rds:ListTagsForResource

Amazon Redshift

redshift:DescribeClusters

Amazon Rekognition

AWS RoboMaker

robomaker:ListSimulationJobs

Amazon Route 53

route53:ListHostedZones

Amazon Route 53 Resolver

route53resolver:ListResolverEndpoints

Amazon S3

s3:ListAllMyBuckets

Amazon S3 (built-in)

s3:ListAllMyBuckets

Amazon SageMaker Batch Transform Jobs

Amazon SageMaker Endpoint Instances

sagemaker:ListEndpoints

Amazon SageMaker Endpoints

sagemaker:ListEndpoints

Amazon SageMaker Ground Truth

Amazon SageMaker Processing Jobs

Amazon SageMaker Training Jobs

AWS Service Catalog

Amazon Simple Email Service (SES)

Amazon Simple Notification Service (SNS)

sns:ListTopics

Amazon Simple Queue Service (SQS)

sqs:ListQueues

AWS Systems Manager - Run Command

AWS Step Functions

AWS Storage Gateway

storagegateway:ListGateways

Amazon SWF

swf:ListDomains

Amazon Textract

AWS IoT Things Graph

AWS Transfer Family

transfer:ListServers

AWS Transit Gateway

ec2:DescribeTransitGateways

Amazon Translate

AWS Trusted Advisor

AWS API Usage

AWS Site-to-Site VPN

ec2:DescribeVpnConnections

AWS WAF Classic

AWS WAF

Amazon WorkMail

workmail:ListOrganizations

Amazon WorkSpaces

workspaces:DescribeWorkspaces

Example of a JSON policy for a single service:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "apigateway:GET",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "sts:GetCallerIdentity",
        "tag:GetResources",
        "tag:GetTagKeys",
        "ec2:DescribeAvailabilityZones"
      ],
      "Resource": "*"
    }
  ]
}

In this example, you need to select the following from the complete list of permissions:

- "apigateway:GET" for Amazon API Gateway
- "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for all AWS cloud services

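The same selection can be applied to Amazon SageMaker. The following is a minimal Python sketch, not an official tool: it combines the permissions listed as required for all monitored services with the one permission the table lists for SageMaker Endpoints and SageMaker Endpoint Instances (sagemaker:ListEndpoints). The names BASE_ACTIONS and build_policy are hypothetical and exist only for this illustration.

```python
import json

# Permissions the table lists as required for all monitored AWS services.
BASE_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]


def build_policy(service_actions):
    """Return an IAM policy document (as a dict) that combines the
    service-specific permissions with the base ones, without duplicates."""
    # dict.fromkeys keeps the first occurrence of each action, in order.
    actions = list(dict.fromkeys(list(service_actions) + BASE_ACTIONS))
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": actions,
                "Resource": "*",
            }
        ],
    }


# sagemaker:ListEndpoints is the permission the table lists for
# SageMaker Endpoints and SageMaker Endpoint Instances.
sagemaker_policy = build_policy(["sagemaker:ListEndpoints"])
print(json.dumps(sagemaker_policy, indent=2))
```

The printed JSON has the same shape as the single-service example above and can be reviewed before it is pasted into the IAM policy editor.
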
Endpoint

Service

autoscaling.<REGION>.amazonaws.com

Amazon EC2 Auto Scaling (built-in), Amazon EC2 Auto Scaling

lambda.<REGION>.amazonaws.com

AWS Lambda (built-in), AWS Lambda

elasticloadbalancing.<REGION>.amazonaws.com

Amazon Application and Network Load Balancer (built-in), Amazon Elastic Load Balancer (ELB) (built-in)

dynamodb.<REGION>.amazonaws.com

Amazon DynamoDB (built-in), Amazon DynamoDB

ec2.<REGION>.amazonaws.com

Amazon EBS (built-in), Amazon EC2 (built-in), Amazon EBS, Amazon EC2 Spot Fleet, Amazon VPC NAT Gateways, AWS Transit Gateway, AWS Site-to-Site VPN

rds.<REGION>.amazonaws.com

Amazon RDS (built-in), Amazon Aurora, Amazon DocumentDB, Amazon Neptune, Amazon RDS

s3.<REGION>.amazonaws.com

Amazon S3 (built-in)

acm-pca.<REGION>.amazonaws.com

AWS Certificate Manager Private Certificate Authority

apigateway.<REGION>.amazonaws.com

Amazon API Gateway

apprunner.<REGION>.amazonaws.com

AWS App Runner

appstream2.<REGION>.amazonaws.com

Amazon AppStream

appsync.<REGION>.amazonaws.com

AWS AppSync

athena.<REGION>.amazonaws.com

Amazon Athena

cloudfront.amazonaws.com

Amazon CloudFront

cloudhsmv2.<REGION>.amazonaws.com

AWS CloudHSM

cloudsearch.<REGION>.amazonaws.com

Amazon CloudSearch

codebuild.<REGION>.amazonaws.com

AWS CodeBuild

datasync.<REGION>.amazonaws.com

AWS DataSync

dax.<REGION>.amazonaws.com

Amazon DynamoDB Accelerator (DAX)

dms.<REGION>.amazonaws.com

AWS Database Migration Service (AWS DMS)

directconnect.<REGION>.amazonaws.com

AWS Direct Connect

ecs.<REGION>.amazonaws.com

Amazon Elastic Container Service (ECS), Amazon ECS Container Insights

elasticfilesystem.<REGION>.amazonaws.com

Amazon Elastic File System (EFS)

eks.<REGION>.amazonaws.com

Amazon Elastic Kubernetes Service (EKS)

elasticache.<REGION>.amazonaws.com

Amazon ElastiCache (EC)

elasticbeanstalk.<REGION>.amazonaws.com

AWS Elastic Beanstalk

elastictranscoder.<REGION>.amazonaws.com

Amazon Elastic Transcoder

es.<REGION>.amazonaws.com

Amazon Elasticsearch Service (ES)

events.<REGION>.amazonaws.com

Amazon EventBridge

fsx.<REGION>.amazonaws.com

Amazon FSx

gamelift.<REGION>.amazonaws.com

Amazon GameLift

glue.<REGION>.amazonaws.com

AWS Glue

inspector.<REGION>.amazonaws.com

Amazon Inspector

kafka.<REGION>.amazonaws.com

Amazon Managed Streaming for Kafka

models.lex.<REGION>.amazonaws.com

Amazon Lex

logs.<REGION>.amazonaws.com

Amazon CloudWatch Logs

api.mediatailor.<REGION>.amazonaws.com

AWS Elemental MediaTailor

mediaconnect.<REGION>.amazonaws.com

AWS Elemental MediaConnect

mediapackage.<REGION>.amazonaws.com

AWS Elemental MediaPackage Live

mediapackage-vod.<REGION>.amazonaws.com

AWS Elemental MediaPackage Video on Demand

opsworks.<REGION>.amazonaws.com

AWS OpsWorks

qldb.<REGION>.amazonaws.com

Amazon QLDB

redshift.<REGION>.amazonaws.com

Amazon Redshift

robomaker.<REGION>.amazonaws.com

AWS RoboMaker

route53.amazonaws.com

Amazon Route 53

route53resolver.<REGION>.amazonaws.com

Amazon Route 53 Resolver

api.sagemaker.<REGION>.amazonaws.com

Amazon SageMaker Endpoints, Amazon SageMaker Endpoint Instances

sns.<REGION>.amazonaws.com

Amazon Simple Notification Service (SNS)

sqs.<REGION>.amazonaws.com

Amazon Simple Queue Service (SQS)

storagegateway.<REGION>.amazonaws.com

AWS Storage Gateway

swf.<REGION>.amazonaws.com

Amazon SWF

transfer.<REGION>.amazonaws.com

AWS Transfer Family

workmail.<REGION>.amazonaws.com

Amazon WorkMail

workspaces.<REGION>.amazonaws.com

Amazon WorkSpaces
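
Most endpoints in the table above contain a <REGION> placeholder, while a few (for example cloudfront.amazonaws.com and route53.amazonaws.com) are global. As a small illustration, the Python sketch below expands the placeholder for one region. The function name resolve_endpoint and the region code "eu-west-1" are assumptions made for this example only.

```python
REGION_PLACEHOLDER = "<REGION>"


def resolve_endpoint(template, region):
    """Substitute an AWS region code into an endpoint template.

    Templates without the placeholder (global endpoints) are returned unchanged.
    """
    if not region:
        raise ValueError("region must be a non-empty string")
    return template.replace(REGION_PLACEHOLDER, region)


# Templates copied from the table above.
templates = [
    "api.sagemaker.<REGION>.amazonaws.com",
    "ec2.<REGION>.amazonaws.com",
    "route53.amazonaws.com",
]

# "eu-west-1" is only an example region code.
resolved = [resolve_endpoint(t, "eu-west-1") for t in templates]
for host in resolved:
    print(host)
```

A list produced this way could serve as input for a firewall or proxy allowlist review.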

Enable monitoring

To learn how to enable service monitoring, see Enable service monitoring.

View service metrics

You can view the service metrics in your Dynatrace environment either on the custom device overview page or on your Dashboards page.

View metrics on the custom device overview page

To access the custom device overview page:

Go to Technologies & Processes or Technologies & Processes Classic (latest Dynatrace).
Filter by service name and select the relevant custom device group.
Once you select the custom device group, you're on the custom device group overview page.
The custom device group overview page lists all instances (custom devices) belonging to the group. Select an instance to view the custom device overview page.

View metrics on your dashboard

You can also view metrics in the Dynatrace web UI on dashboards. There is no preset dashboard available for this service, but you can create your own dashboard.

To check the availability of preset dashboards for each AWS service, see the list below.

AWS service

Preset dashboard

Amazon EC2 Auto Scaling (built-in)

AWS Lambda (built-in)

Amazon Application and Network Load Balancer (built-in)

Amazon DynamoDB (built-in)

Amazon EBS (built-in)

Amazon EC2 (built-in)

Amazon Elastic Load Balancer (ELB) (built-in)

Amazon RDS (built-in)

Amazon S3 (built-in)

AWS Certificate Manager Private Certificate Authority

All monitored Amazon services

Amazon API Gateway

AWS App Runner

Amazon AppStream

AWS AppSync

Amazon Athena

Amazon Aurora

Amazon EC2 Auto Scaling

AWS Billing

Amazon Keyspaces

AWS Chatbot

Amazon CloudFront

AWS CloudHSM

Amazon CloudSearch

AWS CodeBuild

Amazon Cognito

Amazon Connect

AWS DataSync

Amazon DynamoDB Accelerator (DAX)

AWS Database Migration Service (AWS DMS)

Amazon DocumentDB

AWS Direct Connect

Amazon DynamoDB

Amazon EBS

Amazon EC2 Spot Fleet

Amazon EC2 API

Amazon Elastic Container Service (ECS)

Amazon ECS Container Insights

Amazon Elastic File System (EFS)

Amazon Elastic Kubernetes Service (EKS)

Amazon ElastiCache (EC)

AWS Elastic Beanstalk

Amazon Elastic Inference

Amazon Elastic Transcoder

Amazon Elastic Map Reduce (EMR)

Amazon Elasticsearch Service (ES)

Amazon EventBridge

Amazon FSx

Amazon GameLift

AWS Glue

Amazon Inspector

AWS Internet of Things (IoT)

AWS IoT Things Graph

AWS IoT Analytics

Amazon Managed Streaming for Kafka

Amazon Kinesis Data Analytics

Amazon Data Firehose

Amazon Kinesis Data Streams

Amazon Kinesis Video Streams

AWS Lambda

Amazon Lex

Amazon CloudWatch Logs

AWS Elemental MediaTailor

AWS Elemental MediaConnect

AWS Elemental MediaConvert

AWS Elemental MediaPackage Live

AWS Elemental MediaPackage Video on Demand

Amazon MQ

Amazon VPC NAT Gateways

Amazon Neptune

AWS OpsWorks

Amazon Polly

Amazon QLDB

Amazon RDS

Amazon Redshift

Amazon Rekognition

AWS RoboMaker

Amazon Route 53

Amazon Route 53 Resolver

Amazon S3

Amazon SageMaker Batch Transform Jobs

Amazon SageMaker Endpoints

Amazon SageMaker Endpoint Instances

Amazon SageMaker Ground Truth

Amazon SageMaker Processing Jobs

Amazon SageMaker Training Jobs

AWS Service Catalog

Amazon Simple Email Service (SES)

Amazon Simple Notification Service (SNS)

Amazon Simple Queue Service (SQS)

AWS Systems Manager - Run Command

AWS Step Functions

AWS Storage Gateway

Amazon SWF

Amazon Textract

AWS Transfer Family

AWS Transit Gateway

Amazon Translate

AWS Trusted Advisor

AWS API Usage

AWS Site-to-Site VPN

AWS WAF Classic

AWS WAF

Amazon WorkMail

Amazon WorkSpaces

Available metrics

Amazon SageMaker Batch Transform Jobs

Name

Description

Unit

Statistics

Dimensions

Recommended

CPUUtilization

The percentage of CPU units that are used by the containers on an instance. The value can range between 0% and 100%, and is multiplied by the number of CPUs. For example, if there are four CPUs, CPUUtilization can range from 0% to 400%.

Percent

Average

Region, Host

MemoryUtilization

The percentage of memory that is used by the containers on an instance. This value can range between 0% and 100%.

Percent

Average

Region, Host

GPUMemoryUtilization

The percentage of GPU memory used by the containers on an instance. The value can range between 0% and 100%, and is multiplied by the number of GPUs. For example, if there are four GPUs, GPUMemoryUtilization can range from 0% to 400%.

Percent

Average

Region, Host

GPUUtilization

The percentage of GPU units that are used by the containers on an instance. The value can range between 0% and 100%, and is multiplied by the number of GPUs. For example, if there are four GPUs, GPUUtilization can range from 0% to 400%.

Percent

Average

Region, Host
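
Because CPUUtilization, GPUUtilization, and GPUMemoryUtilization are multiplied by the number of CPUs or GPUs, a reported value such as 260% on a four-CPU instance corresponds to 65% per CPU. The sketch below shows that arithmetic in Python. The function name normalize_utilization is hypothetical, and the number of CPUs or GPUs is assumed to be known from the instance type.

```python
def normalize_utilization(reported_percent, unit_count):
    """Scale a multi-unit utilization value back to a 0-100% range.

    reported_percent: value as reported (0 to 100 * unit_count).
    unit_count: number of CPUs or GPUs on the instance (assumed known).
    """
    if unit_count < 1:
        raise ValueError("unit_count must be at least 1")
    max_percent = 100.0 * unit_count
    if not 0.0 <= reported_percent <= max_percent:
        raise ValueError(
            f"reported_percent must be between 0 and {max_percent}"
        )
    return reported_percent / unit_count


# Four CPUs: the reported value can range from 0% to 400%.
per_cpu = normalize_utilization(260.0, 4)
print(per_cpu)  # 65.0
```

Normalizing in this way makes values from instances with different CPU or GPU counts comparable on a single chart.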

Amazon SageMaker Processing Jobs, Amazon SageMaker Training Jobs

Name

Description

Unit

Statistics

Dimensions

Recommended

CPUUtilization

Percent

Average

Region, Host

DiskUtilization

The percentage of disk space used by the containers on an instance. This value can range between 0% and 100%. This metric is not supported for batch transform jobs.

Percent

Average

EndpointName, VariantName

GPUMemoryUtilization

Percent

Average

Region, Host

GPUUtilization

Percent

Average

Region, Host

MemoryUtilization

The percentage of memory that is used by the containers on an instance. This value can range between 0% and 100%.

Percent

Average

Region, Host

Amazon SageMaker Endpoint Instances

EndpointName is the main dimension.

Name

Description

Unit

Statistics

Dimensions

Recommended

CPUUtilization

Percent

Average

EndpointName, VariantName

DiskUtilization

The percentage of disk space used by the containers on an instance. This value can range between 0% and 100%. This metric is not supported for batch transform jobs.

Percent

Average

EndpointName, VariantName

GPUMemoryUtilization

Percent

Average

EndpointName, VariantName

GPUUtilization

Percent

Average

EndpointName, VariantName

LoadedModelCount

The number of models loaded in the containers of the multi-model endpoint. This metric is emitted per instance.

None

Average

EndpointName, VariantName

LoadedModelCount

None

Sum

EndpointName, VariantName

MemoryUtilization

The percentage of memory that is used by the containers on an instance. This value can range between 0% and 100%.

Percent

Average

EndpointName, VariantName

Amazon SageMaker Endpoints

EndpointName is the main dimension.

Name

Description

Unit

Statistics

Dimensions

Recommended

Invocation4XXErrors

The number of InvokeEndpoint requests where the model returned a 4xx HTTP response code. For each 4xx response, 1 is sent; otherwise, 0 is sent.

None

Average

EndpointName, VariantName

Invocation4XXErrors

None

Sum

EndpointName, VariantName

Invocation5XXErrors

The number of InvokeEndpoint requests where the model returned a 5xx HTTP response code. For each 5xx response, 1 is sent; otherwise, 0 is sent.

None

Average

EndpointName, VariantName

Invocation5XXErrors

None

Sum

EndpointName, VariantName

Invocations

The number of InvokeEndpoint requests sent to a model endpoint.

None

Sum

EndpointName, VariantName

Invocations

None

Count

EndpointName, VariantName

InvocationsPerInstance

The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant. 1/numberOfInstances is sent as the value on each request, where numberOfInstances is the number of active instances for the ProductionVariant behind the endpoint at the time of the request.

None

Sum

EndpointName, VariantName

ModelCacheHit

The number of InvokeEndpoint requests sent to the multi-model endpoint for which the model was already loaded.

None

Sum

EndpointName, VariantName

ModelCacheHit

None

Average

EndpointName, VariantName

ModelCacheHit

None

Count

EndpointName, VariantName

ModelLatency

The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the local communication times taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container.

Microseconds

Multi

EndpointName, VariantName

ModelLatency

Microseconds

Sum

EndpointName, VariantName

ModelLatency

Microseconds

Count

EndpointName, VariantName

ModelLoadingTime

The interval of time that it took to load the model through the container's LoadModel API call.

Microseconds

Multi

EndpointName, VariantName

ModelLoadingTime

Microseconds

Sum

EndpointName, VariantName

ModelLoadingTime

Microseconds

Count

EndpointName, VariantName

ModelLoadingWaitTime

The interval of time that an invocation request waited for the target model to be downloaded, loaded, or both, in order to perform inference.

Microseconds

Multi

EndpointName, VariantName

ModelLoadingWaitTime

Microseconds

Sum

EndpointName, VariantName

ModelLoadingWaitTime

Microseconds

Count

EndpointName, VariantName

ModelDownloadingTime

The interval of time that it took to download the model from Amazon Simple Storage Service (Amazon S3).

Microseconds

Multi

EndpointName, VariantName

ModelDownloadingTime

Microseconds

Sum

EndpointName, VariantName

ModelDownloadingTime

Microseconds

Count

EndpointName, VariantName

ModelUnloadingTime

The interval of time that it took to unload the model through the container's UnloadModel API call.

Microseconds

Multi

EndpointName, VariantName

ModelUnloadingTime

Microseconds

Sum

EndpointName, VariantName

ModelUnloadingTime

Microseconds

Count

EndpointName, VariantName

OverheadLatency

The interval of time added to the time taken to respond to a client request by SageMaker overheads. This interval is measured from the time SageMaker receives the request until it returns a response to the client, minus the ModelLatency.

Microseconds

Multi

EndpointName, VariantName

OverheadLatency

Microseconds

Sum

EndpointName, VariantName

OverheadLatency

Microseconds

Count

EndpointName, VariantName
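
Two of the endpoint metrics above are defined by what is sent per request, which determines how their statistics should be read. The Python sketch below simulates this under the assumption that the statistics aggregate the per-request values exactly as the descriptions state: the Average of Invocation4XXErrors becomes the share of requests that returned 4xx, and the Sum of InvocationsPerInstance becomes the number of invocations per instance. The function names and sample values are hypothetical.

```python
def error_rate(samples):
    """Average of per-request samples, where 1 is sent for each
    error response and 0 otherwise."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(samples) / len(samples)


def invocations_per_instance(instance_counts):
    """Sum of 1/numberOfInstances over all requests.

    instance_counts holds, for each request, the number of active
    instances behind the endpoint at the time of that request.
    """
    if any(n < 1 for n in instance_counts):
        raise ValueError("each instance count must be at least 1")
    return sum(1.0 / n for n in instance_counts)


# Four requests, one of which returned a 4xx response.
rate_4xx = error_rate([0, 0, 1, 0])

# Four requests while two instances were active for each of them.
per_instance = invocations_per_instance([2, 2, 2, 2])

print(rate_4xx)      # 0.25
print(per_instance)  # 2.0
```

Read this way, the Sum of Invocation4XXErrors gives the absolute number of failed requests, while the Average gives the failure ratio for the same interval.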

Amazon SageMaker Ground Truth

Name

Description

Dimensions

Statistics

Unit

Recommended

ActiveWorkers

The number of workers on a private work team performing a labeling job

Region, LabelingJobName

Maximum

None

DatasetObjectsAutoAnnotated

The number of dataset objects auto-annotated in a labeling job. This metric is only emitted when automated labeling is enabled.

Region, LabelingJobName

Maximum

None

DatasetObjectsHumanAnnotated

The number of dataset objects annotated by a human in a labeling job

Region, LabelingJobName

Maximum

None

DatasetObjectsLabelingFailed

The number of dataset objects that failed labeling in a labeling job

Region, LabelingJobName

Maximum

None

JobsFailed

The number of labeling jobs that failed

Region

Count

None

JobsFailed

Region

Sum

None

JobsStopped

The number of labeling jobs that were stopped

Region

Count

None

JobsStopped

Region

Sum

None

JobsSucceeded

The number of labeling jobs that succeeded

Region

Count

None

JobsSucceeded

Region

Sum

None

TasksSubmitted

The number of tasks submitted/completed by a private work team

Region, LabelingJobName

Maximum

None

TimeSpent

Time spent on a task completed by a private work team

Region, LabelingJobName

Maximum

Seconds

TotalDatasetObjectsLabeled

The number of dataset objects labeled successfully in a labeling job

Region, LabelingJobName

Maximum

None