A company wants to reduce costs for a SageMaker real-time endpoint that has variable traffic. Which feature allows the endpoint to automatically adjust instance count based on demand?
Auto Scaling. It adjusts the endpoint's instance count based on demand, using target tracking or step scaling policies.
Why this answer
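Application Auto Scaling for SageMaker endpoints dynamically adjusts the instance count of a production variant based on CloudWatch metrics, such as invocations per instance (the predefined SageMakerVariantInvocationsPerInstance metric) or a custom metric like CPU utilization. Because instances are added when traffic rises and removed when it falls, the company pays for extra capacity only while demand requires it.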
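As a rough illustration of how this is typically configured, the sketch below registers an endpoint variant as a scalable target and attaches a target tracking policy through the Application Auto Scaling API. The endpoint name, variant name, capacity limits, target value, and cooldowns are all hypothetical placeholders, and the helper function names are made up for this example. The `client` argument is assumed to be a boto3 `application-autoscaling` client.

```python
# Minimal sketch, not a definitive setup. All names and numbers below
# (endpoint, variant, capacities, target value, cooldowns) are assumptions.

def variant_resource_id(endpoint_name, variant_name):
    """Build the resource ID format Application Auto Scaling expects for SageMaker."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def build_scaling_requests(endpoint_name, variant_name,
                           min_instances=1, max_instances=4,
                           invocations_per_instance=70.0):
    """Return the keyword arguments for the two API calls as plain dicts."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": variant_resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Bounds within which the instance count may move.
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    # Target tracking: keep average invocations per instance near the target.
    policy = dict(
        common,
        PolicyName="invocations-target-tracking",  # hypothetical policy name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": 300,   # seconds to wait before removing instances
            "ScaleOutCooldown": 60,   # seconds to wait before adding more
        },
    )
    return target, policy


def apply_autoscaling(client, endpoint_name, variant_name, **kwargs):
    """Register the scalable target, then attach the scaling policy."""
    target, policy = build_scaling_requests(endpoint_name, variant_name, **kwargs)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
    return target, policy
```

With AWS credentials configured, this could be invoked as `apply_autoscaling(boto3.client("application-autoscaling"), "my-endpoint", "AllTraffic")`, where both names are placeholders. A target tracking policy is used here because it needs only a single target value; a step scaling policy would instead require explicit CloudWatch alarms and step adjustments.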