CLF-C02 | Chapter 117 of 130 | Objective 3.5

AWS AI/ML Services Overview

This chapter covers AWS's artificial intelligence (AI) and machine learning (ML) services, a key part of Domain 3: Cloud Technology and Services (Objective 3.5, ~5-10% of the exam). You'll learn the core services like Amazon SageMaker, Rekognition, Comprehend, Polly, Lex, Translate, and Textract, and how they solve common business problems without requiring deep ML expertise. The CLF-C02 exam focuses on recognizing the right service for a given scenario and understanding the shared responsibility model for AI/ML.

25 min read
Beginner
Updated May 31, 2026

AI/ML as a Skilled Chef and Recipe Book

Imagine you want to open a restaurant but have no cooking experience. Traditionally, you'd need to hire a world-class chef (a data scientist), buy a kitchen (infrastructure), and develop recipes from scratch (train models). This is expensive and slow. AWS AI/ML services are like a pre-stocked kitchen with recipe books and automated appliances. Amazon SageMaker is your master chef: you bring the ingredients (data), and it follows step-by-step instructions (algorithms) to create new recipes (custom trained models). But if you just want to serve a popular dish like pizza, you can take a frozen pizza from the freezer. That's Amazon Rekognition or Comprehend, pre-trained AI services that work out of the box. For simple conversational requests like asking "What's the weather?", you can rely on a voice assistant in the style of Alexa. That's Amazon Lex. The key mechanism: the more you customize, the more data and compute you need. A frozen pizza is cheap and fast, while a custom recipe takes time and effort. AWS charges you for the ingredients (compute hours, data storage) and the chef's time (training). The exam tests which service to choose based on how much customization you need.

How It Actually Works

What is AI/ML and Why Does AWS Offer It?

AI (Artificial Intelligence) simulates human intelligence in machines, while ML (Machine Learning) is a subset where models learn from data without explicit programming. AWS provides AI/ML services so customers can add intelligence to applications without hiring a team of data scientists or managing GPU clusters. The problem AWS solves: building and deploying ML models is complex, time-consuming, and expensive. AWS offers three tiers: AI Services (pre-trained APIs), ML Services (managed platforms like SageMaker), and ML Frameworks (DIY with infrastructure).

How AWS AI/ML Services Work

AWS AI/ML services follow a consumption-based model. For pre-trained AI services, you send data (text, image, audio) via an API, and AWS returns predictions. For example, Amazon Rekognition can detect objects in an image: you upload the image to S3 (or pass the image bytes directly) and call the DetectLabels API. Under the hood, AWS uses deep neural networks trained on massive datasets. For custom ML, Amazon SageMaker provides a fully managed environment to build, train, and deploy models. You provide labeled data, choose an algorithm, and SageMaker spins up compute instances (like ml.m5.large) for training. After training, you deploy the model to an endpoint for real-time inference. Pricing varies: per-image for Rekognition, per-page for Textract, per-hour for SageMaker instances.
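To make the last step of the custom-ML flow concrete, here is a minimal Python sketch of calling a deployed SageMaker endpoint for real-time inference. It assumes a boto3 `sagemaker-runtime` client is passed in; the helper name `predict`, the endpoint name, and the CSV payload format are illustrative assumptions, since the payload depends on the model you deployed.

```python
def predict(runtime, endpoint_name: str, features: list) -> str:
    """Send one CSV row to a deployed SageMaker endpoint and return the raw reply.

    `runtime` is expected to be boto3.client("sagemaker-runtime"); it is
    passed in so the function can be exercised without AWS credentials.
    The CSV payload format is an assumption and depends on the model.
    """
    payload = ",".join(str(value) for value in features)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    # The response body is a stream; read it fully and decode to text.
    return response["Body"].read().decode("utf-8")


# Example usage (requires boto3, credentials, and a live endpoint;
# "churn-endpoint" is a placeholder name):
#   import boto3
#   print(predict(boto3.client("sagemaker-runtime"), "churn-endpoint", [34, 2, 59.9]))
```

The endpoint bills per instance-hour while it is running, whether or not you send it requests, which is one reason the pre-trained APIs are cheaper for generic tasks.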

Key Services Overview

Amazon SageMaker: A complete ML platform. You can use built-in algorithms (e.g., XGBoost) or bring your own. It includes data labeling (Ground Truth), training, tuning, and deployment. Pricing: pay for instance hours, data storage, and data processing.

Amazon Rekognition: Image and video analysis. Detect objects, faces, text, and unsafe content. Common use: moderation of user-uploaded images. Pricing: per 1,000 images analyzed.

Amazon Comprehend: Natural Language Processing (NLP). Extracts entities, key phrases, sentiment, and language. Used for analyzing customer feedback. Pricing: per 100 characters.

Amazon Polly: Text-to-speech. Converts text to lifelike speech with multiple voices and languages. Pricing: per million characters.

Amazon Lex: Conversational interfaces (chatbots). Powered by the same technology as Alexa. Integrates with Lambda for business logic. Pricing: per text or voice request.

Amazon Translate: Neural machine translation. Supports dozens of languages. Pricing: per character.

Amazon Textract: Extracts text, handwriting, and data from scanned documents. Goes beyond OCR by understanding forms and tables. Pricing: per page.

Amazon Transcribe: Automatic speech recognition (ASR). Converts audio to text. Supports custom vocabularies. Pricing: per second of audio.

Comparison to On-Premises or Competitors

On-premises ML requires purchasing GPUs, setting up frameworks like TensorFlow, and managing data pipelines. AWS eliminates this overhead. Compared to competitors like Google Cloud AI or Azure Cognitive Services, AWS services are tightly integrated with S3, Lambda, and other services. For example, you can set up a serverless pipeline: S3 upload triggers Lambda, which calls Rekognition and stores results in DynamoDB. This is a common exam scenario.

When to Use Each Service

Use pre-trained AI services (Rekognition, Comprehend, etc.) when you need standard capabilities like object detection or sentiment analysis and don't want to train a custom model. Use SageMaker when you have unique data and need a model tailored to your business, like predicting customer churn. For real-time voice interaction, use Lex. For batch translation, use Translate. For document processing, use Textract. The exam will ask you to choose the right service for a given task.

Walk-Through

1. Select the AI/ML Service

First, identify the business need. If you need to extract text from a scanned PDF, Amazon Textract is the correct choice. If you need to moderate images, use Amazon Rekognition. If you need a chatbot, use Amazon Lex. The CLF-C02 exam expects you to match the service to the use case. For example, a question might describe a company that wants to automatically detect negative sentiment in customer emails. The correct answer is Amazon Comprehend. A common trap is choosing SageMaker because it's the best-known ML service, but SageMaker is for building custom models, not for pre-trained sentiment analysis.

2. Configure Data Source and Permissions

Many AI services read their input from Amazon S3, and every call is authorized through IAM. The identity that makes the call (for example, a Lambda execution role) needs permission both to invoke the service and to read the objects in the bucket. For example, to use Rekognition on images in S3, the role's policy must allow the Rekognition action and read access to the bucket. The same applies to Textract. The exam may test that you need to set up IAM permissions, not just upload files. Also, Lex bots commonly rely on a Lambda function for business logic, so you may need to create that as well.
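As a rough illustration of these permissions, the sketch below builds a least-privilege IAM policy document as a Python dictionary. The bucket name `my-bucket` and the helper name `build_rekognition_policy` are placeholders for illustration; `rekognition:DetectLabels` and `s3:GetObject` are standard IAM action names.

```python
import json


def build_rekognition_policy(bucket_name: str) -> dict:
    """Return an IAM policy allowing label detection on objects in one bucket.

    Hypothetical helper for illustration; attach the result to the role
    used by whatever code calls Rekognition (for example, a Lambda function).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Permission to call the Rekognition image API itself.
                "Effect": "Allow",
                "Action": ["rekognition:DetectLabels"],
                "Resource": "*",
            },
            {
                # Permission to read the images that Rekognition will analyze.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


policy_json = json.dumps(build_rekognition_policy("my-bucket"), indent=2)
print(policy_json)
```

Scoping the S3 statement to a single bucket, rather than `*`, keeps the role from reading unrelated data.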

3. Invoke the Service via API or Console

You can call AWS AI services using the AWS CLI, SDK, or the AWS Management Console. For example, to analyze an image with Rekognition, you can use the CLI command: `aws rekognition detect-labels --image '{"S3Object":{"Bucket":"my-bucket","Name":"photo.jpg"}}'`. The service returns a JSON response with labels and confidence scores. For SageMaker, you train a model using the console or SDK, then deploy an endpoint. The exam does not require memorizing CLI commands but expects you to understand the flow: input -> service -> output.
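For comparison with the CLI command, here is a hedged Python sketch of the same call through the SDK. It assumes a boto3 Rekognition client is passed in; the helper name `detect_labels` and the bucket and object names are placeholders.

```python
def detect_labels(client, bucket: str, key: str, min_confidence: float = 80.0):
    """Call Rekognition DetectLabels on an S3 object and return (name, confidence) pairs.

    `client` is expected to be a boto3 Rekognition client, e.g.
    boto3.client("rekognition"); it is passed in so the function can be
    exercised without AWS credentials.
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   print(detect_labels(boto3.client("rekognition"), "my-bucket", "photo.jpg"))
```

The shape is the same as the CLI: an S3 reference goes in, and a list of labels with confidence scores comes out.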

4. Process and Interpret Results

After invoking the service, you receive structured data. For example, Rekognition returns labels like 'Cat' with confidence 99.8%. Comprehend returns sentiment (Positive, Negative, Neutral, Mixed) and entities. You can then store results in DynamoDB, send to an SQS queue, or trigger another Lambda. The exam may ask what to do with the output — common patterns include storing in a database or sending alerts. For example, if Rekognition detects unsafe content, you can automatically delete the image and notify an admin.
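One way to picture this step is the small Python sketch below, which filters labels by confidence and picks a follow-up action. The sample responses are hand-written and trimmed to the fields used; the 90% threshold and the action names are assumptions chosen for illustration.

```python
UNSAFE_THRESHOLD = 90.0  # assumed cutoff; tune for your own tolerance


def summarize_labels(response: dict, min_confidence: float = 90.0) -> list:
    """Return label names from a DetectLabels-style response above a confidence cutoff."""
    return [
        label["Name"]
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]


def next_action(moderation_response: dict) -> str:
    """Decide what to do with an image given a DetectModerationLabels-style response."""
    flagged = [
        label
        for label in moderation_response.get("ModerationLabels", [])
        if label["Confidence"] >= UNSAFE_THRESHOLD
    ]
    return "quarantine_and_notify" if flagged else "store_result"


# Illustrative responses, trimmed to the fields used here.
sample_labels = {"Labels": [{"Name": "Cat", "Confidence": 99.8},
                            {"Name": "Sofa", "Confidence": 71.2}]}
sample_moderation = {"ModerationLabels": [{"Name": "Violence", "Confidence": 95.1}]}

print(summarize_labels(sample_labels))   # ['Cat']
print(next_action(sample_moderation))    # quarantine_and_notify
```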

5. Monitor and Optimize Costs

AWS AI services are pay-as-you-go. You should monitor usage with CloudWatch metrics and set budgets. For example, Rekognition charges per image; if you process millions of images, costs can add up. Use AWS Cost Explorer to track spending. For SageMaker, you pay for training and endpoint hours; you can use Managed Spot Training, which runs training jobs on Spot Instances, to reduce training costs. The exam may test that you can use AWS Budgets to set alerts. Also, consider using a serverless architecture to scale automatically and only pay for what you use.
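A back-of-the-envelope estimate shows how per-unit pricing adds up. The Python sketch below uses the illustrative rates quoted in this chapter. Real prices vary by Region, API, and volume tier, and some services apply per-request minimums, all of which this sketch ignores. The monthly volumes are invented for the example.

```python
import math

# Illustrative list prices quoted in this chapter; treat them as assumptions.
REKOGNITION_USD_PER_1000_IMAGES = 1.00
COMPREHEND_USD_PER_UNIT = 0.0001   # one unit = 100 characters
TEXTRACT_USD_PER_1000_PAGES = 1.50  # plain text detection


def rekognition_cost(images: int) -> float:
    return images / 1000 * REKOGNITION_USD_PER_1000_IMAGES


def comprehend_cost(total_characters: int) -> float:
    # Characters are billed in 100-character units, rounded up.
    units = math.ceil(total_characters / 100)
    return units * COMPREHEND_USD_PER_UNIT


def textract_cost(pages: int) -> float:
    return pages / 1000 * TEXTRACT_USD_PER_1000_PAGES


# Hypothetical monthly volumes.
monthly = {
    "rekognition": rekognition_cost(2_000_000),
    "comprehend": comprehend_cost(50_000_000),
    "textract": textract_cost(100_000),
}
for service, usd in monthly.items():
    print(f"{service}: ${usd:,.2f}")
```

Running the numbers like this before launch gives you a sensible threshold to set in AWS Budgets.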

What This Looks Like on the Job

Scenario 1: Social Media Content Moderation

A social media platform allows users to upload photos. To comply with regulations and maintain community standards, they need to automatically detect inappropriate content (violence, nudity, etc.). They use Amazon Rekognition's DetectModerationLabels API. Each uploaded image is stored in S3. A Lambda function is triggered on new S3 objects, calls Rekognition, and if an unsafe label is detected, the image is quarantined and an admin is notified via SNS. Cost: roughly $1 per 1,000 images. Misconfiguration: If the Lambda function's IAM role can't read the S3 object or call Rekognition, the API call fails. Also, if the Lambda timeout is too short, large images may not be processed.
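The core of such a Lambda function might look like the Python sketch below. It assumes boto3 Rekognition and SNS clients are passed in; the function name `handle_upload`, the topic ARN, and the 80% threshold are illustrative assumptions, and the quarantine step (for example, moving the object to another bucket) is left out for brevity.

```python
from urllib.parse import unquote_plus

MIN_CONFIDENCE = 80.0  # assumed threshold, not a value you must use


def handle_upload(event, rekognition, sns, topic_arn):
    """Check each newly uploaded S3 image and alert an admin about unsafe ones.

    `rekognition` and `sns` are expected to be boto3 clients. In a real
    Lambda function they would be created once at module level and this
    logic would be called from the handler.
    """
    flagged = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])

        response = rekognition.detect_moderation_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MinConfidence=MIN_CONFIDENCE,
        )
        names = [label["Name"] for label in response["ModerationLabels"]]
        if names:
            sns.publish(
                TopicArn=topic_arn,
                Subject="Unsafe image detected",
                Message=f"s3://{bucket}/{key} flagged for: {', '.join(names)}",
            )
            flagged.append(key)
    return flagged
```

Passing the clients in as arguments keeps the logic testable without AWS access; the role behind those clients still needs the S3 read, Rekognition, and SNS publish permissions.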

Scenario 2: Customer Support Ticket Analysis

A company receives thousands of support tickets daily. They use Amazon Comprehend to analyze sentiment and extract entities (product names, issues). Tickets are stored in S3. A scheduled Lambda reads new tickets, calls Comprehend, and stores sentiment scores in DynamoDB. A dashboard shows trending negative issues. Cost: $0.0001 per 100 characters. Misunderstanding: Some think they need SageMaker to build a custom sentiment model, but Comprehend's pre-trained model works well for general use. The exam tests this distinction.
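A minimal Python sketch of the scoring step is shown below. It assumes a boto3 Comprehend client is passed in and that tickets arrive as `(ticket_id, text)` pairs; the helper names and record layout are hypothetical, and writing the records to DynamoDB is omitted.

```python
from collections import Counter


def score_tickets(comprehend, tickets, language_code="en"):
    """Return one record per ticket with its sentiment label and negative score.

    `comprehend` is expected to be a boto3 Comprehend client;
    `tickets` is a list of (ticket_id, text) pairs.
    """
    results = []
    for ticket_id, text in tickets:
        response = comprehend.detect_sentiment(Text=text, LanguageCode=language_code)
        results.append({
            "ticket_id": ticket_id,
            "sentiment": response["Sentiment"],  # e.g. "NEGATIVE"
            "negative_score": response["SentimentScore"]["Negative"],
        })
    return results


def sentiment_counts(results):
    """Tally tickets by sentiment label, e.g. to feed a dashboard."""
    return Counter(item["sentiment"] for item in results)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   rows = score_tickets(boto3.client("comprehend"), [("T1", "The app keeps crashing")])
#   print(sentiment_counts(rows))
```

Notice that no training step appears anywhere: the pre-trained model is simply called with text.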

Scenario 3: Document Processing for Insurance Claims

An insurance company processes claim forms. They use Amazon Textract to extract text and data from scanned PDFs. Textract can extract key-value pairs (e.g., 'Policy Number: 12345'). The extracted data is stored in DynamoDB and fed into a claims processing system. Cost: about $1.50 per 1,000 pages for plain text detection; extracting forms and tables is billed at a higher per-page rate. Pitfall: Textract's synchronous operations accept only single-page documents, so multipage PDFs must go through the asynchronous operations, which have their own page-count and file-size quotas. Also, handwriting recognition accuracy is lower for cursive scripts.
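Textract returns form data as a list of linked blocks rather than a ready-made dictionary, so a little post-processing is needed. The Python sketch below walks an AnalyzeDocument-style response and pairs each KEY block with its VALUE block. The sample response is hand-written and trimmed to the fields used; the helper names are hypothetical.

```python
def _text_of(block, blocks_by_id):
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = blocks_by_id[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response: dict) -> dict:
    """Map form field names to values from an AnalyzeDocument-style response."""
    blocks_by_id = {block["Id"]: block for block in response["Blocks"]}
    pairs = {}
    for block in response["Blocks"]:
        if block["BlockType"] == "KEY_VALUE_SET" and "KEY" in block.get("EntityTypes", []):
            key_text = _text_of(block, blocks_by_id)
            value_text = ""
            for rel in block.get("Relationships", []):
                if rel["Type"] == "VALUE":
                    # A KEY block points at its VALUE block(s) by Id.
                    value_text = " ".join(
                        _text_of(blocks_by_id[value_id], blocks_by_id)
                        for value_id in rel["Ids"]
                    )
            pairs[key_text] = value_text
    return pairs


# Hand-written response trimmed to the fields used above.
sample = {"Blocks": [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Policy"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Number:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "12345"},
]}
print(extract_key_values(sample))  # {'Policy Number:': '12345'}
```

This structure is what "goes beyond OCR" means in practice: the service tells you which text is a field label and which is its value.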

How CLF-C02 Actually Tests This

What CLF-C02 Tests on Objective 3.5

Domain 3 (Cloud Technology and Services) includes AI/ML services. The exam expects you to:

Identify the correct AWS AI/ML service for a given use case.

Understand the difference between pre-trained AI services and custom ML with SageMaker.

Recognize that you don't need ML expertise to use AI services.

Know that these services integrate with other AWS services (S3, Lambda, etc.).

Understand basic pricing: per-request, per-character, per-hour.

Common Wrong Answers and Why

1. Choosing SageMaker for everything – Candidates see 'machine learning' in the question and immediately pick SageMaker. But if the question says 'analyze sentiment in customer reviews', the correct answer is Comprehend, not SageMaker. SageMaker is for custom models, not pre-trained.

2. Confusing Rekognition and Textract – Both deal with images, but Rekognition is for object/face detection, while Textract is for extracting text from documents. A question about reading text from a scanned form should pick Textract, not Rekognition.

3. Selecting Polly for speech-to-text – Polly is text-to-speech, not speech-to-text. The correct service for speech-to-text is Amazon Transcribe. Candidates often mix them up.

4. Thinking Lex is only for voice – Lex supports both text and voice. The exam may test that Lex can handle chat and voice interactions.

Tricky Distinctions

Rekognition vs. Textract: Rekognition can detect text in images (e.g., a sign), but Textract is optimized for documents (forms, tables). If the question says 'extract data from a table in a PDF', choose Textract.

Comprehend vs. Translate: Comprehend understands language (sentiment, entities), Translate translates text. If the question is about understanding meaning, choose Comprehend; if about converting language, choose Translate.

Transcribe vs. Polly: Transcribe converts speech to text; Polly converts text to speech. Remember 'Transcribe' has 'scribe' (writing), Polly is the voice (like a parrot).

Decision Rule for Multiple Choice

When you see a question about AI/ML, ask: Is the task generic (sentiment, object detection, translation) or custom (predicting churn, classifying unique products)? If generic, pick a pre-trained AI service. If custom, pick SageMaker. Then narrow down by task: images? Rekognition. Text? Comprehend. Speech? Transcribe or Polly. Chat? Lex. Documents? Textract.
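The decision rule above can be written as a small lookup, sketched here in Python. The task keywords and the function name `choose_service` are invented for illustration; they are a study aid, not an AWS API.

```python
# Task keywords mapped to services, following the decision rule above.
PRETRAINED_SERVICES = {
    "image": "Amazon Rekognition",
    "video": "Amazon Rekognition",
    "sentiment": "Amazon Comprehend",
    "entities": "Amazon Comprehend",
    "translation": "Amazon Translate",
    "speech-to-text": "Amazon Transcribe",
    "text-to-speech": "Amazon Polly",
    "chatbot": "Amazon Lex",
    "document": "Amazon Textract",
}


def choose_service(task: str, needs_custom_model: bool = False) -> str:
    """Pick a service: custom models go to SageMaker, generic tasks to a pre-trained API."""
    if needs_custom_model:
        return "Amazon SageMaker"
    try:
        return PRETRAINED_SERVICES[task]
    except KeyError:
        raise ValueError(f"no pre-trained service mapped for task: {task!r}") from None


print(choose_service("sentiment"))                       # Amazon Comprehend
print(choose_service("document"))                        # Amazon Textract
print(choose_service("churn", needs_custom_model=True))  # Amazon SageMaker
```

The order of the checks mirrors the rule: decide generic versus custom first, then narrow by task type.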

Key Takeaways

Pre-trained AI services (Rekognition, Comprehend, Polly, Lex, Translate, Textract, Transcribe) require no ML expertise and are invoked via APIs.

Amazon SageMaker is a fully managed platform for building, training, and deploying custom ML models.

Amazon Rekognition is for image/video analysis; Amazon Textract is for document text extraction.

Amazon Comprehend handles natural language understanding (sentiment, entities); Amazon Translate handles language translation.

Amazon Polly converts text to speech; Amazon Transcribe converts speech to text.

Amazon Lex powers conversational interfaces (chatbots) with both text and voice.

All AI services integrate with S3, Lambda, and IAM for data access and automation.

Pricing varies: per image (Rekognition), per character (Comprehend, Translate, Polly), per second of audio (Transcribe), per page (Textract), per instance-hour (SageMaker).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Amazon Rekognition

Analyzes images and videos for objects, faces, and scenes.

Can detect text in images (e.g., signs) but not designed for document extraction.

Pricing: per 1,000 images analyzed.

Use case: content moderation, facial recognition.

Output: labels with confidence scores.

Amazon Textract

Extracts text, handwriting, and data from documents (PDFs, forms).

Can extract key-value pairs and tables from documents.

Pricing: per page processed.

Use case: processing scanned forms, invoices.

Output: structured data with bounding boxes.

Watch Out for These

Mistake

You need to be a data scientist to use any AWS AI service.

Correct

Pre-trained AI services like Rekognition and Comprehend require no ML expertise. You just call an API. Only SageMaker requires data science skills for custom models.

Mistake

Amazon SageMaker is a single service for all ML needs.

Correct

SageMaker is a platform with many components (Studio, Ground Truth, Clarify, etc.). The exam tests it as a managed ML service, but for simple tasks, pre-trained services are more appropriate.

Mistake

Amazon Polly can convert speech to text.

Correct

Polly converts text to speech. For speech-to-text, use Amazon Transcribe. The two are easy to confuse because they perform opposite conversions.

Mistake

Amazon Lex only works with voice input.

Correct

Lex supports both text and voice. It can be used for chatbots that accept text messages or voice calls.

Mistake

AI services are free to use.

Correct

All AI services have a pay-as-you-go pricing model. Some have a free tier (e.g., Rekognition free tier: 5,000 images per month for 12 months). Beyond that, you pay per request or per hour.

Frequently Asked Questions

What is the difference between Amazon Rekognition and Amazon Textract?

Amazon Rekognition is for analyzing images and videos to detect objects, faces, scenes, and even text in natural scenes (like a street sign). Amazon Textract is specifically designed to extract text, handwriting, and structured data (tables, forms) from documents like PDFs and scanned images. If you need to read a form or a table, use Textract. If you need to detect objects in a photo, use Rekognition. Exam tip: If the question mentions 'document,' 'form,' or 'table,' the answer is likely Textract.

Do I need to train a model for Amazon Comprehend?

No, Amazon Comprehend is a pre-trained NLP service. You simply send text via API, and it returns sentiment, entities, key phrases, language, and more. However, Comprehend also offers Custom Entity Recognition if you need to identify domain-specific entities (e.g., product codes). For the CLF-C02 exam, treat Comprehend as a pre-trained service. You do not need to train it for standard use cases.

Can Amazon Lex handle both text and voice?

Yes, Amazon Lex supports both text and voice interactions. You can build a chatbot that accepts text messages or integrates with Amazon Connect for voice calls. The underlying technology is the same as Alexa. Lex uses automatic speech recognition (ASR) for voice and natural language understanding (NLU) to interpret intent. The exam may test that Lex supports both modalities.

What is the pricing model for Amazon SageMaker?

Amazon SageMaker pricing is based on the resources you use: you pay for the compute instances (ml.t3, ml.m5, etc.) per hour for training and hosting, plus data storage in S3 and EBS. You also pay for data processing in SageMaker Studio notebooks. There is no upfront cost. You can save money by using Spot Instances for training. The exam does not require exact prices but expects you to know it's pay-as-you-go.

How do AWS AI services handle data privacy?

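AWS AI services are designed with data privacy in mind. You own your data. Some AI services may store and use the content they process to improve the service, and you can opt out of this for your whole organization with an AI services opt-out policy in AWS Organizations. Data is encrypted in transit and at rest, and many services let you use AWS KMS customer-managed keys. Rekognition, for example, distinguishes non-storage operations such as DetectLabels, which do not persist the input image, from storage-based operations such as face collections. For the exam, remember that you keep ownership and control of your content and that an opt-out mechanism exists.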

What is the difference between Amazon Transcribe and Amazon Polly?

Amazon Transcribe converts speech to text (automatic speech recognition). Amazon Polly converts text to speech (text-to-speech). They are opposites. Transcribe is used for transcribing meetings, calls, or videos. Polly is used for creating voiceovers or audio responses. A common exam trap is mixing them up. Remember: Transcribe writes (transcription), Polly speaks.
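The opposite directions are easy to see side by side in code. The Python sketch below assumes boto3 Polly and Transcribe clients are passed in; the helper names, the voice `Joanna`, the job name, and the S3 URI are placeholders.

```python
def text_to_speech(polly, text: str, voice_id: str = "Joanna") -> bytes:
    """Polly: text in, audio bytes out."""
    response = polly.synthesize_speech(Text=text, OutputFormat="mp3", VoiceId=voice_id)
    return response["AudioStream"].read()


def start_speech_to_text(transcribe, job_name: str, media_uri: str,
                         language_code: str = "en-US") -> str:
    """Transcribe: audio file in S3 in, transcription job out (text arrives later)."""
    response = transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        LanguageCode=language_code,
    )
    return response["TranscriptionJob"]["TranscriptionJobStatus"]


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   audio = text_to_speech(boto3.client("polly"), "Your order has shipped.")
#   status = start_speech_to_text(boto3.client("transcribe"),
#                                 "call-0001", "s3://my-bucket/call.mp3")
```

Polly answers immediately with audio, while this Transcribe call starts a batch job whose transcript you fetch once the job completes.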

Can I use Amazon Rekognition with video?

Yes, Amazon Rekognition supports video analysis via the StartLabelDetection, StartFaceDetection, and StartContentModeration APIs. You provide a video file stored in S3, and Rekognition processes it asynchronously. When the job finishes, Rekognition can publish the completion status to an SNS topic, and you retrieve the results with the matching Get operation (for example, GetLabelDetection). For the exam, know that Rekognition can handle both images and videos, but video analysis is asynchronous.
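The asynchronous pattern can be sketched in Python as a start call followed by a status check. The example assumes a boto3 Rekognition client is passed in and uses simple polling to stay short; the helper name and the polling parameters are assumptions, and production code would more likely react to the SNS completion notification.

```python
import time


def wait_for_video_labels(client, bucket, key, poll_seconds=5.0, max_polls=60):
    """Start label detection on a stored video and poll until the job finishes.

    `client` is expected to be a boto3 Rekognition client.
    """
    job_id = client.start_label_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}}
    )["JobId"]

    for _ in range(max_polls):
        result = client.get_label_detection(JobId=job_id)
        status = result["JobStatus"]
        if status == "SUCCEEDED":
            # Each entry pairs a timestamp (ms from video start) with a label.
            # Long videos return results in pages via NextToken (ignored here).
            return [(item["Timestamp"], item["Label"]["Name"]) for item in result["Labels"]]
        if status == "FAILED":
            raise RuntimeError(result.get("StatusMessage", "video analysis failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Compare this with the image API, where a single call returns the labels directly.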
