AWS Certified AI Practitioner AIF-C01 AIF-C01 Questions 826–900 | Page 12/14

826

MCQhard

A data scientist is using Amazon Bedrock to generate product descriptions. They notice the output is often repetitive and lacks creativity. Which combination of parameter adjustments is MOST likely to produce more diverse and less repetitive output?

A.Decrease temperature and decrease top-p

B.Decrease temperature and increase top-p

C.Increase temperature and decrease top-p

D.Increase temperature and increase top-p

AnswerD

Higher temperature flattens probability distribution; higher top-p expands the set of candidate tokens, both promoting diversity.

Why this answer

Increasing temperature and top-p both encourage more diverse sampling. Reducing them would make output more deterministic and repetitive.

Full explanation →

827

MCQhard

A company wants to forecast product demand across thousands of SKUs with different demand patterns. They have 3 years of historical sales data, plus external factors like holidays and promotions. Which combination of AWS services and approach would deliver the most accurate forecasts with minimal manual effort?

A.Use Amazon Comprehend to analyze customer reviews and correlate with sales

B.Upload data to Amazon QuickSight and use its built-in forecasting widget

C.Use Amazon Forecast with the DeepAR+ algorithm and provide item metadata, holiday calendars, and promotion data

D.Train individual ARIMA models for each SKU using Amazon SageMaker built-in algorithms

AnswerC

Forecast is designed for scalable, accurate forecasting with built-in handling of external regressors.

Why this answer

Amazon Forecast is purpose-built for time-series forecasting and automatically handles multiple SKUs, holidays, and promotions. SageMaker would require building custom models from scratch. Comprehend is NLP, and QuickSight is for visualization.

Full explanation →

828

Multi-Selecthard

A company wants to evaluate the performance of a generative AI model before deployment. Which TWO metrics are most relevant for measuring model quality? (Select two.)

Select 2 answers

A.BLEU score

B.Response time

C.Perplexity

D.Model size

E.CPU utilization

AnswersA, C

BLEU evaluates the quality of generated text by comparing n-grams with reference translations.

Why this answer

Options A (BLEU score) and C (Perplexity) are standard for evaluating text generation quality. BLEU measures similarity to reference text, and perplexity measures how well the model predicts a sample. Option B (CPU utilization) is operational, not quality.

Option D (Response time) is latency. Option E (Model size) is a design parameter.

Full explanation →

829

MCQhard

A bank uses an AI system to detect fraudulent transactions. The model has high precision but low recall for small transactions, potentially missing fraud. Which approach aligns with responsible AI?

A.Send all flagged transactions to customers for confirmation

B.Focus only on precision to minimize false positives

C.Tune the model to achieve an acceptable balance between recall and precision

D.Increase the detection threshold to reduce false positives

AnswerC

Balancing metrics is a responsible approach.

Why this answer

Option C is correct because responsible AI requires balancing competing objectives like precision and recall to align with ethical principles and business needs. In fraud detection, high precision with low recall means many fraudulent transactions are missed, which can lead to significant financial losses and erode customer trust. Tuning the model to achieve an acceptable trade-off ensures that the system is both effective and fair, minimizing harm while maintaining operational viability.

Exam trap

The AIF-C01 exam often tests the misconception that increasing the detection threshold improves model performance overall, when in fact it only reduces false positives at the cost of lowering recall, which can be detrimental in high-stakes applications like fraud detection.

How to eliminate wrong answers

Option A is wrong because sending all flagged transactions to customers for confirmation shifts the burden to users, degrades user experience, and may not be scalable or timely for real-time fraud detection, nor does it address the underlying model imbalance. Option B is wrong because focusing only on precision ignores the critical need to catch actual fraud (recall), which can result in substantial financial losses and violates the responsible AI principle of beneficence. Option D is wrong because increasing the detection threshold reduces false positives but further lowers recall, worsening the problem of missed fraud and contradicting the goal of responsible AI.

Full explanation →

830

Multi-Selecthard

A company is evaluating the performance of their question-answering model using Amazon Bedrock's model evaluation feature. They want to assess both the factual accuracy and the fluency of the generated answers. Which THREE metrics should they choose? (Select THREE.)

Select 3 answers

A.ROUGE

B.BLEU

C.BERTScore

D.Human evaluation

E.Exact match

AnswersA, C, D

ROUGE measures n-gram recall, useful for factual overlap in QA.

Why this answer

ROUGE measures overlap for summarization/QA; BERTScore measures semantic similarity; and human evaluation is the gold standard for fluency and factual accuracy. BLEU is for translation; exact match is too strict for open-ended QA.

Full explanation →

831

MCQmedium

An organization needs to generate high-quality images from text prompts for a marketing campaign. They require the ability to edit specific regions of an image (inpainting) and extend images beyond their original boundaries (outpainting). Which AWS service or model should they choose?

A.Stable Diffusion XL via Amazon Bedrock

B.Amazon Rekognition

C.Amazon Titan Image Generator

D.Amazon SageMaker JumpStart with a custom GAN

AnswerC

Titan Image Generator includes features for inpainting and outpainting.

Why this answer

Amazon Titan Image Generator supports inpainting and outpainting natively. Stability AI models on Bedrock primarily generate images from scratch without these editing capabilities.

Full explanation →

832

MCQeasy

A company wants to use a foundation model to automatically summarize lengthy documents. Which capability of foundation models is being utilized?

A.Text generation

B.Sentiment analysis

C.Text classification

D.Machine translation

AnswerA

Summarization is a form of text generation where the model produces concise output.

Why this answer

Summarization is a text generation task where the model produces a concise version of the original content. Foundation models (e.g., GPT, Claude) are pre-trained on vast corpora and can generate coherent summaries by predicting the next tokens conditioned on the input document. This directly utilizes the text generation capability, not classification or translation.

Exam trap

The AIF-C01 exam often tests the distinction between text generation and text classification, so the trap here is that candidates may confuse summarization (a generative task) with classification or analysis tasks, especially when the question emphasizes 'understanding' the document rather than 'producing' new text.

How to eliminate wrong answers

Option B (Sentiment analysis) is wrong because it involves classifying the emotional tone of text (positive, negative, neutral), not generating a summary. Option C (Text classification) is wrong because it assigns predefined labels or categories to text, whereas summarization requires generating new text. Option D (Machine translation) is wrong because it converts text from one language to another, not condensing content within the same language.

Full explanation →

833

MCQmedium

Refer to the exhibit. An AWS CloudTrail log shows the creation of an IAM policy for a SageMaker execution role. Which responsible AI concern does this configuration raise?

A.Insufficient training data

B.Lack of least privilege access control

C.Violation of data residency requirements

D.Absence of model monitoring

AnswerB

The wildcard resource exposes all endpoints to potential misuse.

Why this answer

The policy allows sagemaker:InvokeEndpoint on all resources (*), violating the principle of least privilege. This could allow the role to invoke any SageMaker endpoint, potentially leading to unauthorized inferences. Model monitoring, training data, and data residency are not addressed by this log entry.

Full explanation →

834

Multi-Selecteasy

Which TWO of the following are types of unsupervised learning? (Select TWO.)

Select 2 answers

A.Classification

B.Dimensionality reduction

C.Clustering

D.Reinforcement learning

E.Regression

AnswersB, C

Dimensionality reduction reduces the number of features without using labels.

Why this answer

Clustering and dimensionality reduction are both unsupervised learning tasks, as they do not use labeled data.

Full explanation →

835

MCQmedium

A company uses Amazon Bedrock to generate code snippets for internal tools. They notice that the generated code often contains security vulnerabilities such as SQL injection and cross-site scripting. The security team has compiled a comprehensive list of secure coding guidelines and examples of vulnerable patterns. The development team wants to reduce vulnerabilities without significantly slowing down the code generation process. They have tried adding the guidelines to the system prompt, but the model still produces insecure code occasionally. The team is considering additional measures. Which action should they take to most effectively eliminate security vulnerabilities in the generated code?

A.Implement a post-processing step using Amazon CodeGuru or a similar static analysis tool to scan the generated code for vulnerabilities and reject or fix insecure code.

B.Use a larger, more expensive foundation model that specializes in code generation.

C.Include the complete secure coding guidelines in every prompt.

D.Increase the temperature parameter of the foundation model to promote more diverse outputs.

AnswerA

Correct: Post-processing with static analysis reliably catches vulnerabilities and can be automated without slowing down generation significantly.

Why this answer

Option A is correct because it introduces a deterministic, post-generation validation layer that catches vulnerabilities the model might miss. Amazon CodeGuru Reviewer or similar static analysis tools can scan generated code for patterns like SQL injection and XSS, then reject or fix insecure code without modifying the generation process itself. This approach directly addresses the security team's guidelines while maintaining generation speed, as the model's inference latency is unaffected.

Exam trap

AWS often tests the misconception that prompt engineering alone can fully control model output, when in reality, deterministic post-processing steps are required to enforce strict security or compliance requirements.

How to eliminate wrong answers

Option B is wrong because using a larger, more expensive foundation model does not guarantee elimination of security vulnerabilities; all models can produce insecure code, and size does not correlate with adherence to specific security guidelines. Option C is wrong because including the complete secure coding guidelines in every prompt increases token usage and may cause the model to ignore or truncate the guidelines, leading to inconsistent results and slower generation due to longer prompts. Option D is wrong because increasing the temperature parameter promotes more diverse and random outputs, which would likely increase the probability of generating insecure code rather than reducing it.

Full explanation →

836

MCQmedium

A company uses Amazon Bedrock to generate marketing copy. They want to evaluate the quality of generated text against human-written reference texts using automated metrics. Which metric measures the overlap of n-grams between generated and reference text?

A.Perplexity

B.BLEU

C.ROUGE

D.BERTScore

AnswerC

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) measures n-gram overlap.

Why this answer

ROUGE measures n-gram overlap and is commonly used for summarisation. BLEU is for translation, BERTScore uses embeddings, and Perplexity measures language model confidence.

Full explanation →

837

MCQeasy

A startup needs to predict customer churn based on historical data containing labels (churned or not). Which type of machine learning should they use?

A.Reinforcement learning

B.Unsupervised learning

C.Supervised learning

D.Semi-supervised learning

AnswerC

Since the data has labels, supervised learning is appropriate for classification.

Why this answer

The startup has labeled historical data (churned or not), which is the defining characteristic of supervised learning. The goal is to learn a mapping from input features to the known output labels to predict churn for new customers. This is a classic classification problem, making supervised learning the correct choice.

Exam trap

The AIF-C01 exam often tests the distinction between supervised and unsupervised learning by presenting a scenario with labeled data, where candidates might mistakenly choose unsupervised learning if they overlook the presence of labels.

How to eliminate wrong answers

Option A is wrong because reinforcement learning involves an agent learning through trial-and-error interactions with an environment to maximize cumulative reward, not from labeled historical data. Option B is wrong because unsupervised learning finds hidden patterns or structures in unlabeled data, but here the labels (churned/not) are explicitly provided. Option D is wrong because semi-supervised learning uses a small amount of labeled data with a large amount of unlabeled data, but the problem states the historical data contains labels, implying fully labeled data is available.

Full explanation →

838

MCQhard

A company uses Amazon Rekognition to detect objects in images. They notice that the model frequently misidentifies a rare object as a common background item. What is the most likely cause?

A.Insufficient training epochs

B.Class imbalance in the training data

C.Overfitting on the rare object

D.The confidence threshold is set too high

AnswerB

Rare objects are underrepresented; the model learns to predict the majority class to minimize loss.

Why this answer

Class imbalance in the training data (rare objects underrepresented) causes the model to bias toward the majority class, leading to misclassification of rare objects.

Full explanation →

839

MCQhard

A deployed model on an Amazon SageMaker endpoint is experiencing high inference latency (average 500ms) during peak hours. The model is a deep neural network with 10 million parameters. The endpoint uses a single ml.c5.xlarge instance. The company wants to reduce latency to under 200ms without retraining or changing the model architecture. Which action should they take?

A.Enable automatic scaling to add more instances

B.Switch to a GPU-based instance type like ml.p2.xlarge

C.Deploy the model on a multi-model endpoint

D.Use SageMaker Neo to compile and optimize the model

AnswerD

SageMaker Neo optimizes models for target hardware, significantly reducing inference latency without changing the model.

Why this answer

SageMaker Neo compiles trained models into an optimized format for the target hardware, reducing inference latency without altering the model architecture. For a deep neural network with 10 million parameters on a CPU instance, Neo applies hardware-specific optimizations like operator fusion and memory layout tuning, which can significantly lower latency. This directly addresses the requirement to reduce latency from 500ms to under 200ms without retraining or changing the model.

Exam trap

AWS often tests the misconception that scaling or switching to GPU is the default solution for latency issues, but the trap here is that the question explicitly prohibits retraining or architecture changes, making model compilation via SageMaker Neo the only viable option that directly optimizes inference speed on the existing hardware.

How to eliminate wrong answers

Option A is wrong because automatic scaling adds more instances to handle increased request volume, but it does not reduce per-request latency; it distributes load but each request still processes on a single instance with the same inference time. Option B is wrong because switching to a GPU instance like ml.p2.xlarge may accelerate certain model types but does not guarantee latency reduction for a deep neural network with 10 million parameters, and it introduces higher cost and potential overhead from GPU initialization; the requirement is to reduce latency without retraining or architecture changes, and GPU acceleration often requires model adaptation. Option C is wrong because deploying on a multi-model endpoint is designed to host multiple models on a single endpoint to improve resource utilization, not to reduce inference latency for a single model; it adds container management overhead that could increase latency.

Full explanation →

840

MCQmedium

A company is building a search application that retrieves relevant documents based on semantic meaning rather than exact keyword matches. Which combination of services would BEST enable this capability?

A.Amazon Titan Embeddings model and a vector database

B.Amazon Bedrock and Amazon Kendra

C.Amazon Lex and Amazon ElastiCache

D.Amazon Comprehend and Amazon DynamoDB

AnswerA

Titan Embeddings creates vectors from text, and a vector database enables similarity search for retrieval.

Why this answer

Amazon Titan Embeddings converts text into vector embeddings, and a vector database (like Amazon OpenSearch Serverless or Aurora with pgvector) stores and retrieves these embeddings for semantic search.

Full explanation →

841

Multi-Selectmedium

A data scientist has trained a random forest model that achieves 92% accuracy on the training set but only 75% on the test set. The dataset has 1000 samples and 20 features. Which THREE actions could help improve the model's generalization? (Select THREE.)

Select 3 answers

A.Increase the number of trees in the forest

B.Increase the maximum depth of each tree

C.Reduce the maximum depth of each tree

D.Decrease the number of trees in the forest

E.Increase the minimum number of samples required to split an internal node

AnswersA, C, E

More trees typically improve stability and generalization.

Why this answer

Increasing the number of trees can reduce variance but may help if the forest is too small; however, for random forests, more trees generally improve generalization slightly without overfitting. Reducing max depth limits tree complexity, reducing overfitting. Increasing min samples split prevents leaves from being too specific.

Decreasing the number of trees would reduce model capacity and could worsen fit. Increasing max depth would increase overfitting.

Full explanation →

842

Multi-Selectmedium

A company is building a RAG solution using Amazon Bedrock Knowledge Bases. Which TWO steps are essential in the document ingestion pipeline? (Select TWO.)

Select 2 answers

A.Setting up a model endpoint for real-time inference

B.Creating a Bedrock Guardrail for the documents

C.Generating embeddings for each chunk

D.Chunking documents into smaller segments

E.Fine-tuning the model on the documents

AnswersC, D

Embeddings are needed for vector search to find relevant chunks.

Why this answer

Document ingestion for RAG typically involves chunking documents into smaller pieces and generating embeddings for each chunk to enable vector search.

Full explanation →

843

Multi-Selecteasy

A company wants to log all model invocation requests in Amazon Bedrock for audit and troubleshooting. Which TWO destinations can they configure for invocation logging? (Choose 2)

Select 2 answers

A.Amazon CloudWatch Logs

B.Amazon S3

C.Amazon DynamoDB

D.AWS CloudTrail

E.Amazon Kinesis Data Firehose

AnswersA, B

Invocation logs can be sent to CloudWatch Logs.

Why this answer

Bedrock invocation logging supports S3 and CloudWatch Logs as destinations.

Full explanation →

844

MCQhard

An e-commerce company uses Amazon Personalize to provide product recommendations. The business team observes that the recommendations are dominated by popular items and rarely suggest niche products, even for users with long purchase histories. Which Personalize recipe or configuration change would BEST address this issue?

A.Increase the minimum interaction threshold for item inclusion

B.Decrease the learning rate for the model

C.Switch from aws-user-personalization recipe to aws-popularity-count recipe

D.Use the aws-user-personalization recipe and enable the explore-holdoff feature

AnswerD

This recipe includes automatic popularity-bias reduction; explore-holdoff can further encourage exploration of less popular items.

Why this answer

The user-personalization recipe (aws-user-personalization) uses a popularity-bias reduction mechanism that tends to recommend more personalized and diverse items compared to the popularity-count recipe. The popularity-count recipe explicitly recommends popular items. The other options do not directly affect diversity or personalization.

Full explanation →

845

MCQhard

A data scientist is using Amazon Bedrock to generate product descriptions. The current prompt produces inconsistent results; sometimes the descriptions are too verbose, other times too short. The scientist wants to reduce output variability and set a consistent tone. Which combination of parameters should be adjusted?

A.Set temperature to 0 and top-p to 0.5

B.Increase max tokens to 500 and set stop sequences

C.Lower temperature to 0.1 and set top-p to 0.9

D.Increase temperature to 0.9 and set top-p to 1.0

AnswerC

Low temperature (0.1) reduces randomness, and moderate top-p (0.9) limits token sampling to high-probability tokens, producing more consistent outputs.

Why this answer

Lowering temperature reduces randomness, making outputs more deterministic. Top-p (nucleus sampling) also controls diversity; setting it lower further focuses on high-probability tokens. Together they stabilize output length and style.

Full explanation →

846

MCQeasy

Which component of the Transformer architecture allows the model to weigh the importance of different tokens in the input sequence when generating each output token?

A.Layer normalization

B.Feed-forward network

C.Positional encoding

D.Self-attention mechanism

AnswerD

Self-attention computes attention scores between all token pairs, allowing the model to dynamically weigh token importance.

Why this answer

Self-attention computes attention scores between every pair of tokens, enabling the model to focus on relevant parts of the input.

Full explanation →

847

Multi-Selectmedium

A company wants to use Amazon Bedrock to generate product images. They need to control the style (e.g., watercolor, oil painting) and ensure the images are safe for work (no inappropriate content). Which TWO features should they use? (Select TWO.)

Answer options not yet available.

Why this answer

A style prompt controls the artistic style, and content filtering ensures safe outputs.

Full explanation →

848

MCQmedium

A developer is building an agent using Amazon Bedrock Agents. The agent needs to call an external API to retrieve weather data. What must the developer define to enable this capability?

A.A Knowledge Base with weather documents

B.An action group with an OpenAPI schema and a Lambda function

C.A Bedrock Guardrail to filter the weather data

D.A prompt flow in Bedrock Studio

AnswerB

Action groups define the API operations and the Lambda that executes them.

Why this answer

Action groups encapsulate the API schema and Lambda function that the agent uses to invoke external services. Lambda functions handle the actual API call logic.

Full explanation →

849

MCQhard

A company is deploying a real-time chatbot using Amazon Bedrock and expects high traffic during business hours. They want to minimise inference costs while maintaining low latency. Which combination of strategies would be MOST effective?

A.Increase the context length to handle more conversations per prompt

B.Enable model caching and use a smaller, faster foundation model

C.Use batch inference for all requests and provision a large model

D.Disable content filtering to reduce processing time

AnswerB

Caching reduces redundant compute, and a smaller model reduces per-request cost while meeting latency needs.

Why this answer

Model caching can serve repeated queries quickly without re-invocation, and batch inference is cost-effective for non-real-time workloads but not for real-time. Right-sizing means choosing a model that balances cost and performance.

Full explanation →

850

MCQeasy

A company wants to monitor for malicious activity in their machine learning pipelines, such as unauthorized access to training data or model artifacts. Which AWS service can provide automated threat detection and continuous monitoring?

A.AWS Config

B.Amazon GuardDuty

C.AWS Shield

D.Amazon Inspector

AnswerB

GuardDuty continuously monitors for malicious activity across AWS accounts and workloads.

Why this answer

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior across AWS workloads, including machine learning pipelines. It uses machine learning, anomaly detection, and integrated threat intelligence to identify threats such as unauthorized access to S3 buckets containing training data or model artifacts, without requiring manual intervention.

Exam trap

AWS often tests the distinction between services that monitor for security threats (GuardDuty) versus services that manage compliance (AWS Config), protect against DDoS (AWS Shield), or scan for vulnerabilities (Amazon Inspector), leading candidates to confuse configuration auditing with active threat detection.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for evaluating and auditing resource configurations against compliance rules, not for continuous threat detection or monitoring for malicious activity. Option C is wrong because AWS Shield is a managed Distributed Denial of Service (DDoS) protection service, designed to safeguard against network and transport layer attacks, not for detecting unauthorized access or malicious behavior in ML pipelines. Option D is wrong because Amazon Inspector is a vulnerability management service that scans for software vulnerabilities and unintended network exposure, not for real-time threat detection or monitoring of malicious activity.

Full explanation →

851

MCQeasy

Which AWS service can convert a text document from one language to another?

A.Amazon Polly

B.Amazon Textract

C.Amazon Comprehend

D.Amazon Translate

AnswerD

Translate is the correct service for language translation.

Why this answer

Amazon Translate is a neural machine translation service for translating text between languages.

Full explanation →

852

MCQeasy

An e-commerce company uses a foundation model to generate personalized email subject lines. The marketing team notices that the subject lines sometimes contain product recommendations that are out of stock. Which action would best reduce the generation of out-of-stock recommendations without retraining the model?

A.Implement a post-processing step to replace out-of-stock recommendations with in-stock alternatives.

B.Fine-tune the model on a dataset of past successful subject lines that only include in-stock products.

C.Add a system prompt that explicitly instructs the model to only recommend products that are in stock.

D.Use a retrieval-augmented generation (RAG) approach to retrieve a list of in-stock products and include it in the prompt.

AnswerC

A system prompt can constrain the model's output to follow the instruction, reducing unwanted recommendations.

Why this answer

Option C is correct because adding a system prompt that explicitly instructs the model to only recommend in-stock products directly constrains the model's output at inference time without requiring retraining. This leverages the model's instruction-following capability to filter its generated content based on the provided context, which is a lightweight and immediate solution.

Exam trap

AWS often tests the distinction between inference-time interventions (like prompt engineering) and training-time interventions (like fine-tuning), and the trap here is that candidates may confuse RAG (which retrieves external data but does not enforce constraints) with a system prompt that directly instructs the model, leading them to select D instead of C.

How to eliminate wrong answers

Option A is wrong because post-processing replacement of out-of-stock recommendations with in-stock alternatives is reactive and may introduce irrelevant or incorrect substitutions, failing to prevent the model from generating out-of-stock items in the first place. Option B is wrong because fine-tuning the model requires retraining on a new dataset, which contradicts the question's constraint of 'without retraining the model.' Option D is wrong because while RAG can retrieve a list of in-stock products, including it in the prompt does not guarantee the model will exclusively recommend those items; the model may still generate out-of-stock recommendations from its parametric knowledge, especially if the prompt is not strictly enforced.

Full explanation →

853

MCQeasy

A social media company needs to automatically detect and flag toxic comments in multiple languages. They have a large stream of user comments and require real-time moderation. Which AWS service is best suited for this task?

A.Amazon Lex

B.Amazon Comprehend

C.Amazon Rekognition

D.Amazon Translate

AnswerB

Amazon Comprehend provides built-in sentiment analysis and toxic content detection in multiple languages, suitable for real-time text analysis.

Why this answer

Amazon Comprehend is the correct choice because it is a natural language processing (NLP) service that can perform real-time toxicity detection across multiple languages using its built-in content moderation and custom classification capabilities. It analyzes text streams to identify toxic comments (e.g., hate speech, threats) and integrates with AWS streaming services like Amazon Kinesis for real-time processing.

Exam trap

The trap here is that candidates may confuse Amazon Comprehend's NLP capabilities with Amazon Lex's conversational AI or Amazon Translate's language translation, assuming any language-related service can detect toxicity, but only Comprehend provides the specific text analysis APIs for content moderation.

How to eliminate wrong answers

Option A is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using automatic speech recognition (ASR) and natural language understanding (NLU), not for analyzing text for toxicity. Option C is wrong because Amazon Rekognition is designed for image and video analysis (e.g., object detection, facial recognition), not for processing text comments. Option D is wrong because Amazon Translate is a machine translation service that converts text between languages but does not perform toxicity detection or content moderation.

Full explanation →

854

Multi-Selecteasy

A data science team is building a resume screening model and wants to ensure it does not exhibit gender bias. Which TWO actions are most effective for mitigating bias? (Choose TWO.)

Select 2 answers

A.Apply adversarial debiasing techniques during training.

B.Use a more complex deep learning model.

C.Remove the gender attribute and all correlated features from the dataset.

D.Regularly audit model predictions for disparate impact across genders.

E.Ensure the training dataset has equal numbers of male and female candidates.

AnswersA, D

Adversarial debiasing reduces sensitivity to protected attributes.

Why this answer

Regularly auditing predictions for disparate impact and applying adversarial debiasing are proven techniques. Simply removing attributes may not eliminate bias due to correlated proxies. Balancing datasets is helpful but not sufficient alone.

Complex models do not guarantee fairness.

Full explanation →

855

MCQhard

A financial services company uses a machine learning model to automatically reject credit card transactions suspected of fraud. The model was trained on transaction data from the past two years. Over the last three months, the model's false positive rate has increased significantly, causing legitimate transactions to be declined and leading to customer complaints. The company needs to restore the model's accuracy quickly. Initial analysis shows that the distribution of transaction amounts and locations has shifted compared to the training period. The data science team is under pressure to deploy an update within a week. Which approach should they take to most effectively address the issue while adhering to responsible AI guidelines?

A.Deploy a rule-based system with fixed rules for fraud detection

B.Adjust the decision threshold to reduce false positives without retraining

C.Retrain the model using only the most recent three months of transaction data and evaluate on current distribution

D.Build an ensemble model that combines predictions from the old model and a new model trained on recent data

AnswerC

Retraining on recent data adapts to drift and is straightforward.

Why this answer

The most effective approach is to retrain the model using recent data (last three months) to adapt to the distribution shift, and carefully evaluate for any new biases that may emerge. This directly addresses the drift. Simply adjusting the threshold may not capture new fraud patterns.

Using an ensemble of old and recent models could be complex and may not fully adapt. Deploying a simple rule-based system would be a step backward in capability.

Full explanation →

856

MCQhard

A data scientist observes that a gradient boosting model's performance on the validation set is significantly worse than on the training set. Which adjustment is MOST likely to reduce this gap?

A.Increase the maximum depth of trees

B.Reduce the learning rate and increase the number of estimators

C.Increase the number of features

D.Increase the subsample ratio to 1.0

AnswerB

A lower learning rate makes the model more robust, and more estimators compensate, often reducing overfitting.

Why this answer

Reducing the learning rate and increasing the number of estimators is a common regularization technique that can reduce overfitting in gradient boosting.

Full explanation →

857

MCQmedium

A data scientist is using Amazon SageMaker to train a deep learning model. The training job fails with a 'ResourceLimitExceeded' error. What is the MOST likely cause of this error?

A.The account has reached the limit for concurrent training jobs or instance usage

B.The training data contains corrupted files

C.The model is too large for the chosen instance type

D.The training script has a syntax error

AnswerA

AWS enforces service limits; exceeding them triggers this error.

Why this answer

ResourceLimitExceeded typically means the AWS account has reached a service limit, such as the maximum number of SageMaker training instances allowed. Other causes like insufficient memory or data corruption yield different errors.

Full explanation →

858

MCQeasy

A data scientist wants to use a third-party foundation model from Amazon Bedrock for a generative AI application. The compliance officer needs to understand how the third-party model provider handles data privacy. Where can the data scientist find this information?

A.Contact the model provider directly via email

B.In the Amazon Bedrock service documentation for the specific model provider

C.Use Amazon Macie to analyze the model provider's privacy policy

D.In the AWS Artifact reports

AnswerB

Bedrock documentation includes data privacy information for each model provider, as required by AWS.

Why this answer

AWS provides a service-specific data privacy page for each Bedrock model provider, detailing data handling, retention, and privacy practices. This is part of the Bedrock documentation.

Full explanation →

859

Multi-Selecthard

A data scientist is fine-tuning a foundation model on Amazon Bedrock for a custom summarization task. Which THREE practices should they follow to optimize the fine-tuning process?

Select 3 answers

A.Start with a base model that is already strong in the domain.

B.Use the default hyperparameters without tuning.

C.Use a representative dataset that reflects the target task.

D.Monitor training loss and validation loss to avoid overfitting.

E.Train for as many epochs as possible.

AnswersA, C, D

A good base model reduces training time and improves results.

Why this answer

Starting with a base model that is already strong in the domain (Option A) is correct because it reduces the amount of fine-tuning data and compute required. Amazon Bedrock provides access to various foundation models (e.g., Anthropic Claude, Amazon Titan) that have been pre-trained on diverse corpora; selecting one that is already proficient in the target domain (e.g., legal or medical summarization) means the model's existing knowledge can be adapted with fewer training steps, leading to better performance and lower risk of catastrophic forgetting.

Exam trap

The AIF-C01 exam often tests the misconception that more epochs always improve model performance, when in fact excessive training leads to overfitting, and they expect candidates to recognize that monitoring loss curves and using early stopping are critical practices.

Full explanation →

860

MCQmedium

A company is implementing a Retrieval-Augmented Generation (RAG) pipeline with Amazon Bedrock Knowledge Bases. They need to store vector embeddings for their documents. Which vector store options are natively supported by Bedrock Knowledge Bases?

A.Amazon OpenSearch Serverless, Pinecone, MongoDB Atlas, and Amazon Aurora pgvector

B.Only Pinecone and Amazon OpenSearch Serverless

C.Amazon DynamoDB and Amazon RDS for MySQL

D.Only Amazon OpenSearch Serverless

AnswerA

All four are supported vector store options for Bedrock Knowledge Bases.

Why this answer

Bedrock Knowledge Bases natively integrates with several vector stores, including Amazon OpenSearch Serverless, Pinecone, and MongoDB Atlas. Amazon Aurora pgvector is also supported but through Aurora's PostgreSQL compatibility. All four are valid options.

Full explanation →

861

MCQmedium

A healthcare company is deploying a machine learning model on Amazon SageMaker to analyze patient records. The model requires access to a DynamoDB table containing patient data. Which combination of AWS services and features should the company use to restrict access to only the necessary resources?

A.Attach a DynamoDB resource-based policy to the table allowing access from the SageMaker notebook

B.Create an IAM role with a policy granting read-only access to the specific DynamoDB table and attach it to the SageMaker notebook instance

C.Store AWS access keys in the notebook and use those credentials to access DynamoDB

D.Launch the SageMaker notebook in a VPC with a security group that allows access to DynamoDB

AnswerB

This follows least-privilege principle and uses temporary credentials via IAM roles.

Why this answer

Option B is correct because it follows the AWS principle of least privilege by creating an IAM role with a policy that grants read-only access to the specific DynamoDB table, then attaching that role to the SageMaker notebook instance. This ensures the notebook can only perform read operations on the required table without exposing long-term credentials or granting broader permissions.

Exam trap

The AIF-C01 exam often tests the misconception that DynamoDB supports resource-based policies like S3 bucket policies, but in reality DynamoDB only uses IAM identity-based policies for access control.

How to eliminate wrong answers

Option A is wrong because DynamoDB does not support resource-based policies; access control is managed exclusively through IAM policies, not by attaching policies directly to the table. Option C is wrong because storing AWS access keys in the notebook violates security best practices by introducing long-term credentials that can be leaked or misused, and SageMaker notebooks should use IAM roles for temporary credentials. Option D is wrong because a VPC with a security group controls network-level traffic but does not authenticate or authorize the SageMaker notebook to access DynamoDB; DynamoDB access requires IAM permissions regardless of network configuration.

Full explanation →

862

MCQmedium

A data scientist is evaluating foundation models for a text summarization task and wants to use a standard metric. Which metric is commonly used to assess the quality of generated summaries?

A.F1 score

B.ROUGE

C.BLEU

D.Accuracy

AnswerB

ROUGE measures recall-based overlap for summaries.

Why this answer

ROUGE is the standard metric for summarization, measuring overlap of n-grams. Option A (Accuracy) is for classification. Option C (BLEU) is for translation.

Option D (F1 score) is for classification.

Full explanation →

863

MCQhard

A media company uses Amazon Bedrock to generate image captions. They notice that the output quality degrades when the input image contains text in non-Latin scripts. Which model type is MOST likely being used, and what is the likely cause?

A.Anthropic Claude 3 Sonnet; its OCR performance on non-Latin scripts is limited

B.Amazon Titan Image Generator; it is not designed for caption generation

C.Meta Llama 3 70B; it does not accept image inputs

D.Anthropic Claude 3 Sonnet; the image resolution is too low for its vision encoder

AnswerA

Claude 3 Sonnet is multimodal and can caption images, but its OCR for non-Latin scripts may be weak, leading to degraded caption quality.

Why this answer

Claude 3 Sonnet is a multimodal model that can process images and text; however, its OCR capability for non-Latin scripts may be limited. Titan Image Generator only generates images, not captions. Llama 3 is text-only.

The degradation is not due to resolution but due to the model's underlying OCR training data.

Full explanation →

864

MCQhard

A healthcare company is using Amazon SageMaker to train and deploy a model that predicts patient readmission risk. The model uses sensitive protected health information (PHI). The company must ensure that data is encrypted at rest and in transit, and that access to the model endpoint is restricted to authorized applications only. The security team has configured AWS KMS customer managed keys for encryption, and IAM roles for SageMaker execution. However, during a security audit, it was discovered that the model endpoint is accessible from the internet and that the data used for training was stored in an S3 bucket with default encryption enabled. The compliance team requires that all PHI data be encrypted with a key that is rotated annually, and that no public access is allowed to the endpoint or training data. Which combination of actions should the ML engineer take to remediate these issues?

A.Use a SageMaker notebook instance with a lifecycle configuration to encrypt data with a customer managed KMS key, and restrict endpoint access using an IAM policy.

B.Enable S3 bucket encryption with SSE-S3, attach a bucket policy denying public access, and use an AWS Lambda function to rotate the S3 bucket key every year.

C.Apply SSE-KMS with an AWS managed key to the S3 bucket, and use a Lambda function to rotate the key every year. Disable public access to the endpoint using a VPC endpoint.

D.Enable S3 bucket encryption with a customer managed KMS key, disable public access on the SageMaker endpoint by deploying it in a VPC, and configure the KMS key to rotate annually.

AnswerD

Correct: Addresses all requirements with customer managed key, VPC endpoint, and key rotation.

Why this answer

Option D is correct because it addresses all compliance requirements: enabling S3 bucket encryption with a customer managed KMS key ensures PHI is encrypted at rest with a key that can be rotated annually, deploying the SageMaker endpoint in a VPC removes public internet access, and configuring annual KMS key rotation satisfies the rotation policy. This combination ensures encryption at rest and in transit (via VPC), restricts endpoint access to authorized applications only, and meets the key rotation requirement.

Exam trap

The trap here is that candidates confuse 'disabling public access' with 'using a VPC endpoint'—a VPC endpoint only allows private access to the endpoint from within the VPC, but the endpoint itself remains publicly accessible unless it is deployed inside a VPC with no internet gateway.

How to eliminate wrong answers

Option A is wrong because a SageMaker notebook instance with a lifecycle configuration does not encrypt data at rest in S3 or the endpoint, and restricting endpoint access via an IAM policy alone does not prevent public internet access—network-level controls like VPC are required. Option B is wrong because SSE-S3 uses AWS-managed keys that cannot be rotated annually by the customer, and a Lambda function cannot rotate an SSE-S3 key (S3 manages it automatically); also, it does not address endpoint public access. Option C is wrong because using an AWS managed key (SSE-KMS with AWS managed key) does not allow customer-controlled annual rotation—only customer managed KMS keys support customer-initiated rotation; additionally, disabling public access to the endpoint via a VPC endpoint is insufficient—the endpoint itself must be deployed in a VPC to remove internet exposure.

Full explanation →

865

MCQhard

A company is using Amazon Bedrock to run batch inference on millions of customer support transcripts for sentiment analysis. Which approach is MOST cost‑effective and fastest for this workload?

A.Use Bedrock batch inference to process the transcripts asynchronously

B.Use the Bedrock real‑time API with on‑demand throughput

C.Enable model caching on the real‑time API to reuse previous responses

D.Use the Bedrock real‑time inference API with a provisioned throughput model

AnswerA

Batch inference processes large datasets asynchronously, optimizing cost and time by batching requests without needing always‑on capacity.

Why this answer

Batch inference processes large volumes asynchronously, reducing costs compared to real‑time calls. Model caching can help if the same inputs recur, but batch is more efficient for one‑time processing. Provisioned throughput is expensive for sporadic large batches.

Realtime API is costly at scale.

Full explanation →

866

Multi-Selectmedium

A company is using Amazon Bedrock to generate marketing copy. They want to ensure the output is safe and appropriate. Which TWO actions should they take? (Choose 2.)

Select 2 answers

A.Enable content filtering with guardrails

B.Set temperature to 0 for deterministic output

C.Use model fine-tuning with unsafe examples

D.Use a private endpoint for Bedrock

E.Implement human review of all generated content

AnswersA, E

Guardrails can block harmful or inappropriate content automatically.

Why this answer

Options A and D are correct. Guardrails filter content in real time, and human review catches subtle issues. Option B (fine-tuning with unsafe examples) could introduce bias.

Option C (low temperature) reduces creativity but does not ensure safety. Option E (private endpoint) addresses networking, not content safety.

Full explanation →

867

MCQeasy

A developer wants to generate product description images using Amazon Bedrock. They need to ensure the generated images match a specific brand style. Which feature should they primarily use?

A.Prompt engineering with detailed style descriptions.

B.Output grounding to verify brand compliance.

C.Data augmentation to increase dataset diversity.

D.Fine-tuning the image generation model on brand assets.

AnswerA

Prompt engineering is the simplest way to steer image generation toward a desired style.

Why this answer

Option A is correct because prompt engineering allows the developer to specify style guidelines in the text prompt, influencing the output. Option B is wrong because fine-tuning for image style is time-consuming. Option C is wrong because grounding is for text, not images.

Option D is wrong because data augmentation is not directly relevant.

Full explanation →

868

MCQeasy

What is the primary purpose of chunking in a Retrieval-Augmented Generation (RAG) pipeline?

A.To ensure each document segment is small enough to be meaningfully embedded and retrieved

B.To reduce the number of API calls to the embedding model

C.To encrypt the documents before embedding

D.To train the embedding model on domain-specific data

AnswerA

Correct. Chunking enables precise retrieval and prevents truncation of relevant content.

Why this answer

Chunking splits large documents into smaller, manageable pieces that can be individually embedded and retrieved. This ensures that the retrieved context is focused and fits within the model's context window.

Full explanation →

869

MCQhard

An organization wants to use Amazon Rekognition to analyze images of people for a security application. They must comply with GDPR. What is the best practice?

A.Store images indefinitely for audit

B.Use celebrity recognition

C.Ensure all images are anonymized before analysis

D.Use face detection only

AnswerC

Anonymizing images (e.g., blurring faces) helps comply with privacy regulations like GDPR.

Why this answer

Option C is correct because GDPR requires that personal data, including facial images, be processed lawfully and with appropriate safeguards. Anonymizing images before analysis with Amazon Rekognition ensures that the data cannot be linked back to an identifiable person, thereby reducing GDPR compliance risk. This aligns with the principle of data minimization and privacy by design.

Exam trap

AWS often tests the misconception that using a specific feature like celebrity recognition or face detection alone automatically satisfies compliance requirements, when in fact GDPR mandates data anonymization or pseudonymization as a best practice for processing biometric data.

How to eliminate wrong answers

Option A is wrong because storing images indefinitely violates GDPR's data retention limitation principle, which mandates that personal data be kept no longer than necessary for the processing purpose. Option B is wrong because celebrity recognition is designed to identify known public figures and does not address GDPR compliance for general image analysis; it may still process personal data without anonymization. Option D is wrong because face detection alone still processes biometric data that can be used to identify individuals, and without anonymization, it does not meet GDPR requirements for lawful processing.

Full explanation →

870

MCQmedium

A machine learning team is training a model using Amazon SageMaker with data stored in an S3 bucket. The security policy requires that all data be encrypted at rest and in transit, and that the training job cannot access the internet. Which combination of settings should the team use?

A.Enable Network Isolation and use an S3 VPC Endpoint

B.Disable Network Isolation, place the training job in a private VPC subnet, and use an S3 VPC Endpoint

C.Enable Network Isolation and use a VPC with a NAT gateway

D.Disable Network Isolation and use a public VPC subnet

AnswerB

This allows the job to access S3 via the VPC endpoint without internet access, meeting the no-internet requirement.

Why this answer

SageMaker training jobs can be run in a VPC without internet access by disabling 'Enable Network Isolation' (which actually blocks all network access) and using a VPC with no NAT gateway or internet gateway. KMS encryption for both S3 and the attached ML storage volume ensures encryption at rest.

Full explanation →

871

MCQmedium

A developer is using Amazon Bedrock Knowledge Bases to build a Q&A chatbot. The knowledge base contains PDF documents that are ingested and chunked. After ingestion, the chatbot sometimes returns irrelevant answers. What is the most likely cause?

A.The chunking size is too large, causing chunks to contain multiple topics

B.The embedding model is too large

C.The vector store index type is incorrect

D.The LLM temperature is set too low

AnswerA

Large chunks dilute relevance; smaller, focused chunks improve retrieval precision.

Why this answer

If chunking size is too large, each chunk may contain multiple topics, causing the vector search to retrieve chunks that are not precisely relevant to the query.

Full explanation →

872

MCQmedium

A developer invoked an Amazon Bedrock model and received the following error: 'ValidationException: 1 validation error detected: Value 'claude-instant-v1' at 'modelId' failed to satisfy constraint: Member must satisfy enum value set: [ai21.j2-mid-v1, amazon.titan-text-lite-v1, anthropic.claude-v2, ...]'. What is the likely cause?

A.The Lambda function does not have the necessary IAM permissions

B.The modelId is not available in the current AWS region

C.The modelId is not part of the allowed enum of models for the account

D.The modelId is deprecated and has been renamed

AnswerC

The error explicitly states the value must satisfy the enum set, meaning the model ID is invalid or not in the allowed list.

Why this answer

Option C is correct because the error indicates the modelId 'claude-instant-v1' is not in the allowed enum set. This is usually because the model ID is incorrectly spelled or not available in this region/account. Option A (deprecated) would give a different message.

Option B (region availability) would mention region. Option D (permissions) would be a different error type.

Full explanation →

873

MCQeasy

An organization wants to document their model's intended use, limitations, performance metrics, and ethical considerations. Which tool or practice is designed specifically for this purpose?

A.Amazon SageMaker Clarify

B.AWS CloudTrail

C.Amazon SageMaker Model Monitor

D.Model cards

AnswerD

Model cards are specifically designed to document model details in a transparent and standardized format.

Why this answer

Model cards are standardized documentation templates that include details like intended use, performance, limitations, and ethical considerations, promoting transparency.

Full explanation →

874

MCQhard

A team is deploying a generative AI model for medical report generation. They must ensure patient data privacy and comply with HIPAA. Which AWS service feature is essential for de-identifying protected health information (PHI) before sending data to a foundation model?

A.AWS CloudHSM

B.Amazon Comprehend Medical

C.Amazon Macie

D.AWS Key Management Service (AWS KMS)

AnswerB

Comprehend Medical provides PHI detection and de-identification.

Why this answer

Amazon Comprehend Medical is the correct service because it is specifically designed to extract and de-identify protected health information (PHI) from unstructured medical text using natural language processing (NLP). It can detect entities such as patient names, dates, and medical record numbers, and then redact or replace them before the data is sent to a foundation model, ensuring HIPAA compliance.

Exam trap

The trap here is that candidates confuse general data protection services like Macie or encryption services like KMS with the specialized PHI de-identification capability of Amazon Comprehend Medical, assuming any security service can handle HIPAA compliance for generative AI workflows.

How to eliminate wrong answers

Option A is wrong because AWS CloudHSM provides hardware security modules (HSMs) for cryptographic key storage and operations, but it does not perform data de-identification or PHI detection. Option C is wrong because Amazon Macie is a data security service that discovers and protects sensitive data using machine learning and pattern matching, but it is designed for data classification and access control, not for de-identifying PHI in unstructured text for downstream AI processing. Option D is wrong because AWS Key Management Service (AWS KMS) manages encryption keys for data at rest and in transit, but it does not have the capability to identify or remove PHI from text content.

Full explanation →

875

MCQeasy

A company uses Amazon Rekognition for facial analysis. They want to ensure the model doesn't exhibit bias based on skin tone. What should they do?

A.Ensure the training dataset includes diverse skin tones

B.Apply data augmentation to increase dataset size

C.Use a larger neural network

D.Use a pre-trained model from AWS Marketplace

AnswerA

Balanced representation mitigates bias.

Why this answer

Option D is correct: Training on diverse data reduces bias. Option A is wrong: Network size does not address bias. Option B is wrong: Data augmentation does not guarantee diversity.

Option C is wrong: Pre-trained models may have inherent bias.

Full explanation →

876

Multi-Selecthard

A data science team is evaluating a text classification model built using Amazon Bedrock. They want to compare its performance against a baseline using automated metrics. Which three metrics are appropriate for evaluating a text classification model? (Choose THREE.)

Select 3 answers

A.BLEU

B.Recall

C.ROUGE

D.Precision

E.Accuracy

AnswersB, D, E

Recall measures the proportion of actual positives correctly identified, key for classification.

Why this answer

Accuracy, precision, recall, and F1-score are standard classification metrics. ROUGE and BLEU are for generation tasks, not classification.

Full explanation →

877

MCQmedium

A developer encounters the error shown above when using Amazon Bedrock. What is the most likely cause?

A.The model is not available in the region

B.The IAM role lacks the required permission

C.The request is throttled

D.The model is out of service

AnswerB

The error explicitly states the role is not authorized for the action.

Why this answer

The error indicates an access denied or authorization failure when invoking the Amazon Bedrock model. The most likely cause is that the IAM role used by the developer does not have the required permission, such as `bedrock:InvokeModel`, attached to its policy. Without this permission, the API call to Bedrock is rejected regardless of model availability or service status.

Exam trap

AWS often tests the distinction between service availability errors and authorization errors, so the trap here is that candidates may confuse a permissions failure with a model unavailability or throttling issue, especially when the error message is generic.

How to eliminate wrong answers

Option A is wrong because if the model were not available in the region, the error would typically be a `ModelNotFoundException` or `ValidationException`, not an access denied error. Option C is wrong because throttling errors return a `ThrottlingException` with HTTP 429 status code, not an authorization error. Option D is wrong because if the model were out of service, the error would be a `ServiceUnavailableException` or `ModelNotReadyException`, not a permissions-related error.

Full explanation →

878

MCQhard

A financial services firm is deploying a loan approval model and must comply with the EU AI Act, which classifies credit scoring as a high-risk AI system. Which combination of actions is required for such high-risk systems under the regulation?

A.Conduct a fundamental rights impact assessment and implement a human-in-the-loop review process

B.Implement Amazon Rekognition to monitor model inputs and outputs

C.Use SageMaker Model Monitor to detect data drift and retrain automatically

D.Obtain an ISO 27001 certification and deploy the model on AWS Outposts

AnswerA

The EU AI Act mandates human oversight for high-risk systems, and a fundamental rights impact assessment is required for credit scoring models used by financial institutions.

Why this answer

The EU AI Act requires high-risk systems to have human oversight (Article 14), technical documentation, risk management, and transparency. Among the options, only the one including human review and documentation meets these requirements.

Full explanation →

879

MCQhard

A developer deployed this guardrail to block sensitive topics and sexual content. However, the model still generates responses about a specific sensitive topic that is not in the TopicPolicy. What should the developer do to prevent this?

A.Add a SensitiveInformationPolicy to filter PII

B.Increase the InputStrength of the content filter to MAX

C.Change the TopicPolicy Type from DENY to ALLOW

D.Add the specific topic to the TopicPolicy list

AnswerD

Adding the topic to the TopicPolicy with Type DENY will block it.

Why this answer

The guardrail's TopicPolicy only blocks the defined topic 'sensitive-topic'. To block additional topics, add them to the list. Option A (change type) would allow.

Option B (SensitiveInformationPolicy) is for PII. Option C (increase strength) does not add topics.

Full explanation →

880

MCQmedium

A company is deploying a generative AI application for customer support. They need to ensure that the model does not generate responses containing personally identifiable information (PII) even if it appears in the retrieved context. Which Bedrock feature should they configure?

A.Bedrock Knowledge Bases

B.Bedrock Agents

C.Bedrock Model Evaluation

D.Bedrock Guardrails

AnswerD

Guardrails can detect and redact PII in both input prompts and model responses, ensuring compliance.

Why this answer

Bedrock Guardrails include PII detection and redaction. Knowledge Bases retrieve context but do not filter PII. Agents orchestrate tasks but do not enforce PII rules.

Model evaluation tests performance but does not provide runtime controls.

Full explanation →

881

MCQhard

A company is deploying a machine learning model for real-time fraud detection. The model must have latency under 100ms. Which infrastructure choice is most appropriate?

A.Amazon SageMaker real-time endpoints

B.Amazon EC2 with Deep Learning AMI

C.Amazon SageMaker batch transform

D.Amazon SageMaker notebook instance

AnswerA

Real-time endpoints provide low-latency inference with automatic scaling.

Why this answer

Amazon SageMaker real-time endpoints are designed for low-latency inference, typically in the tens of milliseconds, making them suitable for real-time fraud detection where latency must be under 100ms. They deploy a model behind a persistent HTTPS endpoint that auto-scales to handle incoming requests with minimal delay.

Exam trap

The trap here is that candidates often confuse batch transform with real-time inference, assuming that any SageMaker inference capability can serve low-latency requests, but batch transform is explicitly asynchronous and designed for high-throughput, not low-latency.

How to eliminate wrong answers

Option B is wrong because Amazon EC2 with Deep Learning AMI requires manual setup of the inference server, scaling, and load balancing, which introduces operational overhead and cannot guarantee sub-100ms latency without significant custom engineering. Option C is wrong because Amazon SageMaker batch transform is designed for asynchronous, offline inference on large datasets, not for real-time, low-latency predictions. Option D is wrong because Amazon SageMaker notebook instance is an interactive development environment for building and testing models, not a production inference endpoint.

Full explanation →

882

MCQhard

A team is using Amazon Bedrock to generate images from text prompts. The generated images often contain artifacts and do not match the prompt description. Which combination of steps should the team take to improve image quality?

A.Fine-tune the model using SageMaker Ground Truth and increase the training epochs.

B.Increase the max token count and use a larger model variant.

C.Refine the prompt with more descriptive language and adjust the CFG scale and inference steps.

D.Use a different foundation model and increase the image resolution.

AnswerC

Better prompts and tuning inference parameters directly improve image quality.

Why this answer

Option C is correct because refining the prompt with more descriptive language helps the model better interpret the user's intent, while adjusting the CFG (Classifier-Free Guidance) scale controls how strictly the model adheres to the prompt, and increasing inference steps allows the diffusion process to produce higher-quality, artifact-free images. These are standard hyperparameters in diffusion-based image generation models on Amazon Bedrock, directly addressing both artifacts and prompt mismatch.

Exam trap

AWS often tests the misconception that image quality issues are best solved by model retraining or changing the model, rather than by adjusting inference-time parameters like CFG scale and inference steps, which are the immediate and correct levers for prompt adherence and artifact reduction.

How to eliminate wrong answers

Option A is wrong because fine-tuning a model using SageMaker Ground Truth and increasing training epochs is a data labeling and retraining approach that is overkill and not directly applicable to improving inference-time image quality for a pre-trained Bedrock model; it also does not address prompt adherence or artifact reduction. Option B is wrong because increasing the max token count and using a larger model variant does not fix artifacts or prompt mismatch—max token count affects text generation length, not image quality, and a larger model may not inherently improve prompt alignment without prompt engineering. Option D is wrong because using a different foundation model and increasing image resolution may change output characteristics but does not systematically address artifacts or prompt mismatch; higher resolution can even amplify artifacts if the underlying generation process is not optimized.

Full explanation →

883

MCQeasy

A data scientist is using Amazon SageMaker to train a large language model from scratch. Which AWS service is most suitable for managing the training infrastructure, including automatic scaling and spot instance recovery?

A.AWS Lambda function.

B.Amazon SageMaker Notebook instance.

C.Amazon SageMaker Training job.

D.Amazon EC2 with a custom setup.

AnswerC

SageMaker Training manages infrastructure, automatically recovers from spot interruptions, and scales.

Why this answer

Amazon SageMaker Training jobs are the most suitable service for managing training infrastructure because they provide built-in automatic scaling, managed spot instance recovery, and distributed training orchestration. This allows the data scientist to focus on model development rather than provisioning and managing EC2 instances, load balancers, or recovery scripts.

Exam trap

The AIF-C01 exam often tests the distinction between managed services (SageMaker Training) and unmanaged services (EC2 custom setup), where candidates mistakenly choose EC2 thinking they need full control, overlooking SageMaker's built-in spot recovery and scaling capabilities.

How to eliminate wrong answers

Option A is wrong because AWS Lambda functions are serverless compute services designed for short-running, event-driven tasks (max 15-minute execution time) and cannot manage long-running training jobs or infrastructure scaling. Option B is wrong because Amazon SageMaker Notebook instances are interactive development environments for prototyping and exploration, not designed to manage production training infrastructure or handle automatic scaling and spot instance recovery. Option D is wrong because Amazon EC2 with a custom setup requires manual provisioning, configuration of auto-scaling groups, and custom scripts for spot instance interruption handling, which is less efficient and more error-prone than SageMaker's managed training service.

Full explanation →

884

MCQhard

Refer to the exhibit. A SageMaker real-time endpoint is experiencing increasing latency and memory errors after running for a few hours. What is the most likely cause and recommended fix?

A.Scale the endpoint to a larger instance type, such as ml.r5.large

B.Enable auto-scaling to add instances during high load

C.Use SageMaker Debugger to identify and fix a memory leak in the inference code

D.Use SageMaker Model Monitor to detect data drift

AnswerC

The increasing memory usage over time indicates a leak; Debugger can help identify the issue.

Why this answer

Option C is correct because the symptoms—increasing latency and memory errors after running for a few hours—point to a memory leak in the inference code. SageMaker Debugger can monitor system metrics like memory utilization and detect anomalies, helping to identify the root cause of the leak. Fixing the memory leak directly resolves the progressive degradation, whereas scaling or auto-scaling only masks the symptom.

Exam trap

The AIF-C01 exam often tests the distinction between scaling solutions (which address capacity) and debugging tools (which address code defects), trapping candidates who confuse symptom relief with root cause resolution.

How to eliminate wrong answers

Option A is wrong because scaling to a larger instance type (e.g., ml.r5.large) provides more memory but does not address the underlying memory leak; the leak will eventually exhaust the larger memory pool as well. Option B is wrong because enabling auto-scaling adds more instances to handle load, but it does not fix the memory leak in the inference code; each instance will still experience the same progressive memory exhaustion. Option D is wrong because SageMaker Model Monitor detects data drift (changes in input data distribution), not memory leaks or latency issues caused by code defects.

Full explanation →

885

MCQeasy

Which AWS service can extract text from scanned PDF documents, including handwriting and checkboxes?

A.Amazon Transcribe

B.Amazon Comprehend

C.Amazon Textract

D.Amazon Rekognition

AnswerC

Textract is designed to extract text and data from documents, including handwriting and forms.

Why this answer

Amazon Textract uses machine learning to extract text, handwriting, and form data from scanned documents. Rekognition is for images/video analysis, Comprehend for NLP, Transcribe for speech-to-text.

Full explanation →

886

MCQeasy

A marketing agency uses a foundation model to generate images for social media campaigns. Some generated images have contained violent or inappropriate content, damaging the brand. The agency needs to prevent such content from being displayed automatically. They are using Amazon Bedrock for image generation with Stable Diffusion. What is the most effective way to filter out inappropriate images?

A.Use Amazon Rekognition to analyze images after generation.

B.Manually review all images before posting.

C.Restrict the prompt to avoid triggering keywords.

D.Enable the safety checker in Amazon Bedrock's image generation models.

AnswerD

Built-in safety checker filters out inappropriate images without additional overhead.

Why this answer

Option B is correct because Stable Diffusion models in Bedrock include a safety checker that can detect and block NSFW content before output. Option A (Amazon Rekognition) introduces additional cost and latency. Option C (manual review) is not scalable.

Option D (restrict prompt) is unreliable as the model can still generate inappropriate content from safe prompts.

Full explanation →

887

MCQhard

An e-commerce company is using a foundation model to generate product descriptions. They want to reduce costs by caching frequently requested descriptions. Which AWS service should they use to implement a cache?

A.Amazon CloudFront

B.Amazon DynamoDB

C.Amazon S3

D.Amazon ElastiCache

AnswerD

ElastiCache provides low-latency caching for frequently used data.

Why this answer

Amazon ElastiCache is the correct choice because it provides an in-memory caching layer (using Redis or Memcached) that can store frequently requested product descriptions, reducing the need to invoke the foundation model repeatedly. This directly lowers inference costs and latency by serving cached responses instead of generating new ones each time.

Exam trap

The AIF-C01 exam often tests the distinction between caching at the application layer (ElastiCache) versus caching at the content delivery layer (CloudFront), leading candidates to mistakenly choose CloudFront for any caching need.

How to eliminate wrong answers

Option A is wrong because Amazon CloudFront is a content delivery network (CDN) that caches static and dynamic content at edge locations, but it is not designed for application-level caching of model-generated text; it caches HTTP responses, not arbitrary key-value data. Option B is wrong because Amazon DynamoDB is a fully managed NoSQL database optimized for high-throughput, low-latency reads and writes, but it is not a caching service; using it as a cache would incur higher costs and lack native TTL-based eviction policies for transient data. Option C is wrong because Amazon S3 is an object storage service for storing large amounts of unstructured data, not a low-latency cache; retrieving descriptions from S3 would introduce significant latency compared to an in-memory cache, defeating the purpose of cost reduction.

Full explanation →

888

MCQmedium

A company uses Amazon Bedrock to generate marketing content. They want to reduce costs while maintaining response quality. Which action is most effective?

A.Fine-tune a larger model to improve accuracy and reduce retries.

B.Increase the temperature parameter to get shorter responses.

C.Select a smaller foundation model that still meets accuracy requirements.

D.Cache previous responses to reuse for similar prompts.

AnswerC

Smaller models have lower per-token costs and are faster.

Why this answer

Option D is correct because selecting a smaller, efficient foundation model can reduce cost per token while maintaining quality for simple tasks. Option A is wrong because increasing temperature does not reduce cost. Option B is wrong because caching may not be effective for variable outputs.

Option C is wrong because fine-tuning increases cost.

Full explanation →

889

MCQhard

A financial institution is deploying a fraud detection model using Amazon SageMaker. The model must be able to handle sudden spikes in inference requests during promotional events while keeping costs low. The team wants to use a serverless architecture to avoid provisioning idle capacity and to scale automatically from zero. However, the inference latency requirement is under 5 seconds for each request. Which SageMaker inference option should they choose?

A.Use Amazon SageMaker Serverless Inference

B.Use Amazon SageMaker Multi-Model Endpoints

C.Use Amazon SageMaker real-time endpoints with auto-scaling

D.Use Amazon SageMaker Asynchronous Inference

AnswerA

Serverless Inference scales automatically from zero and reduces costs during idle periods.

Why this answer

Amazon SageMaker Serverless Inference is the correct choice because it automatically scales from zero to handle sudden spikes in inference requests, aligning with the requirement to avoid provisioning idle capacity. It also meets the sub-5-second latency requirement for fraud detection, as it is designed for low-latency, on-demand inference without managing underlying infrastructure.

Exam trap

AWS often tests the misconception that serverless inference cannot meet low-latency requirements, but SageMaker Serverless Inference is specifically designed for sub-second to few-second latency, making it suitable for real-time fraud detection scenarios.

How to eliminate wrong answers

Option B is wrong because Multi-Model Endpoints require provisioned instances and do not scale from zero; they are designed to host multiple models on a single endpoint but still incur costs for idle capacity. Option C is wrong because real-time endpoints with auto-scaling still require a baseline of provisioned instances, which can lead to idle capacity costs during low-traffic periods, and they do not scale from zero. Option D is wrong because Asynchronous Inference is intended for large payloads and longer processing times (typically minutes), not for sub-5-second latency requirements, and it queues requests rather than providing real-time responses.

Full explanation →

890

MCQmedium

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

A.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

B.Use a larger foundation model with a longer context window and paste all documents into each prompt

C.Fine-tune a base LLM on the policy documents monthly

D.Train a custom model from scratch on the policy documents each month

AnswerA

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

RAG (Retrieval-Augmented Generation) allows the LLM to retrieve relevant document sections at inference time, so knowledge stays current without retraining. The other options either require expensive retraining for each update or lack document grounding.

Full explanation →

891

MCQeasy

What is the main advantage of using the Amazon Bedrock Converse API over the InvokeModel API for building conversational applications?

Answer options not yet available.

Why this answer

The Converse API is designed for multi-turn conversations, automatically managing conversation history and context.

Full explanation →

892

MCQmedium

A developer is building a multi-step reasoning agent using Amazon Bedrock Agents. The agent needs to first check inventory levels via a database query, then call a shipping API to calculate delivery dates, and finally compose a response. How should the developer define the tool integrations?

A.Define two separate Lambda functions and let the agent orchestrate them via a custom step function

B.Use Amazon SageMaker Pipelines to orchestrate the steps

C.Create action groups for the database query and the shipping API, and configure the agent to use them in its orchestration

D.Embed the logic in a single Lambda function that calls both the database and shipping API

AnswerC

Action groups are the correct abstraction for tool integration in Bedrock Agents.

Why this answer

Action groups in Bedrock Agents encapsulate the tools (APIs, databases) that the agent can invoke. The agent's reasoning engine decides the sequence of calls.

Full explanation →

893

MCQhard

A healthcare organization wants to use generative AI to draft clinical notes from patient-physician conversations. They must comply with HIPAA and minimize false medical information. Which approach should they take?

A.Use Amazon SageMaker JumpStart with a publicly available clinical model and no additional modifications.

B.Use a generic open-source LLM hosted on Amazon EC2 with manual prompt engineering.

C.Use Amazon Bedrock with a HIPAA-eligible foundation model and connect it to a medical knowledge base via RAG.

D.Use Amazon Bedrock with a large foundation model and a high temperature setting for creativity.

AnswerC

Ensures compliance and accuracy through grounding on trusted medical sources.

Why this answer

Option A is correct because Amazon Bedrock offers HIPAA-eligible models and allows grounding with medical knowledge bases to reduce hallucinations. Option B is wrong because it does not use grounding. Option C is wrong because open-source LLMs may not be HIPAA-compliant.

Option D is wrong because increasing temperature introduces more randomness, worsening accuracy.

Full explanation →

894

Multi-Selectmedium

Which THREE of the following are factors to consider when selecting a foundation model for a text generation task?

Select 3 answers

A.Supported output modalities

B.Pricing per token

C.Model size (parameters)

D.Training data source and diversity

E.Availability of automatic scaling

AnswersB, C, D

Cost per token affects operational expense.

Why this answer

Pricing per token is a critical factor because foundation model APIs (e.g., Amazon Bedrock, OpenAI) charge based on the number of input and output tokens. For text generation tasks, token costs directly impact operational budgets, especially for high-volume or long-context applications. Selecting a model with lower per-token pricing can significantly reduce inference costs without sacrificing quality.

Exam trap

AWS often tests the distinction between model-level attributes (e.g., token pricing, training data, parameter count) and platform-level operational features (e.g., scaling, output modalities), leading candidates to incorrectly select options like automatic scaling or multimodal support for a text-only task.

Full explanation →

895

MCQmedium

A company uses Amazon Bedrock to generate summarizations of lengthy reports. Users report that the summaries are too verbose and include excessive detail. Which prompt engineering technique should the team apply to address this issue?

A.Reduce the input context length to limit available information.

B.Increase the maxTokens parameter in the inference request.

C.Include few-shot examples of desired outputs.

D.Add explicit constraints like 'Provide a concise summary in two sentences.'

AnswerD

Explicit constraints directly guide the model to produce shorter output, addressing verbosity effectively.

Why this answer

Option D is correct because adding explicit constraints like 'Provide a concise summary in two sentences' directly instructs the model to limit verbosity and detail. This prompt engineering technique uses clear, specific instructions to control output length and style, which is the most effective way to address overly verbose summaries without altering model parameters or input data.

Exam trap

The trap here is that candidates confuse reducing input length (Option A) with controlling output length, or they mistakenly think increasing maxTokens (Option B) can somehow shorten output, when in fact it does the opposite.

How to eliminate wrong answers

Option A is wrong because reducing input context length does not guarantee concise output; the model may still generate verbose summaries from the remaining text, and it risks losing critical information needed for accurate summarization. Option B is wrong because increasing the maxTokens parameter actually allows the model to generate longer outputs, which would exacerbate the verbosity issue rather than solve it. Option C is wrong because few-shot examples can guide output format but are less direct and reliable than explicit constraints; they may not consistently enforce conciseness, especially if the examples themselves are not perfectly aligned with the desired brevity.

Full explanation →

896

MCQhard

A developer is building a RAG application on Amazon Bedrock. They notice that the model sometimes generates answers that are not supported by the retrieved documents. To reduce this, they want to enforce that the model only uses the provided context. Which Bedrock feature should they use?

A.Enable 'grounding' in Bedrock Guardrails

B.Increase the number of retrieved documents

C.Use a larger context window model

D.Adjust the temperature parameter to 0

AnswerA

Grounding check forces the model to stick to the retrieved context.

Why this answer

Bedrock Guardrails' grounding check ensures the model's response is grounded in the provided source documents, reducing hallucinations. The other options do not enforce grounding.

Full explanation →

897

MCQmedium

A company uses Amazon Bedrock and needs to log all model invocations for audit purposes. The logs must be stored in a central S3 bucket and also sent to CloudWatch Logs for real-time monitoring. Which configuration should they use?

A.Enable Amazon Macie to monitor Bedrock responses

B.Enable AWS CloudTrail to capture Bedrock API calls and configure CloudTrail logs to S3 and CloudWatch

C.Use AWS Lambda to capture responses and write to S3 and CloudWatch

D.Configure Amazon Bedrock model invocation logging to send logs to both S3 and CloudWatch Logs

AnswerD

Bedrock supports logging model invocations to S3 and CloudWatch for audit and monitoring.

Why this answer

Bedrock model invocation logging can be configured to send logs to both S3 and CloudWatch Logs simultaneously. This is a built-in feature of Bedrock logging settings.

Full explanation →

898

MCQmedium

A.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store

B.Fine-tune a base LLM on the policy documents monthly

C.Use a larger foundation model with a longer context window and paste all documents into each prompt

D.Train a custom model from scratch on the policy documents each month

AnswerA

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

Full explanation →

899

MCQeasy

A team is evaluating a classification model. The confusion matrix shows: TP=80, FN=20, FP=10, TN=90. What is the precision?

A.0.89

B.0.75

C.0.80

D.0.90

AnswerA

Precision = 80/(80+10) = 0.8889 ≈ 0.89.

Why this answer

Precision is calculated as TP / (TP + FP). Here, TP=80 and FP=10, so precision = 80 / (80 + 10) = 80 / 90 = 0.888..., which rounds to 0.89. This metric measures the proportion of positive identifications that were actually correct.

Exam trap

The AIF-C01 exam often tests the distinction between precision and recall by providing confusion matrix values that make one metric easy to miscalculate if you confuse the denominator (TP+FP vs TP+FN).

How to eliminate wrong answers

Option B (0.75) is wrong because it incorrectly uses FN in the denominator, likely confusing precision with recall (TP / (TP + FN)). Option C (0.80) is wrong because it uses only TP divided by the total number of actual positives (TP + FN), which is recall, not precision. Option D (0.90) is wrong because it uses TN in the denominator or calculates accuracy (TP + TN) / total, which is not precision.

Full explanation →

900

MCQmedium

A large enterprise uses Amazon Bedrock to power a conversational agent that handles customer service inquiries. The agent is built using Bedrock Agents and retrieves information from a knowledge base that contains product documentation and FAQs. Recently, users have reported that the agent sometimes provides incorrect information that contradicts the knowledge base. The development team verified that the knowledge base contains accurate and up-to-date data. They also confirmed that the retrieval process correctly fetches relevant documents. However, the agent occasionally ignores the retrieved context and generates plausible-sounding but incorrect answers. The team is concerned about customer trust and wants to improve the accuracy of the agent's responses without overhauling the architecture. They have already tuned the prompt template to instruct the model to use the context. The issue persists. Which additional action should the team take to reduce the number of hallucinated responses?

A.Reduce the chunk size of documents in the knowledge base to retrieve more granular information.

B.Switch to a larger foundation model with more parameters.

C.Increase the temperature parameter of the foundation model.

D.Add explicit instructions in the system prompt to require the model to base its answers solely on the retrieved context and to state when it doesn't have enough information.

AnswerD

Correct: Strengthening the prompt with explicit directives can reduce hallucinations by forcing the model to rely on the provided context.

Why this answer

Option B directly addresses the model ignoring context by strengthening the instruction. Option A increases randomness, Option C does not guarantee use of context, Option D may not help if retrieval is already good.

Full explanation →

AWS Certified AI Practitioner AIF-C01 (AIF-C01) — Questions 826–900