AWS Certified AI Practitioner AIF-C01 (AIF-C01) — Questions 751825

1000 questions total · 14pages · All types, answers revealed

Page 10

Page 11 of 14

Page 12
751
Multi-Selectmedium

Which TWO factors are most important when selecting a foundation model in Amazon Bedrock for a text summarization task with strict latency requirements?

Select 2 answers
A.Average response latency per request.
B.Model size in billions of parameters.
C.Maximum input token limit.
D.Output quality and token efficiency for summarization tasks.
E.Availability of fine-tuning capability for domain adaptation.
AnswersA, D

Low latency is critical for real-time summarization.

Why this answer

Options A and B are correct. Response latency directly impacts user experience, so a model with low latency is essential. Output quality/token ensures the summaries are accurate and concise.

Option C is wrong because fine-tuning increases cost and latency. Option D is wrong because model size affects latency but latency itself is the direct factor. Option E is wrong because input token limit is relevant but not as critical as latency and quality for this use case.

752
MCQmedium

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

A.Use a larger foundation model with a longer context window and paste all documents into each prompt
B.Train a custom model from scratch on the policy documents each month
C.Fine-tune a base LLM on the policy documents monthly
D.Use Retrieval-Augmented Generation (RAG) with the policy documents indexed in a vector store
AnswerD

RAG retrieves relevant document chunks at query time, ensuring the chatbot always answers from the latest uploaded documents without any model retraining.

Why this answer

RAG (Retrieval-Augmented Generation) allows the LLM to retrieve relevant document sections at inference time, so knowledge stays current without retraining. The other options either require expensive retraining for each update or lack document grounding.

753
Multi-Selecteasy

Which TWO actions can help reduce the likelihood of hallucinations in a generative AI model used for question answering?

Select 2 answers
A.Increase the maximum token count to allow more complete answers.
B.Use Retrieval Augmented Generation (RAG) with a trusted knowledge base.
C.Fine-tune the model on the training data used for the application.
D.Set a lower temperature parameter (e.g., 0.1) to reduce randomness.
E.Use a larger foundation model with more parameters.
AnswersB, D

Grounding on real documents reduces hallucinations.

Why this answer

Options A and C are correct. Grounding the model on a knowledge base (RAG) reduces hallucinations by providing factual context. Reducing the temperature parameter makes the model more deterministic, lowering the chance of making up information.

Option B is wrong because fine-tuning on the same data that caused hallucinations may not fix the issue. Option D is wrong because increasing max tokens may allow more hallucinated content. Option E is wrong because using a larger model often increases hallucination risk due to more parameters.

754
MCQhard

A company uses an LLM to summarize medical research papers. They are concerned about hallucinations. Which combination of techniques would most effectively reduce hallucinations in this context?

A.Increase temperature and top-p sampling parameters
B.Use a smaller model with less capacity
C.Few-shot prompting and fine-tuning on more data
D.Retrieval-Augmented Generation (RAG) and Bedrock Guardrails
AnswerD

RAG retrieves relevant documents to ground the model, and Bedrock Guardrails can block non-factual or harmful outputs, effectively reducing hallucinations.

Why this answer

Retrieval-Augmented Generation (RAG) grounds the model in retrieved documents, and Bedrock Guardrails can enforce content policies and factuality checks, together reducing hallucinations.

755
MCQeasy

A company wants to automatically summarize customer support tickets into a short paragraph. Which AWS service is MOST appropriate for this task?

A.Amazon Bedrock
B.Amazon Rekognition
C.Amazon Polly
D.Amazon Comprehend
AnswerA

Amazon Bedrock provides access to foundation models that can summarize text.

Why this answer

Amazon Bedrock provides access to foundation models that can perform summarization. Option C is correct because Bedrock is a managed service offering pre-trained models for tasks like text summarization. Option A (Amazon Comprehend) is for NLP tasks like entity extraction, not summarization.

Option B (Amazon Rekognition) is for image/video analysis. Option D (Amazon Polly) is text-to-speech.

756
MCQeasy

A retail company uses a recommendation system that occasionally suggests inappropriate products to minors. Which responsible AI practice should be applied?

A.Implement human review of flagged recommendations
B.Rely solely on user feedback to improve
C.Disable the recommendation system entirely
D.Increase the volume of training data
AnswerA

Human-in-the-loop ensures responsible oversight.

Why this answer

The correct practice is to implement human review of flagged recommendations. This aligns with the responsible AI principle of accountability, where automated systems must have oversight mechanisms to catch and correct inappropriate outputs, especially when minors are involved. Human-in-the-loop (HITL) validation ensures that edge cases or subtle context (e.g., age-inappropriate product suggestions) are caught before they reach end users, rather than relying solely on automated filters or feedback loops.

Exam trap

AWS often tests the misconception that more data or automation alone can solve fairness and safety issues, when in fact responsible AI requires explicit governance mechanisms like human oversight for high-stakes or vulnerable-user scenarios.

How to eliminate wrong answers

Option B is wrong because relying solely on user feedback to improve is reactive and can expose minors to harm before any corrective action is taken; feedback loops are slow and may not capture subtle or rare inappropriate recommendations. Option C is wrong because disabling the recommendation system entirely is an extreme, non-scalable response that eliminates business value and does not teach the system to behave responsibly; responsible AI aims to mitigate harm, not abandon functionality. Option D is wrong because increasing the volume of training data does not inherently address the problem of inappropriate recommendations; if the training data itself contains biased or unlabeled age-sensitive content, more data can amplify the issue rather than fix it.

757
MCQmedium

Refer to the exhibit. A user invoked a Claude model using provisioned throughput and received a ThrottlingException. Which is the most likely cause?

A.The model is not available in the region
B.The provisioned throughput request per minute limit was exceeded
C.The prompt was too long
D.The inference type should be ON_DEMAND
AnswerB

Throttling occurs when the request rate exceeds the allowed limit for the provisioned throughput.

Why this answer

Option A is correct. Provisioned throughput has a requests-per-minute limit, and exceeding it causes a ThrottlingException. Option B would produce a different error.

Option C would be a validation error, not throttling. Option D is not the cause because PROVISIONED is valid.

758
MCQmedium

A company uses Amazon SageMaker to train a model. The training job fails with 'InsufficientInstanceCapacity' error. What is the most likely cause?

A.The request rate is too high.
B.The dataset size exceeds the instance storage limit.
C.The requested instance type is not available in the specified region.
D.The training image is not compatible with the instance type.
AnswerC

This error occurs when AWS cannot provision the instance due to capacity constraints.

Why this answer

The 'InsufficientInstanceCapacity' error in Amazon SageMaker indicates that AWS does not currently have enough available capacity for the requested instance type in the specified region or Availability Zone. This is a common transient error when demand for a particular instance type exceeds supply, and it is not related to request rate, dataset size, or image compatibility.

Exam trap

The AIF-C01 exam often tests the distinction between capacity errors and throttling errors, so the trap here is confusing 'InsufficientInstanceCapacity' with a rate-limiting or quota error, leading candidates to incorrectly select Option A.

How to eliminate wrong answers

Option A is wrong because 'InsufficientInstanceCapacity' is a capacity error, not a throttling error; throttling (e.g., from high request rate) would return a 'ThrottlingException' or 'RequestLimitExceeded' error. Option B is wrong because dataset size exceeding instance storage limits would cause an 'OutOfMemory' or 'DiskFull' error, not a capacity error. Option D is wrong because image compatibility issues would result in a 'ClientError' or 'ImageNotFoundException', not an instance capacity error.

759
MCQhard

A financial services company uses a machine learning model to approve loan applications. The model is a gradient boosting classifier trained on historical loan data. Recently, the company noticed that the model's approval rate for applicants from a certain demographic group is significantly lower than for other groups, even though the model's overall accuracy remains high. The data science team has been asked to address this potential bias while minimizing the impact on overall model performance. The team has access to the training data and the trained model. They have limited time and budget. Which course of action should the team take first?

A.Remove the sensitive attribute from the training data and retrain the model.
B.Collect more data from the under-represented demographic group and retrain the model.
C.Analyze the training data for bias and retrain the model using bias mitigation techniques such as reweighting.
D.Adjust the model's decision threshold for the affected group after deployment.
AnswerC

This directly addresses the root cause and is resource-efficient.

Why this answer

The most efficient first step is to analyze the training data for bias and then retrain the model with bias mitigation techniques like reweighting. Option A is wrong because collecting more data is resource-intensive and may not address bias. Option C is wrong because feature engineering may not help if the bias is in the labels.

Option D is wrong because post-hoc adjustments can introduce other issues and may not be as effective as addressing bias during training.

760
Multi-Selecthard

A company is training a deep learning model for image classification. Which THREE practices help reduce overfitting? (Choose three.)

Select 3 answers
A.L2 regularization
B.Increasing model depth
C.Increasing learning rate
D.Dropout
E.Data augmentation
AnswersA, D, E

L2 regularization penalizes large weights, reducing overfitting.

Why this answer

L2 regularization (also known as weight decay) adds a penalty proportional to the square of the weight magnitudes to the loss function. This discourages the model from learning overly complex patterns by forcing weights to stay small, which reduces overfitting by limiting the model's capacity to fit noise in the training data.

Exam trap

The AIF-C01 exam often tests the misconception that increasing model complexity (depth) or tuning the learning rate can mitigate overfitting, when in fact these changes either exacerbate the problem or address unrelated training dynamics.

761
MCQeasy

Which AWS service allows teams to collaborate on building generative AI applications with a visual interface, share prompts, and manage prompt versions?

A.Amazon Bedrock Playground
B.Amazon Bedrock Studio
C.AWS CodeCommit
D.Amazon SageMaker Studio
AnswerB

Bedrock Studio enables team collaboration, prompt versioning, and visual app building.

Why this answer

Amazon Bedrock Studio is a collaborative environment for prompt engineering, versioning, and sharing. Bedrock Playground is for individual testing only.

762
MCQhard

An organization uses SageMaker JumpStart to deploy a foundation model for real-time inference. They observe high latency. What is the most effective way to reduce latency?

A.Compile the model with SageMaker Neo
B.Use a larger instance with more memory
C.Use batch transform instead
D.Enable SageMaker Inference Recommender
AnswerA

Neo compiles models for faster inference on specific hardware.

Why this answer

SageMaker Neo compiles the model to optimize it for the target hardware, reducing inference latency by applying hardware-specific optimizations such as kernel fusion, quantization, and memory layout tuning. This directly addresses the high latency issue for real-time inference without changing the instance type or inference mode.

Exam trap

AWS often tests the misconception that increasing instance size or switching to batch processing is the primary solution for latency, when in fact model compilation with SageMaker Neo is the most direct and cost-effective optimization for real-time inference.

How to eliminate wrong answers

Option B is wrong because using a larger instance with more memory may reduce latency due to increased compute capacity, but it is less effective and more costly than model compilation, which optimizes the model itself for the existing hardware. Option C is wrong because batch transform is designed for offline, asynchronous inference on large datasets, not for real-time inference, and it would not reduce latency for a real-time endpoint. Option D is wrong because SageMaker Inference Recommender helps select the optimal instance type and configuration for a given model, but it does not directly reduce latency; it recommends deployment parameters, whereas compilation actively optimizes the model.

763
MCQeasy

A company wants to prevent an Amazon Bedrock chatbot from discussing specific prohibited topics like competitor pricing. Which Bedrock feature should they configure?

A.Bedrock Knowledge Base
B.Bedrock Guardrails – topic denial
C.Bedrock Agents
D.Bedrock Model Evaluation
AnswerB

Topic denial in Guardrails explicitly blocks defined subjects.

Why this answer

Bedrock Guardrails allow topic denial by defining denied topics. The guardrail will block responses that match these topics.

764
Multi-Selectmedium

A data science team is evaluating the output quality of a text generation model. They want to use both automated metrics and human judgment. Which THREE approaches should they include in their evaluation strategy? (Select THREE.)

Select 3 answers
A.Conduct human evaluation with a rating rubric
B.Measure inference latency under load
C.Compute perplexity on the generated text
D.Use ROUGE or BLEU scores against reference texts
E.Run the model on a task‑specific benchmark dataset
AnswersA, D, E

Human evaluators score outputs on criteria like fluency, relevance, and accuracy using a standardized rubric.

Why this answer

ROUGE/BLEU are automated metrics; human evaluation with rubric provides qualitative assessment; task‑specific benchmarks measure performance on curated test sets. Perplexity is a training metric, not for evaluating output quality. Latency is a performance metric, not quality.

765
Multi-Selectmedium

Which TWO actions would improve the grounding of responses from a generative AI model using RAG? (Choose 2)

Select 2 answers
A.Fine-tune the model on unrelated data
B.Reduce the context window to save tokens
C.Increase the model's temperature parameter
D.Use RAG with a knowledge base of relevant documents
E.Include source citations in the prompt instructions
AnswersD, E

RAG provides retrieved context, reducing reliance on model's parametric knowledge.

Why this answer

Using RAG with relevant documents directly grounds responses in factual data. Including source citations in prompts encourages the model to base answers on retrieved information. Increasing temperature or reducing context would likely hurt grounding.

Fine-tuning on unrelated data does not help.

766
MCQeasy

Which of the following is a primary benefit of using Bedrock Agents for building generative AI applications?

A.Automatically fine-tune the underlying model on new data
B.Orchestrate multi-step tasks and call external APIs via action groups
C.Optimize prompts automatically without any manual tuning
D.Guarantee that the model's responses are factually correct
AnswerB

Agents can break down complex requests, call APIs, and combine results to complete tasks.

Why this answer

Bedrock Agents enable multi-step reasoning and tool use, allowing the model to perform complex sequences of actions. Fine-tuning is not an agent capability. Guardrails are separate.

Prompt optimization is done at the prompt level, not by agents.

767
MCQmedium

A data scientist is evaluating a logistic regression model for a binary classification task. The model's AUC-ROC score is 0.95 on the training set and 0.51 on the test set. What is the MOST likely issue?

A.The model is overfitting the training data
B.The test set is too small
C.The learning rate is too low
D.The model is underfitting the training data
AnswerA

High training performance and poor test performance is classic overfitting.

Why this answer

A large gap between training and test AUC-ROC indicates overfitting — the model memorizes training data but fails to generalize. Cross-validation can help detect and mitigate this. Data leakage or class imbalance could also contribute, but overfitting is the primary symptom.

768
MCQhard

A financial services firm needs an LLM-powered application that analyzes customer transaction data and generates compliance reports. The data contains personally identifiable information (PII). The firm must ensure that no training data includes PII, and that the LLM never outputs PII. Which combination of AWS services and practices should they use?

A.Use RAG to retrieve transaction data from a vector database and include it in the prompt to the LLM
B.Fine-tune an Amazon Titan model on the transaction data after masking PII, then use the fine-tuned model for inference
C.Host the model on Amazon SageMaker and apply differential privacy during training
D.Use a pre-trained foundation model via Amazon Bedrock with a system prompt that instructs the model not to output PII, and enable Bedrock’s data protection
AnswerD

Pre-trained model avoids PII in training; system prompt and data protection guardrails prevent PII in outputs.

Why this answer

Using Amazon Bedrock with a pre-trained foundation model (no fine-tuning) ensures PII is not in training data. A system prompt instructing the model to avoid PII, combined with Bedrock’s built-in data protection, prevents PII in outputs. Fine-tuning or RAG with sensitive data would risk exposure.

769
MCQmedium

A financial services company needs to deploy a chatbot that answers customer inquiries about account balances and transaction history. The chatbot must never reveal sensitive information from other customers. Which security measure should be implemented in the prompt?

A.Set temperature to 0.0
B.Use few-shot examples of safe responses
C.Enable streaming responses to monitor output
D.Include a system prompt stating 'Do not reveal any information about other customers'
AnswerD

System prompts define the model's behavior and can enforce security constraints.

Why this answer

System prompts set the overall behavior and constraints for the model. An explicit instruction to avoid revealing other customers' data is a direct guardrail.

770
Multi-Selecthard

Which THREE are best practices for building a secure and scalable generative AI application using Amazon Bedrock? (Choose 3)

Select 3 answers
A.Implement guardrails to filter harmful content
B.Deploy models on EC2 instances for better control
C.Store API keys in source code for easy access
D.Use AWS KMS to encrypt data and model artifacts
E.Use foundation models from multiple providers via Bedrock
AnswersA, D, E

Guardrails enforce content policies and prevent inappropriate outputs.

Why this answer

Using multiple foundation models through Bedrock's multi-model support allows flexibility and best-of-breed selection. Guardrails provide content safety. KMS encryption protects data at rest and in transit.

Storing keys in source code is insecure. EC2 deployment is not applicable to Bedrock's serverless model.

771
MCQmedium

A data scientist is building a model to predict housing prices. The dataset contains features such as square footage, number of bedrooms, and location. After training a linear regression model, the RMSE on the test set is significantly higher than on the training set. What is the MOST likely cause?

A.Data leakage from test to training set
B.Overfitting
C.Underfitting
D.Insufficient training data
AnswerB

Overfitting means the model has learned noise in the training data, leading to poor generalization and a large gap between training and test error.

Why this answer

A large gap between training and test RMSE indicates overfitting – the model performs well on training data but generalizes poorly. Underfitting would show high error on both sets, and data leakage could cause artificially good performance. Bias-variance tradeoff is the underlying concept, but overfitting is the specific diagnosis.

772
Multi-Selectmedium

A data scientist is evaluating whether to use fine-tuning or Retrieval-Augmented Generation (RAG) for a legal document analysis application. Which TWO statements correctly describe when to use each approach?

Select 2 answers
A.Use RAG to reduce model hallucinations without any external data
B.Use fine-tuning to teach the model a specific output format or style
C.Use fine-tuning to enable the model to access real-time information
D.Use fine-tuning to inject new factual knowledge into the model
E.Use RAG when the knowledge base changes frequently
AnswersB, E

Fine-tuning is effective for adapting tone, format, or behavior.

Why this answer

RAG is ideal for dynamic knowledge; fine-tuning is for adapting style/format but not for adding new facts.

773
Multi-Selecthard

Which THREE are benefits of using Amazon Bedrock over self-managing foundation models on EC2? (Choose THREE.)

Select 3 answers
A.Built-in integration with AWS services such as AWS CloudWatch and AWS CloudTrail.
B.Lower data transfer costs between cloud regions.
C.Access to a curated set of foundation models from different providers.
D.Managed infrastructure for model hosting and scaling.
E.Greater control over model fine-tuning and customization.
AnswersA, C, D

Bedrock natively logs to CloudWatch and CloudTrail for monitoring and auditing.

Why this answer

Option A is correct because Amazon Bedrock provides built-in integration with AWS services like CloudWatch for monitoring model invocation metrics and CloudTrail for auditing API calls. This eliminates the need to manually set up logging and monitoring infrastructure when self-managing foundation models on EC2, where you would have to configure these integrations yourself.

Exam trap

The trap here is that candidates may confuse 'managed infrastructure' with 'greater control'—Bedrock simplifies operations but reduces customization flexibility, so option E is a common distractor for those who think managed services offer more control than self-managed solutions.

774
MCQeasy

An e-commerce company uses an LLM to generate product descriptions. They observe that occasionally the model outputs factually incorrect information about products. What is the term for this phenomenon?

A.Bias amplification
B.Overfitting
C.Hallucination
D.Concept drift
AnswerC

Hallucination is the correct term for when an LLM produces plausible-sounding but factually incorrect content.

Why this answer

Hallucinations are when an LLM generates incorrect or nonsensical information that is not grounded in the input or training data.

775
MCQhard

A company fine-tunes a foundation model using SageMaker to create a domain-specific chatbot. After deployment on Bedrock, the model shows high confidence in incorrect answers. What is the most likely cause and its solution?

A.The model was not pre-trained on enough data; use a larger base model
B.The training data was imbalanced; collect more diverse data
C.The model is overfitting; apply regularization techniques during fine-tuning
D.The inference temperature is too low; increase it
AnswerC

Overfitting leads to overconfidence on training patterns. Regularization helps generalize better.

Why this answer

Overfitting during fine-tuning can cause the model to be overly confident even when wrong. Regularization (e.g., early stopping, dropout) reduces overconfidence.

776
MCQeasy

Which AWS service provides human review workflows to handle low-confidence predictions or high-risk decisions in an AI system?

A.AWS Lambda
B.Amazon Augmented AI (A2I)
C.Amazon Bedrock
D.Amazon SageMaker Ground Truth
AnswerB

A2I allows you to define conditions (e.g., low confidence) to route predictions to human reviewers.

Why this answer

Amazon Augmented AI (A2I) enables human review of predictions when the model confidence is low or when the decision is high-risk. SageMaker Ground Truth is for data labeling, not for production review. Bedrock is for foundation models.

Lambda is compute, not a review service.

777
MCQhard

A financial services company uses Bedrock Agents to automate a multi-step loan approval process. The agent needs to call an external credit scoring API and a compliance database, then combine results. The agent currently fails when the API returns a 503 error. How should the practitioner address this?

A.Adjust the Bedrock Agent's granularity setting to 'high'
B.Reduce the agent's trace truncation limit to shorten the context
C.Implement a custom Lambda function for the action group to handle errors and retries
D.Use Bedrock Guardrails with a topic denial rule for error messages
AnswerC

Lambda functions in action groups can implement custom logic, including retries on 503 errors, before returning results to the agent.

Why this answer

Lambda functions integrated with action groups can implement retry logic, error handling, and custom processing. Granularity tuning or trace truncation does not handle errors. Guardrails are for content filtering, not API error handling.

778
MCQmedium

A financial services company is deploying a foundation model to analyze customer sentiment from call transcripts. The model outputs must be consistent and deterministic for auditing purposes. Which parameter configuration should the company use?

A.Set temperature to 0.1 and top_p to 0.9.
B.Set temperature to 0.7 and top_p to 1.0.
C.Set temperature to 0.5 and top_p to 0.5.
D.Set temperature to 0 and top_p to 1.
AnswerD

Temperature 0 makes the model deterministic.

Why this answer

Setting temperature to 0 and top_p to 1 forces the model to always select the highest-probability token at each step, producing deterministic and repeatable outputs. This is essential for auditing and compliance in financial services, where consistency is required. Any nonzero temperature introduces randomness, which undermines determinism.

Exam trap

AWS often tests the misconception that low temperature (e.g., 0.1) is 'deterministic enough,' but only temperature exactly 0 guarantees deterministic outputs, and top_p must be 1 to avoid interfering with the argmax selection.

How to eliminate wrong answers

Option A is wrong because temperature 0.1 still introduces slight randomness, making outputs non-deterministic and unsuitable for auditing. Option B is wrong because temperature 0.7 introduces significant randomness, and top_p 1.0 does not constrain it, leading to high variability. Option C is wrong because temperature 0.5 introduces randomness, and top_p 0.5 further restricts token sampling but does not eliminate the stochastic behavior from the nonzero temperature.

779
MCQmedium

A team has built a regression model to predict house prices. The RMSE is 50,000 on the test set. Which action is most appropriate to improve model performance?

A.Remove outliers from training data
B.Apply feature scaling
C.Add more relevant features
D.Use a different evaluation metric
AnswerC

Adding informative features can reduce bias and improve model accuracy.

Why this answer

Option C is correct because adding relevant features can capture more patterns and improve predictive accuracy. Using a different metric (A) does not improve the model. Removing outliers (B) may help if outliers exist, but adding features is generally a more systematic improvement.

Feature scaling (D) helps some algorithms but may not be the primary issue.

780
Multi-Selecthard

A team is designing a RAG system on Amazon Bedrock. They need to chunk a large set of PDF documents into smaller pieces for embedding. Which THREE considerations should guide their chunking strategy? (Choose three.)

Select 3 answers
A.All chunks must be exactly the same length for optimal performance
B.Chunk size should be small enough to fit multiple chunks within the model's context window after including the query
C.Consider the embedding model's maximum input token limit
D.Overlapping chunks should be avoided to reduce redundancy
E.Chunks should align with natural semantic boundaries (e.g., paragraphs, sections)
AnswersB, C, E

Smaller chunks allow more retrieved pieces to fit in the context window, improving answer completeness.

Why this answer

Chunk size affects retrieval accuracy and context window usage; overlapping chunks prevent information loss at boundaries; semantic boundaries improve coherence.

781
MCQhard

An AI practitioner is fine-tuning an Amazon Titan Text model on a dataset of customer support conversations to improve response accuracy. After training, the model's perplexity on the validation set is low, but during inference, the model frequently generates off-topic or nonsensical responses to real customer queries. What is the most likely cause?

A.The model's context window is too small for the inference queries
B.The temperature during inference is set too low
C.The fine-tuning dataset is too small, causing overfitting
D.The validation set does not represent the distribution of real customer queries
AnswerD

If the validation set is similar to training data but different from real-world inputs, the model may appear good on validation but fail in production.

Why this answer

Low perplexity on a held-out validation set indicates the model fits the fine-tuning distribution well, but if the distribution of real queries differs from the training data, the model may not generalize. Perplexity is not a perfect measure of real-world performance.

782
MCQeasy

A developer needs to preprocess a dataset consisting of customer reviews for sentiment analysis. Which text preprocessing technique is most likely to improve model accuracy?

A.Stemming
B.All of the above
C.Removing stop words
D.Lowercasing
AnswerB

Combining lowercasing, stop word removal, and stemming is a common and effective preprocessing pipeline.

Why this answer

Option B is correct because all three listed techniques—stemming, removing stop words, and lowercasing—are standard text preprocessing steps that collectively improve model accuracy for sentiment analysis. Stemming reduces words to root forms to consolidate similar meanings, removing stop words eliminates noise from high-frequency but low-information tokens, and lowercasing normalizes case variations. Together, they reduce the feature space and help the model focus on sentiment-bearing terms, leading to better generalization and accuracy.

Exam trap

The AIF-C01 exam often tests the misconception that a single preprocessing step is sufficient, when in fact the combination of all three—stemming, stop word removal, and lowercasing—is standard practice for maximizing model accuracy in NLP tasks like sentiment analysis.

How to eliminate wrong answers

Option A is wrong because stemming alone is insufficient; while it helps consolidate word variants, it does not address noise from stop words or case sensitivity, so it is not the single most likely technique to improve accuracy. Option C is wrong because removing stop words alone reduces noise but ignores the benefits of stemming and lowercasing, which are also critical for handling morphological variations and case mismatches. Option D is wrong because lowercasing alone normalizes case but does not handle word root consolidation or removal of irrelevant high-frequency words, leaving significant noise in the feature set.

783
MCQeasy

A data scientist trains a linear regression model to predict housing prices. The model achieves a low training error but a high test error. Which concept does this BEST illustrate?

A.Bias-variance tradeoff
B.Regularization
C.Underfitting
D.Overfitting
AnswerD

Low training error with high test error is the classic sign of overfitting.

Why this answer

Overfitting occurs when the model learns noise in the training data, performing well on training data but poorly on unseen test data.

784
MCQeasy

What is the primary role of the self-attention mechanism in the Transformer architecture?

A.To allow each token to attend to all other tokens in the sequence, capturing long-range dependencies
B.To process tokens in parallel by alternating attention and feed-forward layers
C.To reduce the vocabulary size by mapping tokens to embeddings
D.To generate the next token one at a time in an autoregressive manner
AnswerA

Self-attention computes attention scores between all token pairs, enabling the model to capture context from distant positions.

Why this answer

Self-attention computes dependencies between all positions in a sequence, allowing the model to weigh the relevance of each token to every other token.

785
Multi-Selectmedium

A company is using Amazon SageMaker to train machine learning models. The security team wants to ensure that the training data is encrypted at rest and that the SageMaker notebook instances cannot access the internet. Which TWO actions should the company take? (Choose TWO.)

Select 2 answers
A.Enable S3 server-side encryption with AWS KMS (SSE-KMS) for the training data bucket
B.Create an AWS CloudTrail trail to log all S3 data events
C.Enable encryption at rest for the SageMaker endpoint using the AWS Management Console
D.Disable internet access for the SageMaker notebook instance by placing it in a VPC without a NAT gateway or internet gateway
E.Use AWS Security Token Service (STS) to generate temporary credentials for the notebook instance
AnswersA, D

SSE-KMS encrypts objects at rest using KMS keys.

Why this answer

Option A is correct because enabling S3 server-side encryption with AWS KMS (SSE-KMS) ensures that the training data stored in the S3 bucket is encrypted at rest. This satisfies the security team's requirement for data encryption at rest, as SSE-KMS provides envelope encryption with a customer-managed or AWS-managed KMS key, giving the company control over the encryption keys and auditability via AWS CloudTrail.

Exam trap

The trap here is that candidates often confuse encryption at rest for the endpoint (Option C) with encryption of the training data in S3, or they mistakenly think that CloudTrail logging (Option B) or STS credentials (Option E) provide encryption, when in fact they address auditing and access control, not data encryption.

786
MCQmedium

A financial services firm uses Amazon Bedrock to generate investment summaries. They need to prevent the model from generating content containing personally identifiable information (PII) such as social security numbers. Which feature should they configure in Bedrock Guardrails?

A.Contextual grounding checks
B.Content filtering with category-based harmful content filters
C.Sensitive information filters with PII redaction
D.Word filters with a custom list of terms
AnswerC

Sensitive information filters in Guardrails include PII detection and redaction, which can block or mask PII.

Why this answer

Bedrock Guardrails include a PII redaction filter that can detect and block or mask PII in model inputs and outputs.

787
MCQhard

Refer to the exhibit. A team is creating an IAM policy for a SageMaker notebook user. The user needs to access training data in an S3 bucket and create models. Which responsible AI concern is most relevant to this policy?

A.The policy does not enforce encryption for the notebook.
B.The policy does not restrict which S3 buckets the user can read.
C.The policy does not include a condition for model explainability.
D.The policy grants overly broad permissions, violating the principle of least privilege.
AnswerD

Allowing CreateModel and CreateNotebookInstance on all resources can lead to misuse.

Why this answer

Option C is correct. The policy grants broad access (sagemaker:CreateModel and sagemaker:CreateNotebookInstance on all resources) without restrictions. This could allow a user to create models using any data or expose the notebook.

The least privilege principle is violated, leading to potential unintended model creation or data exposure. Options A and B are less directly related; D is about explainability.

788
MCQmedium

An organization is required to provide transparency about AI-generated content. Which of the following is the best practice to comply with transparency requirements?

A.Clearly label AI-generated content with a disclosure statement
B.Store metadata but not display it to users
C.Use a watermark that is invisible to users
D.Only disclose AI generation if the content is inaccurate
AnswerA

Labeling content as AI-generated meets transparency requirements.

Why this answer

Clear disclosure that content is AI-generated is a fundamental transparency practice, helping users understand the origin of the content.

789
MCQmedium

A company uses an AI system to screen job applications. The system was trained on resumes from previous hires, which predominantly came from a specific demographic. As a result, the system may unfairly filter out qualified candidates from other backgrounds. Which responsible AI practice should the company implement?

A.Implement bias detection metrics and monitor outcomes by demographic groups
B.Focus solely on improving the model's precision and recall
C.Defer all screening decisions to a human recruiter
D.Increase the size of the training dataset without regard to demographic composition
AnswerA

Bias detection and monitoring help identify and correct unfair outcomes.

Why this answer

To mitigate bias, the company should measure and monitor the system's impact across demographic groups. This aligns with fairness metrics. Using more data without addressing bias may not help.

Relying on human review is good but does not guarantee systematic fairness. Focusing only on performance ignores fairness.

790
Multi-Selectmedium

A company uses Amazon Bedrock Agents to automate a multi-step customer support workflow. The agent needs to query a customer database and update a ticket system. Which TWO components are required to enable the agent to interact with these external systems?

Select 2 answers
A.A vector store like Amazon OpenSearch Serverless
B.Bedrock Guardrails
C.Bedrock Knowledge Bases
D.Lambda functions that implement the business logic for each action
E.Action groups that define the APIs or database operations
AnswersD, E

Lambda functions contain the code to actually query the database and update the ticket system.

Why this answer

Action groups define the functions the agent can call, and Lambda functions implement the business logic to interact with databases and APIs. Knowledge Bases provide static information, no tool use. Guardrails filter content, not execute actions.

Vector stores store embeddings.

791
MCQeasy

An application needs to store and search vector embeddings of 10 million documents for a RAG system. Which Amazon vector store is a fully managed, serverless option that integrates natively with Amazon Bedrock Knowledge Bases?

A.Amazon Aurora PostgreSQL with pgvector
B.MongoDB Atlas
C.Amazon OpenSearch Serverless
D.Pinecone
AnswerC

OpenSearch Serverless is a serverless vector database with native Bedrock Knowledge Bases integration for ingestion and search.

Why this answer

Amazon OpenSearch Serverless is a fully managed, serverless vector store that integrates natively with Bedrock Knowledge Bases. Aurora supports pgvector but is not serverless in the same sense (provisioned). Pinecone and MongoDB are third‑party.

792
MCQeasy

A company wants to predict customer churn. They have historical data with features like usage minutes, support tickets, contract length. The target is binary: churn/not churn. Which ML algorithm is best suited?

A.Logistic regression
B.Principal Component Analysis (PCA)
C.Linear regression
D.K-means clustering
AnswerA

Logistic regression models the probability of a binary outcome using a logistic function.

Why this answer

Logistic regression is the best choice because it is specifically designed for binary classification tasks like predicting churn (churn/not churn). It models the probability of the target class using a logistic (sigmoid) function, making it interpretable and efficient for this type of supervised learning problem with a categorical outcome.

Exam trap

The AIF-C01 exam often tests the distinction between supervised and unsupervised learning, and the trap here is that candidates may confuse dimensionality reduction (PCA) or clustering (K-means) with classification, or mistakenly apply linear regression to a binary outcome without recognizing the need for a logistic function.

How to eliminate wrong answers

Option B is wrong because Principal Component Analysis (PCA) is an unsupervised dimensionality reduction technique, not a classification algorithm; it reduces feature space but does not predict a binary target. Option C is wrong because linear regression predicts a continuous numeric output, not a binary class; using it for classification would violate the assumption of normally distributed errors and produce unbounded predictions. Option D is wrong because K-means clustering is an unsupervised learning algorithm used for grouping unlabeled data into clusters, not for predicting a known binary target variable.

793
MCQmedium

A company is using Amazon Bedrock to build an application that requires very low latency responses (under 100ms). They are currently using a large model but need faster inference. Which model selection strategy is MOST appropriate?

A.Increase the temperature parameter to speed up generation
B.Use a larger model with more parameters for better accuracy
C.Switch to a smaller model that can meet the latency requirement while still providing acceptable quality
D.Use batch processing instead of real-time streaming
AnswerC

Smaller models have lower computational cost and faster inference, reducing latency.

Why this answer

Smaller models typically have lower latency due to fewer parameters and faster inference. Switching to a smaller model can meet latency requirements.

794
MCQeasy

A financial services company uses Amazon Rekognition to verify customer identities. To ensure responsible AI practices, which measure should the company prioritize?

A.Use only black-box models to protect intellectual property
B.Increase model complexity to improve accuracy
C.Minimize the amount of training data collected
D.Regularly audit the model for demographic bias
AnswerD

Bias audits are essential for fairness.

Why this answer

Option D is correct because regularly auditing the model for demographic bias is a core responsible AI practice, especially for identity verification systems where biased outcomes could lead to unfair treatment of certain customer groups. Amazon Rekognition's facial analysis and comparison features must be tested across diverse demographics to ensure equitable performance, as bias can arise from imbalanced training data or algorithmic artifacts.

Exam trap

The trap here is that candidates may confuse 'responsible AI' with generic model optimization (like increasing accuracy or reducing data), but the exam specifically tests the principle of fairness through bias auditing and transparency.

How to eliminate wrong answers

Option A is wrong because using only black-box models contradicts responsible AI principles; explainability and transparency are critical for auditing bias and ensuring fairness, and black-box models obscure how decisions are made, making it harder to detect issues. Option B is wrong because increasing model complexity does not inherently improve accuracy and can amplify bias or reduce interpretability; responsible AI prioritizes balanced performance and fairness over raw accuracy. Option C is wrong because minimizing training data can exacerbate bias by underrepresenting certain demographic groups, leading to poor generalization and unfair outcomes; responsible AI requires diverse, representative datasets.

795
Multi-Selectmedium

A company is building a content generation application using Amazon Bedrock. They need to ensure that the model does not generate offensive content and also avoids discussing certain prohibited topics. Which TWO Bedrock features should be combined to achieve this?

Select 2 answers
A.Bedrock Model Evaluation
B.Bedrock Guardrails topic denial
C.Bedrock Guardrails content filters
D.Bedrock Knowledge Bases
E.Bedrock Agents
AnswersB, C

Topic denial prevents the model from discussing specified prohibited topics.

Why this answer

Bedrock Guardrails provide content filtering (offensive content) and topic denial (prohibited topics). Knowledge Bases are for RAG, not content control. Agents orchestrate tasks.

Model evaluation tests performance. The correct combination is content filtering and topic denial, both part of Guardrails.

796
Multi-Selectmedium

A company is building a legal document search system. They need to find relevant documents based on natural language queries. Which TWO AWS services or features should they combine to implement this? (Select TWO.)

Answer options not yet available.

Why this answer

Amazon Bedrock's embedding model generates vector embeddings, and a vector database (e.g., Amazon Aurora with pgvector, Amazon OpenSearch Serverless with vector engine) stores and searches those embeddings.

797
MCQhard

Refer to the exhibit. A developer sees this error when calling Amazon Bedrock for inference. What is the MOST likely cause and recommended solution?

A.The model ID is incorrect; use a different model
B.The prompt is too long; reduce the number of tokens in the prompt
C.The request rate exceeds the model's throughput limit; implement retries with exponential backoff
D.Increase the max_tokens_to_sample value
AnswerC

Throttling is due to rate limits; exponential backoff handles it.

Why this answer

Option A is correct. The error indicates throttling (rate exceeded). Retries with exponential backoff handle transient throttling.

Option B (fix prompt) is unrelated. Option C (change model) not needed. Option D (increase max_tokens) could exacerbate the issue.

798
Multi-Selecthard

Which TWO of the following are key components of a responsible AI governance framework?

Select 2 answers
A.Develop and enforce AI ethics policies and standards
B.Focus solely on compliance with legal regulations
C.Minimize human involvement in AI lifecycle decisions
D.Conduct regular bias and fairness impact assessments
E.Deploy AI models as black boxes to avoid scrutiny
AnswersA, D

Policies provide the foundation for governance.

Why this answer

Option A is correct because a responsible AI governance framework must include the development and enforcement of AI ethics policies and standards to ensure alignment with societal values, fairness, and accountability. These policies guide the design, deployment, and monitoring of AI systems, embedding ethical principles such as transparency, privacy, and non-discrimination into the AI lifecycle. Without such policies, organizations risk deploying AI that violates ethical norms or regulatory expectations.

Exam trap

The AIF-C01 exam often tests the distinction between mere legal compliance and comprehensive ethical governance, trapping candidates who think that meeting regulatory requirements alone constitutes responsible AI, while ignoring proactive fairness and transparency measures.

799
MCQmedium

An AI practitioner is evaluating a text generation model and notices that the model sometimes produces plausible-sounding but factually incorrect statements. What is this phenomenon called?

A.Hallucination
B.Catastrophic forgetting
C.Bias amplification
D.Overfitting
AnswerA

Hallucination is the generation of false or nonsensical information that appears credible.

Why this answer

Hallucination in LLMs refers to generating content that is not grounded in the training data or provided context. It is a known challenge for generative models.

800
MCQmedium

A data scientist is using Amazon Bedrock to build a question-answering system over a large corpus of technical manuals. They want to ensure that the model's answers are grounded in the retrieved documents and that the model does not hallucinate. Which feature should they enable?

A.Embedding model with higher dimensionality
B.Larger chunk sizes in the knowledge base
C.Bedrock Guardrails with grounding support
D.Bedrock Agents with a multi-step reasoning prompt
AnswerC

Grounding checks ensure the model's output is supported by the provided context (retrieved documents), reducing ungrounded or hallucinated answers.

Why this answer

Bedrock Knowledge Bases provides source attribution, and when combined with model inference, the model can be instructed to answer only from the retrieved chunks. However, Bedrock Guardrails' grounding check specifically verifies that the model's response is supported by the retrieved context, reducing hallucination.

801
MCQeasy

A developer is using Amazon Bedrock's Claude model to summarize long documents. The developer notices that the summaries sometimes miss key points. Which parameter adjustment is most likely to improve summary completeness?

A.Increase the max_tokens parameter.
B.Increase the top_k parameter.
C.Increase the temperature parameter.
D.Increase the top_p parameter.
AnswerA

More tokens allow the model to include more details in the summary.

Why this answer

Increasing max_tokens allows the model to generate longer outputs, which is essential when summarizing long documents because the summary may need more tokens to capture all key points. If max_tokens is too low, the model truncates the response, potentially omitting important details. This directly addresses the issue of missing key points by providing sufficient output length for a complete summary.

Exam trap

The AIF-C01 exam often tests the misconception that parameters controlling randomness (temperature, top_k, top_p) affect output length or completeness, when in fact they only influence token selection diversity and creativity.

How to eliminate wrong answers

Option B is wrong because increasing top_k controls the number of highest-probability tokens considered during sampling, which affects randomness and diversity, not the length or completeness of the output. Option C is wrong because increasing temperature increases randomness in token selection, which can lead to more creative but less focused summaries, potentially worsening completeness. Option D is wrong because increasing top_p (nucleus sampling) also controls randomness by selecting tokens with cumulative probability, and does not extend the output length or guarantee inclusion of key points.

802
MCQhard

A company uses Amazon Bedrock with a Knowledge Base for RAG. Users report that the assistant gives incorrect answers for questions that require understanding of data tables. After reviewing, the team suspects the chunking strategy is breaking table structures. Which change would BEST preserve the integrity of tabular data?

A.Use a different vector store like Pinecone
B.Switch from fixed-size chunking to semantic chunking that respects table boundaries
C.Increase the chunk overlap to 50%
D.Decrease the chunk size to 100 tokens
AnswerB

Semantic chunking keeps tables intact, preventing fragmentation across chunks.

Why this answer

Preserving table structure is best achieved by chunking based on table boundaries (e.g., markdown table row groups) with larger chunk overlap to maintain context. Semantic chunking that keeps entire tables intact is ideal.

803
MCQmedium

A company is building a document summarization application using Amazon Bedrock. They want to prototype the application quickly by testing different models and prompts interactively. Which AWS service or feature should they use?

A.Amazon SageMaker Studio
B.Amazon Bedrock Playground
C.Amazon Bedrock Studio
D.AWS Cloud9
AnswerB

The Playground allows quick interactive testing of models and prompts, ideal for prototyping.

Why this answer

Amazon Bedrock Playground provides an interactive web-based interface to experiment with different models, prompts, and configurations without writing code.

804
MCQeasy

A company is building a chatbot to answer customer queries using Amazon Lex. The development team has created a large dataset of customer interactions and intends to use Amazon SageMaker to train a custom machine learning model for natural language understanding (NLU). The team wants to integrate the trained model with Amazon Lex to handle intents and slots. The team has limited experience with SageMaker and wants to minimize operational overhead. Which solution should the team use?

A.Encapsulate the custom model in a Docker container, push it to Amazon ECR, and create a custom machine learning resource in Amazon Lex to invoke the container directly.
B.Train a custom model in SageMaker using a built-in algorithm like BlazingText, then deploy it to a SageMaker endpoint and integrate with Lex via a AWS Lambda function that calls the endpoint.
C.Use Amazon Comprehend to perform sentiment analysis and entity recognition, then map the results to Lex intents using Lambda.
D.Use SageMaker Autopilot to automatically build and train the best model, then deploy to a SageMaker endpoint and use Lambda to invoke it for Lex integration.
AnswerD

SageMaker Autopilot automates the machine learning process, minimizing manual effort, and the trained model can be deployed to an endpoint and integrated with Lex via Lambda.

Why this answer

Option D is correct because SageMaker Autopilot automates model building, tuning, and deployment, reducing the need for manual intervention and expertise. Option A requires manual algorithm selection and tuning. Option B uses Amazon Comprehend, which provides general-purpose NLP but does not allow for custom NLU model training.

Option C is not supported because Amazon Lex does not directly invoke custom Docker containers; integration is typically done via Lambda.

805
MCQhard

A research team is using Amazon Bedrock to analyze scientific papers. They want the model to generate answers based only on papers published after 2023. Which approach should they use?

A.Fine-tune the model on a dataset of post-2023 papers and deploy it.
B.Set the maxTokens to a low value to force the model to rely on recent context.
C.Include a system prompt instructing the model to ignore data before 2023.
D.Use Amazon Bedrock Knowledge Bases with a metadata filter to retrieve only papers published after 2023, and generate responses based on retrieved content.
AnswerD

Metadata filtering ensures only relevant recent documents are used, grounding the model in current data.

Why this answer

Option D is correct because Amazon Bedrock Knowledge Bases with a metadata filter allows you to restrict retrieval to only documents that match specific metadata criteria, such as publication year. By filtering the vector search to only include papers published after 2023, the model generates responses based solely on that retrieved content, ensuring it does not rely on pre-2023 data. This approach is the only one that guarantees the model's answers are grounded exclusively in the specified time range.

Exam trap

AWS often tests the misconception that a system prompt or fine-tuning can reliably restrict a model's knowledge to a specific time period, when in fact only a retrieval-based approach with metadata filtering can enforce such temporal constraints.

How to eliminate wrong answers

Option A is wrong because fine-tuning the model on a dataset of post-2023 papers does not prevent the model from using its pre-existing training data (which includes pre-2023 knowledge) during inference; fine-tuning adjusts weights but does not erase prior knowledge, so the model could still generate answers based on older information. Option B is wrong because setting maxTokens to a low value limits the length of the generated response but does not control the temporal scope of the model's knowledge; the model can still draw on pre-2023 training data regardless of token count. Option C is wrong because a system prompt instructing the model to ignore data before 2023 is merely a suggestion and not a technical enforcement; the model has no inherent mechanism to filter its own training data by date, so it may still generate answers based on pre-2023 information, especially if the prompt is not strictly followed.

806
MCQmedium

A financial services firm wants to deploy a generative AI application that answers customer questions about account balances and recent transactions. The firm has strict latency requirements (responses under 2 seconds) and wants to minimize costs. Which strategy for model selection and deployment is MOST appropriate?

A.Select a smaller, faster foundation model (e.g., Amazon Titan Text Lite) and use on-demand inference
B.Use the largest available foundation model via on-demand inference for highest accuracy
C.Fine-tune a large model specifically on account data and deploy on a dedicated endpoint
D.Deploy a large model using Provisioned Throughput to guarantee low latency
AnswerA

A smaller model reduces latency and cost while still handling simple Q&A tasks well; on-demand avoids upfront costs.

Why this answer

For latency-sensitive and cost-conscious applications, selecting a smaller, faster model is preferable over a large model or custom deployment. Provisioned Throughput for dedicated capacity would increase cost and may not be needed if the base model performs adequately.

807
Multi-Selectmedium

A company is using an LLM to generate customer support responses. They want to reduce hallucinations and improve the accuracy of the responses. Which TWO approaches are most effective? (Select TWO.)

Select 2 answers
A.Increase the temperature parameter to 1.0
B.Use a smaller model to reduce complexity
C.Remove all system prompts
D.Apply Bedrock Guardrails with contextual grounding check
E.Use Retrieval-Augmented Generation (RAG) to retrieve relevant documents
AnswersD, E

Contextual grounding check verifies that the model's response is supported by the retrieved sources, filtering out unsupported claims.

Why this answer

RAG grounds the model in retrieved facts, while Bedrock Guardrails with contextual grounding check validates responses against sources. Both are proven techniques to reduce hallucinations.

808
Multi-Selecthard

A company is deploying a customer service chatbot using a large language model (LLM) via Amazon Bedrock. The application must meet high accuracy for domain-specific queries, low latency, and be cost-effective. Which TWO strategies should the company adopt to achieve these goals? (Choose two.)

Select 2 answers
A.Store user prompts in a shared cache to reuse common queries.
B.Fine-tune the model on a large corpus of customer service transcripts to improve domain knowledge.
C.Use a Retrieval-Augmented Generation (RAG) architecture with a vector database for domain context.
D.Select a smaller, faster model that trades some accuracy for throughput.
E.Increase the model's maximum token limit to handle longer customer queries.
AnswersA, C

Caching frequent queries reduces latency and cost by avoiding repeated model invocations.

Why this answer

Retrieval-Augmented Generation (RAG) provides domain-specific context without full fine-tuning, reducing cost and latency. Caching responses for common queries reduces latency. Option A is not necessarily cost-effective; fine-tuning is expensive and may be overkill.

Option B is not good practice; it reduces security. Option D is overkill for latency; model choice should be driven by capability, not just throughput.

809
Multi-Selecthard

An organization is evaluating different foundation models (FMs) on Amazon Bedrock for a legal document analysis task. Which THREE factors should they consider when selecting a model? (Choose 3.)

Select 3 answers
A.The region where the model is hosted
B.Model size (number of parameters)
C.Cost per inference call
D.Support for the specific language of the documents
E.Token limits for input and output
AnswersB, D, E

Larger models often have better understanding but higher cost.

Why this answer

Options A, B, and D are correct. Model size affects capability and cost, token limits determine the length of documents that can be processed, and language support is critical for legal documents. Option C (region) is not a model capability factor.

Option E (cost per inference) is operational but not a primary technical selection factor for this task.

810
MCQhard

A company is building a RAG application that indexes thousands of PDF documents. They notice that some documents are very long (hundreds of pages) and the vector search often returns irrelevant chunks. Which configuration change would MOST improve retrieval relevance?

A.Switch from Amazon OpenSearch Serverless to Pinecone
B.Increase the embedding dimension from 1024 to 4096
C.Use a larger, more capable foundation model for response generation
D.Adjust the chunk size and overlap to better capture context from the documents
AnswerD

Proper chunking ensures that each chunk contains complete, context-rich information, improving the relevance of retrieved passages.

Why this answer

Adjusting chunk size and overlap ensures that chunks contain coherent information. Increasing embedding dimension does not directly improve relevance, nor does changing the model size. Using a different vector store does not inherently fix chunking issues.

811
MCQmedium

A data scientist is training a neural network for image classification. The loss decreases rapidly for the first few epochs but then plateaus. Which technique is most likely to help the model continue improving?

A.Increase the number of epochs
B.Increase the batch size
C.Reduce the learning rate using a scheduler
D.Add more dropout layers
AnswerC

Learning rate decay can help the model escape plateaus and converge to a lower loss.

Why this answer

Reducing the learning rate allows the optimizer to take smaller steps, which can help the model converge to a better minimum after a plateau.

812
MCQmedium

A developer is using the Amazon Bedrock Converse API to build a multi-turn chatbot. They notice that after several exchanges, the model starts to forget earlier context. What is the MOST likely cause?

Answer options not yet available.

Why this answer

Each turn in the conversation increases total token count, and when it exceeds the model's context window, earlier messages are truncated.

813
MCQmedium

A machine learning team wants to detect bias in a deployed model's predictions on new data. They use Amazon SageMaker. Which service should they use to generate bias reports after deployment?

A.Amazon SageMaker Clarify
B.Amazon SageMaker Debugger
C.Amazon SageMaker Model Monitor
D.Amazon SageMaker Role Manager
AnswerA

Clarify can run bias metrics on predictions after deployment.

Why this answer

Amazon SageMaker Clarify provides bias detection and explainability for ML models, both during training and after deployment. SageMaker Model Monitor detects data drift but not bias. SageMaker Debugger is for training debugging.

SageMaker Role Manager is for managing IAM roles.

814
MCQeasy

Which AWS service provides a serverless API for accessing foundation models with per-token pricing?

A.Amazon Bedrock
B.Amazon API Gateway
C.AWS Lambda
D.Amazon SageMaker
AnswerA

Bedrock provides a serverless API with per-token pricing.

Why this answer

Amazon Bedrock is the managed service that offers a serverless API for foundation models, charging per token.

815
MCQhard

A security engineer is configuring logging for Amazon Bedrock model invocations. They need to capture both the input and output of all API calls for compliance audits. Which set of steps should they take?

A.Enable Bedrock model invocation logging and specify an S3 bucket and optionally CloudWatch Logs as the destination
B.Enable CloudTrail for the Bedrock API and configure S3 event notifications
C.Use VPC Flow Logs to capture network traffic and reconstruct model inputs from packet data
D.Enable AWS Config rules for Bedrock and stream logs to Amazon Kinesis
AnswerA

Model invocation logging captures the content of requests and responses and stores them in S3 or CloudWatch.

Why this answer

Bedrock model invocation logging captures inputs and outputs to S3 and/or CloudWatch Logs. CloudTrail records API calls but not the model inputs/outputs.

816
Multi-Selecthard

A company wants to enforce strict data residency for training data used in SageMaker. The data must never leave a specific AWS Region. Which THREE actions should they take? (Choose 3)

Select 3 answers
A.Use KMS customer managed keys specific to the region
B.Use AWS KMS multi-Region keys to encrypt data
C.Use S3 bucket policies to deny access if the request originates from outside the region
D.Enable cross-region replication for the S3 bucket
E.Create a VPC in the desired region and place all SageMaker resources in that VPC
AnswersA, C, E

Regional KMS keys ensure data can only be decrypted in that region.

Why this answer

To enforce data residency, the company must use a VPC in the desired region, configure S3 bucket policies to block cross-region access, and use KMS keys regional to that region.

817
MCQmedium

An IAM policy allows creation of SageMaker training jobs only if they use a specific VPC security group. A user tries to create a training job without specifying that security group. What will happen?

A.The request will succeed but SageMaker will ignore the condition
B.The request will succeed because the condition is optional
C.The request will be denied because the training job resource ARN is invalid
D.The request will be denied with an AccessDenied error
AnswerD

The IAM condition is not satisfied, so the request is denied.

Why this answer

Option D is correct because IAM policies are evaluated before any AWS API action is executed. If the policy includes a condition that requires a specific VPC security group for SageMaker training jobs, and the user's request does not include that security group, the condition is not met, resulting in an explicit deny (AccessDenied error). AWS IAM denies the request by default if the condition in a policy is not satisfied, regardless of whether the condition is marked as optional in the API.

Exam trap

The trap here is that candidates assume an optional API parameter means the IAM condition is also optional, but IAM conditions are strictly enforced regardless of whether the parameter is required by the API.

How to eliminate wrong answers

Option A is wrong because IAM policies do not ignore conditions; if a condition is not met, the request is denied, not silently ignored. Option B is wrong because the condition is not optional from an IAM perspective; even if the API parameter is optional, the IAM policy condition must be satisfied for the request to be allowed. Option C is wrong because the training job resource ARN is not invalid; the request is denied due to the policy condition, not due to an ARN format issue.

818
MCQeasy

A data science team is fine-tuning a Llama 2 7B model on Amazon SageMaker for a text classification task. After the first training run, they notice the loss is not decreasing and the model is overfitting to the small training set. What should the team change to mitigate overfitting?

A.Add dropout layers and reduce the learning rate.
B.Increase the number of epochs to allow the model to learn more patterns.
C.Increase the batch size and use gradient accumulation.
D.Remove dropout layers from the model architecture.
AnswerA

Dropout randomly drops neurons to prevent co-adaptation, and a lower learning rate helps stabilize training, both reducing overfitting.

Why this answer

Option D is correct because increasing dropout and reducing learning rate are standard regularization techniques. Option A is wrong because increasing batch size can slightly regularize but often insufficiently. Option B is wrong because increasing epochs typically worsens overfitting.

Option C is wrong because removing dropout reduces regularization, worsening overfitting.

819
MCQhard

A company has built a RAG application using Amazon Bedrock Knowledge Bases. Users report that answers are sometimes based on irrelevant or incorrect document chunks. The team has verified that the embedding model is appropriate and the documents are correctly indexed. What is the MOST likely cause of the poor retrieval quality?

A.The foundation model is too small
B.The prompt template is missing instructions
C.The vector store is too slow
D.The chunking strategy is suboptimal
AnswerD

Chunking size, overlap, and strategy heavily impact which chunks are retrieved. Incorrect chunking can lead to irrelevant results even with good embeddings.

Why this answer

Chunking strategy directly affects retrieval relevance. If chunks are too large, they may contain irrelevant information; if too small, they may miss context. Overlap size also matters.

Optimizing chunking often fixes relevance issues when embeddings are correct.

820
Multi-Selecteasy

Which TWO AWS services can be used to monitor and detect security anomalies in Amazon SageMaker model inference data? (Choose TWO.)

Select 2 answers
A.Amazon Macie
B.AWS CloudTrail
C.Amazon CodeGuru Security
D.Amazon SageMaker Model Monitor
E.Amazon CloudWatch Logs
AnswersD, E

Model Monitor detects data drift and anomalies in inference data.

Why this answer

Amazon SageMaker Model Monitor is specifically designed to detect deviations in model quality, such as data drift and feature attribution drift, by continuously monitoring inference data against a baseline. Amazon CloudWatch Logs can be used to capture and analyze inference request logs, enabling custom anomaly detection through log-based metrics and alarms. Together, they provide a comprehensive approach to monitoring security anomalies in SageMaker model inference data.

Exam trap

The trap here is that candidates often confuse AWS CloudTrail (API auditing) with CloudWatch Logs (log monitoring), or assume Macie can monitor any data flow, when it is restricted to S3 object-level sensitive data discovery.

821
MCQeasy

Refer to the exhibit. A developer is reviewing CloudWatch Logs for a deployed model and notices the same input appears multiple times with slightly different probabilities. What responsible AI concern does this pattern suggest?

A.The model is overfitting to the training data.
B.The model is not robust; it produces inconsistent predictions for the same input.
C.The model is exhibiting bias against a demographic group.
D.The input data is drifting from the training distribution.
AnswerB

Identical inputs should yield identical outputs; variation indicates instability.

Why this answer

Option B is correct. Repeated identical inputs with different predictions indicate model instability (lack of robustness). This could be due to randomness in the model or adversarial conditions.

Option A is irrelevant; C is possible but not the primary concern; D is about data drift, but input is same.

822
MCQhard

A healthcare company uses Amazon SageMaker to train a model that predicts patient readmission risk based on electronic health records (EHRs) stored in Amazon HealthLake. The training dataset contains 2 million records from the past three years, with a significant gender imbalance: 70% male and 30% female. The model achieved high overall accuracy, but further analysis using SageMaker Clarify revealed that the precision for female patients is 0.65 while for male patients it is 0.88. Additionally, the model's false positive rate for female patients is significantly higher. The company must comply with healthcare regulations that require fairness and non-discrimination. The data science team has already used SageMaker Data Wrangler for initial preprocessing and SageMaker Clarify for bias detection. They need to take immediate action to mitigate the bias before deploying to production. Which course of action should the team take?

A.Use SageMaker Clarify's bias mitigation feature to apply reweighing techniques and retrain the model with adjusted sample weights.
B.Use SageMaker Clarify to generate SHAP values and adjust the model's feature importance by removing biased features.
C.Use SMOTE (Synthetic Minority Oversampling Technique) to balance the training dataset before retraining.
D.Use SageMaker Model Monitor to detect feature drift and automatically retrain the model with updated data.
AnswerA

This directly mitigates bias by reweighting training samples to reduce disparity.

Why this answer

The correct answer is to use SageMaker Clarify's built-in bias mitigation technique (reweighing) as it directly addresses the disparity by adjusting sample weights during training. Option A: Model Monitor is for monitoring drift, not mitigation. Option B: SHAP values explain predictions but do not change model behavior.

Option C: SMOTE addresses class imbalance but not fairness in terms of group accuracy disparity; it may even worsen bias. Therefore, D is the best choice.

823
MCQmedium

A company uses Amazon Bedrock to generate marketing copy. They want to measure the quality of generated text compared to reference text. Which metric is most appropriate?

A.F1 score
B.BLEU
C.RMSE
D.Accuracy
AnswerB

BLEU calculates n-gram overlap between candidate and reference text, suitable for generation evaluation.

Why this answer

BLEU (Bilingual Evaluation Understudy) is the most appropriate metric for evaluating the quality of generated text against reference text in tasks like machine translation and text generation. It measures n-gram precision between the generated and reference texts, making it ideal for assessing marketing copy generated by Amazon Bedrock.

Exam trap

AWS often tests the distinction between classification/regression metrics and text generation metrics, leading candidates to mistakenly apply F1 score or accuracy to evaluate generated text quality instead of using BLEU or similar sequence-based metrics.

How to eliminate wrong answers

Option A is wrong because F1 score is a classification metric that measures harmonic mean of precision and recall, not suitable for evaluating text generation quality against reference text. Option C is wrong because RMSE (Root Mean Square Error) is a regression metric used for continuous numerical predictions, not for text or sequence evaluation. Option D is wrong because Accuracy is a classification metric that measures the proportion of correct predictions, which does not account for the sequential and linguistic nuances of generated text.

824
MCQmedium

A company is developing a chatbot using Amazon Bedrock and wants to ensure the model's responses do not include toxic or biased language. The company has a labeled dataset of undesirable responses. Which approach should be used to fine-tune the foundation model to reduce harmful outputs?

A.Use reinforcement learning from human feedback (RLHF) with a reward model trained on human preferences.
B.Perform supervised fine-tuning on a curated dataset of safe responses.
C.Use prompt engineering to instruct the model to avoid toxic language.
D.Implement adversarial validation by testing against toxic inputs.
AnswerA

RLHF uses human feedback to train a reward model, which then guides the base model to generate safer outputs.

Why this answer

Reinforcement learning from human feedback (RLHF) is the correct approach because it directly optimizes the model to avoid toxic or biased outputs by training a reward model on human-labeled preferences. The reward model scores the model's responses, and the foundation model is fine-tuned via reinforcement learning to maximize these scores, effectively reducing harmful language. This method is specifically designed to align model behavior with nuanced human values, such as avoiding toxicity, which supervised fine-tuning alone cannot guarantee.

Exam trap

The AIF-C01 exam often tests the misconception that supervised fine-tuning or prompt engineering alone can reliably eliminate harmful outputs, when in fact RLHF is required to align the model with nuanced human preferences through iterative feedback.

How to eliminate wrong answers

Option B is wrong because supervised fine-tuning on a curated dataset of safe responses teaches the model to mimic safe patterns but does not explicitly penalize toxic outputs during generation; it lacks a reward signal to discourage harmful language when the model deviates from the training distribution. Option C is wrong because prompt engineering is a static, instruction-based technique that can be easily bypassed by adversarial inputs or subtle variations in phrasing; it does not modify the model's internal weights to reliably avoid toxic language. Option D is wrong because adversarial validation only tests the model's robustness to toxic inputs without fine-tuning the model itself; it identifies vulnerabilities but does not reduce harmful outputs in production.

825
MCQeasy

A startup is building a code generation assistant using a large language model. They want to evaluate the quality of generated code compared to reference implementations. Which automated metric is MOST suitable for this task?

A.BERTScore
B.BLEU
C.ROUGE
D.Accuracy
AnswerB

BLEU measures n-gram precision and is widely used for code translation/generation tasks.

Why this answer

BLEU is commonly used for code generation evaluation, measuring n-gram overlap with reference code. ROUGE is for summarization, BERTScore for semantic similarity, and accuracy is for classification.

Page 10

Page 11 of 14

Page 12