AWS Certified AI Practitioner (AIF-C01): Questions 376–450

500 questions total · 7 pages · All types, answers revealed

376

MCQhard

Refer to the exhibit. A developer deploys this CloudFormation stack but the agent fails to query the knowledge base. What is a likely cause?

A.The KnowledgeBaseId is not passed correctly

B.The agent role does not have permissions to invoke the knowledge base

C.The embedding model is not available in the region

D.The OpenSearch collection type should be SEARCH not VECTORSEARCH

AnswerB

The agent's IAM role must have permission to query the knowledge base (for example, `bedrock:Retrieve`).

Why this answer

The correct answer is B because the agent role must have an IAM policy that grants the `bedrock:Retrieve` and `bedrock:RetrieveAndGenerate` permissions on the knowledge base. Without these permissions, the agent cannot invoke the knowledge base, even if the KnowledgeBaseId is correctly passed and the embedding model is available.

Exam trap

AWS often tests the distinction between resource creation permissions and runtime invocation permissions, trapping candidates who assume that a successful stack deployment implies all runtime permissions are correctly configured.

How to eliminate wrong answers

Option A is wrong because if the KnowledgeBaseId were not passed correctly, the stack would likely fail during creation or the agent would receive a different error (e.g., resource not found), not a generic failure to query. Option C is wrong because if the embedding model were not available in the region, the CloudFormation stack itself would fail during creation of the knowledge base, not during a subsequent query. Option D is wrong because the OpenSearch collection type for a knowledge base must be `VECTORSEARCH` to store and query vector embeddings; `SEARCH` is used for full-text search and does not support the vector similarity search required by the knowledge base.
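
The runtime permission described above can be sketched as an IAM policy document. This is a minimal illustration, not the stack from the exhibit: the ARN, account ID, and knowledge base ID are hypothetical placeholders, and the helper name `build_kb_policy` is invented for this example.

```python
import json

# Hypothetical knowledge base ARN; replace region, account, and ID with real values.
KNOWLEDGE_BASE_ARN = (
    "arn:aws:bedrock:us-east-1:111122223333:knowledge-base/EXAMPLEKBID"
)


def build_kb_policy(kb_arn: str) -> dict:
    """Return an IAM policy document that lets a role query one knowledge base."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowKnowledgeBaseQuery",
                "Effect": "Allow",
                # Runtime query actions named in the explanation above.
                "Action": ["bedrock:Retrieve", "bedrock:RetrieveAndGenerate"],
                "Resource": kb_arn,
            }
        ],
    }


policy = build_kb_policy(KNOWLEDGE_BASE_ARN)
print(json.dumps(policy, indent=2))
```

A policy like this would be attached to the agent's service role; a stack can deploy successfully without it and still fail at query time.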

377

MCQmedium

A company is using Amazon Bedrock to build a text-to-SQL application. They want to ensure that the generated SQL queries are valid and safe. Which approach is BEST?

A.Fine-tune the model on a dataset of valid SQL queries

B.Use a separate model to validate the SQL after generation

C.Configure a guardrail to filter and validate the generated SQL

D.Limit the max_tokens to 50 to reduce complexity

AnswerC

Guardrails can enforce rules and reject invalid queries.

Why this answer

Amazon Bedrock guardrails provide a native, configurable mechanism to filter and validate model outputs, including SQL queries, against defined policies such as regex patterns, denied topics, and content filters. This approach directly addresses both validity and safety without requiring additional model training or external validation services, making it the most integrated and efficient solution for ensuring generated SQL is syntactically correct and free of harmful operations like DROP or DELETE.

Exam trap

AWS often tests the misconception that fine-tuning or output length limits can solve safety and validity requirements, when in fact guardrails are the purpose-built Amazon Bedrock feature for content filtering and validation at inference time.

How to eliminate wrong answers

Option A is wrong because fine-tuning on valid SQL queries improves the model's ability to generate syntactically correct SQL but does not guarantee safety; a fine-tuned model can still produce harmful queries (e.g., DROP TABLE) if the training data includes such patterns or if the model generalizes incorrectly. Option B is wrong because using a separate model for validation introduces additional latency, cost, and complexity, and still requires a policy or rule set to define what constitutes 'valid and safe'—which is exactly what Bedrock guardrails already provide natively. Option D is wrong because limiting max_tokens to 50 does not ensure SQL validity or safety; it only truncates output, potentially producing incomplete or syntactically invalid SQL, and does not prevent generation of dangerous commands.
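
To make the idea of a pattern-based deny rule concrete, here is a small local sketch. It is not the Bedrock Guardrails API; it only illustrates the kind of regex rule the explanation mentions, and the keyword list and function name `is_allowed` are assumptions chosen for the example.

```python
import re

# Hypothetical deny list of destructive SQL keywords (an assumption for this sketch).
DENIED_KEYWORDS = re.compile(r"\b(DROP|DELETE|TRUNCATE|ALTER)\b", re.IGNORECASE)


def is_allowed(sql: str) -> bool:
    """Return True when the statement contains none of the denied keywords."""
    return DENIED_KEYWORDS.search(sql) is None


print(is_allowed("SELECT name FROM customers WHERE id = 1"))
print(is_allowed("DROP TABLE customers"))
```

A keyword rule like this catches obviously destructive statements but does not prove that a query is syntactically valid; that would need a SQL parser or a dry run against the database.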

378

MCQeasy

A developer is testing different prompts for a text generation model on Amazon Bedrock. Which parameter controls the randomness of the model's output?

A.top_p

B.stop_sequences

C.temperature

D.max_tokens

AnswerC

Temperature directly scales the logits before softmax, controlling the randomness of token selection.

Why this answer

Option C is correct because temperature controls the randomness of the model's predictions. Lower values make output more deterministic; higher values increase randomness. Option A (top_p) is nucleus sampling, which limits the pool of candidate tokens rather than rescaling their probabilities.

Option B (stop_sequences) defines stopping criteria. Option D (max_tokens) controls output length.
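
The effect of temperature on token probabilities can be shown with a short sketch. The logits below are made-up values for three candidate tokens, and the function name is chosen for this example only.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 2.0)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low temperature nearly all probability mass sits on the top-scoring token, while at the high temperature the distribution is flatter, so sampling becomes more varied.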

379

MCQhard

An organization is using Amazon Bedrock to power a customer service chatbot. They notice that the chatbot occasionally generates hallucinated information about product specifications. Which strategy should be implemented to reduce hallucinations?

A.Fine-tune the model on a dataset of product specification conversations.

B.Integrate a Retrieval Augmented Generation (RAG) system with the product catalog.

C.Use more detailed prompts with explicit instructions to avoid speculation.

D.Increase the temperature parameter to make outputs more conservative.

AnswerB

RAG provides up-to-date, factual context to the model, reducing hallucinations.

Why this answer

Retrieval Augmented Generation (RAG) grounds the model's responses in authoritative, up-to-date product catalog data, directly reducing hallucinations by ensuring the chatbot references verified facts rather than relying solely on its parametric memory. This is the most effective strategy because it provides a retrieval-based factual foundation that fine-tuning or prompt engineering alone cannot guarantee.

Exam trap

AWS often tests the misconception that prompt engineering or fine-tuning alone can solve hallucination problems, when in fact they lack the dynamic, verifiable grounding that RAG provides.

How to eliminate wrong answers

Option A is wrong because fine-tuning on product specification conversations may reinforce patterns from the training data but does not prevent the model from generating plausible-sounding but incorrect details when faced with queries outside the fine-tuned distribution; it also cannot dynamically incorporate real-time catalog updates. Option C is wrong because while more detailed prompts can reduce speculation, they do not provide the model with access to external, authoritative data—hallucinations can still occur when the model's internal knowledge is incomplete or outdated. Option D is wrong because increasing the temperature parameter makes outputs more random and creative, not more conservative; decreasing temperature would make outputs more deterministic and less prone to hallucination, but even low temperature cannot eliminate hallucinations without a retrieval mechanism.
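
The retrieve-then-prompt flow behind RAG can be sketched in a few lines. This toy version uses keyword overlap instead of vector embeddings, and the catalog entries, function names, and prompt wording are all hypothetical; a production system would use an embedding model and a vector store.

```python
import re

# Hypothetical product catalog standing in for the authoritative data source.
CATALOG = [
    "Widget 100: weight 1.2 kg, battery life 10 hours, waterproof to 1 m.",
    "Widget 200: weight 0.8 kg, battery life 6 hours, not waterproof.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many tokens they share with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(
        documents, key=lambda doc: len(q_tokens & tokenize(doc)), reverse=True
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Place the retrieved passages in the prompt so the model answers from them."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


prompt = build_prompt("What is the battery life of the Widget 200?", CATALOG)
print(prompt)
```

Because the catalog is read at query time, updating a specification changes the next answer without retraining the model.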

380

MCQeasy

A company is using Amazon Bedrock to build a generative AI application. The company wants to prevent the model from generating toxic or harmful content while still allowing creative responses. Which feature should the company enable?

A.Amazon Bedrock Guardrails with content filters.

B.AWS Key Management Service (KMS) to encrypt model responses.

C.AWS Identity and Access Management (IAM) policies to restrict model output.

D.Amazon CloudWatch Logs to monitor and block harmful content.

AnswerA

Guardrails provide configurable content filters to block harmful output without overly restricting creativity.

Why this answer

Amazon Bedrock Guardrails with content filters is the correct feature because it allows the company to define and enforce policies that block toxic or harmful content in model inputs and outputs, while still permitting creative responses within safe boundaries. This feature provides configurable thresholds for content categories like hate, insults, and sexual content, enabling precise control over model behavior without restricting overall creativity.

Exam trap

The trap here is that candidates may confuse security services (like KMS for encryption or IAM for access control) with content moderation capabilities, assuming any AWS security service can filter model outputs, when in fact only Bedrock Guardrails provides purpose-built content filters for generative AI.

How to eliminate wrong answers

Option B is wrong because AWS KMS encrypts data at rest and in transit but does not inspect or filter model responses for toxic content; encryption ensures confidentiality, not content safety. Option C is wrong because IAM policies control access to AWS resources and actions (e.g., who can invoke a model) but cannot restrict the actual text output of a model; they are for authorization, not content moderation. Option D is wrong because Amazon CloudWatch Logs can monitor and store logs for analysis but cannot actively block harmful content in real-time; it is a logging and monitoring service, not a content filter.

381

MCQeasy

A startup is deploying a foundation model on Amazon SageMaker for real-time inference. They notice high latency (over 2 seconds per request). Which action is most likely to reduce latency?

A.Enable auto-scaling on the SageMaker endpoint to handle more concurrent requests.

B.Switch to a smaller, distilled version of the model.

C.Deploy the model on a CPU-based instance instead of GPU.

D.Increase the batch size parameter in the inference request.

AnswerB

Smaller models have fewer parameters, reducing computation time and latency.

Why this answer

Option B is correct because using a smaller, distilled version of the model directly reduces the computational complexity per inference request. Distillation compresses the model by training a smaller student network to mimic a larger teacher model, resulting in fewer parameters and faster forward passes. This is the most direct way to cut latency when the model size is the bottleneck, as it reduces the number of floating-point operations (FLOPs) required per request.

Exam trap

AWS often tests the distinction between latency (time per single request) and throughput (requests per second), so candidates mistakenly choose auto-scaling or batch size increases, which improve throughput but not per-request latency.

How to eliminate wrong answers

Option A is wrong because enabling auto-scaling adds more endpoint instances to handle higher concurrency, but it does not reduce the latency of a single inference request; it only improves throughput under load. Option C is wrong because CPU-based instances are generally slower for deep learning inference than GPU instances, especially for large foundation models, so switching to CPU would increase latency, not reduce it. Option D is wrong because increasing the batch size in the inference request means processing multiple inputs together, which increases the time to first byte for each individual request and does not reduce per-request latency; it is a throughput optimization, not a latency reduction technique.
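
The latency versus throughput distinction can be checked with simple arithmetic. The sketch below assumes each instance handles one request at a time and uses made-up latency figures; real endpoints may process requests concurrently.

```python
def throughput_rps(per_request_latency_s, instances):
    """Requests per second, assuming each instance serves one request at a time."""
    return instances / per_request_latency_s


large_model_latency_s = 2.0   # hypothetical latency of the original model
distilled_latency_s = 0.5     # hypothetical latency of a distilled model

# Auto-scaling: more instances raise throughput, latency per request stays 2.0 s.
print(throughput_rps(large_model_latency_s, instances=1))
print(throughput_rps(large_model_latency_s, instances=4))

# Distillation: the same single instance now answers each request faster.
print(throughput_rps(distilled_latency_s, instances=1))
```

Under these assumptions, four instances of the large model and one instance of the distilled model reach the same throughput, but only the distilled model shortens the wait for an individual request.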

382

Multi-Selecthard

Which TWO of the following are valid methods to reduce the risk of foundation models generating harmful or biased content?

Select 2 answers

A.Use a smaller model

B.Use a content filter

C.Apply prompt engineering to guide output

D.Fine-tune the model on a biased dataset

E.Disable all logging

AnswersB, C

Content filters can block harmful outputs.

Why this answer

Option B is correct because content filters act as a safety layer that intercepts and blocks harmful or biased outputs before they reach the user. These filters can be rule-based or use a separate classifier model trained to detect toxic, hateful, or biased language, reducing the risk of harmful content generation without altering the underlying model.

Exam trap

AWS often tests the misconception that simply using a smaller model or disabling logging can reduce bias, when in fact these actions either have no effect or worsen the problem, whereas content filters and prompt engineering are direct, effective mitigation strategies.

383

MCQmedium

An e-commerce company uses Amazon Bedrock to generate product descriptions from keywords. Some descriptions contain inaccurate details about product specifications. Which approach should the company take to reduce factual errors?

A.Increase the maxTokens parameter to allow more detailed descriptions.

B.Use a different foundation model from Bedrock for each product category.

C.Deploy the model to a SageMaker endpoint and use human-in-the-loop validation.

D.Include the product specifications in the prompt and instruct the model to base the description on the provided data.

AnswerD

Providing facts in the prompt grounds the model's output and reduces fabrication.

Why this answer

Option D is correct because providing the product specifications directly in the prompt and instructing the model to base the description on that data grounds the generation in factual information, reducing hallucinations. This technique, known as prompt engineering with in-context learning, ensures the model uses the given data rather than relying on its training data, which may contain inaccuracies.

Exam trap

AWS often tests the misconception that increasing model parameters or changing models alone improves factual accuracy, when in fact prompt engineering with grounded data is the most effective and efficient method to reduce hallucinations.

How to eliminate wrong answers

Option A is wrong because increasing maxTokens only allows longer outputs but does not improve factual accuracy; it may even increase the chance of hallucinations by generating more unverified content. Option B is wrong because using a different foundation model for each category does not inherently reduce factual errors; all models can hallucinate, and this approach adds complexity without addressing the root cause of inaccurate specifications. Option C is wrong because deploying to a SageMaker endpoint with human-in-the-loop validation is an operational pattern for custom models, but it is overkill and inefficient for this use case; prompt engineering (Option D) is a simpler, more direct solution that avoids the latency and cost of human review for every generation.

384

MCQeasy

A company wants to classify customer emails into categories (e.g., complaint, inquiry, feedback) using a foundation model. Which approach is MOST efficient?

A.Use Amazon Comprehend for custom classification

B.Train a custom model using Amazon SageMaker

C.Fine-tune a large language model on labeled emails

D.Use Amazon Lex with a classifier intent

AnswerA

Comprehend provides a ready-to-use classification API.

Why this answer

Amazon Comprehend provides a managed custom classification API that is purpose-built for text classification tasks like categorizing emails. It requires only a small set of labeled data to train a custom classifier, eliminating the need to manage infrastructure or fine-tune large models, making it the most efficient choice for this specific use case.

Exam trap

AWS often tests the misconception that any NLP task requires a large language model or custom training in SageMaker, when in fact managed services like Comprehend are optimized for common classification tasks and are more efficient.

How to eliminate wrong answers

Option B is wrong because training a custom model using Amazon SageMaker involves provisioning instances, managing training jobs, and handling model deployment, which is overkill and less efficient for a straightforward text classification task that can be handled by a managed service. Option C is wrong because fine-tuning a large language model (LLM) on labeled emails is computationally expensive, requires significant expertise in prompt engineering and hyperparameter tuning, and is not the most efficient approach when a simpler, purpose-built service like Comprehend exists. Option D is wrong because Amazon Lex is designed for building conversational chatbots and intent-based routing, not for batch or real-time text classification of emails; its classifier intent feature is meant for dialog management, not document categorization.

385

MCQeasy

Refer to the exhibit. A security analyst is reviewing CloudTrail logs and notices a training job creation from an IP address (203.0.113.5) that is not associated with the company's network. What is the most likely cause?

A.The user john.doe is accessing the AWS Management Console from a VPN.

B.The CloudTrail log is being generated by a cross-account role.

C.The training job was created using the AWS CLI from an external machine.

D.The training job was created by a malicious actor who stole credentials.

AnswerA

A VPN would route traffic through an external IP; this is a common scenario for remote workers.

Why this answer

The IP address 203.0.113.5 belongs to a range reserved for documentation examples (RFC 5737) and is not associated with the company's network. The most likely cause is that user john.doe is accessing the AWS Management Console through a VPN, which would route traffic through the VPN's public IP rather than the corporate network. This explains why the source IP appears external while the user identity is legitimate.

Exam trap

AWS often tests the distinction between 'external IP' and 'unauthorized access'—the trap here is assuming any external IP indicates a security breach, when in fact VPN usage is a legitimate and common cause for such logs.

How to eliminate wrong answers

Option B is wrong because cross-account roles would show the source IP of the role's session, not necessarily an external IP, and the log would include a 'userIdentity' with 'arn:aws:sts::...' indicating assumed role, which is not described. Option C is wrong because using the AWS CLI from an external machine would still show the machine's public IP, but the question states the IP is 'not associated with the company's network'—this is a plausible scenario but less likely than a VPN, as the user identity (john.doe) suggests a legitimate user, not an external machine. Option D is wrong because while stolen credentials are possible, the question asks for the 'most likely cause' given the context of a legitimate user identity; a malicious actor would typically not use a known corporate username without additional suspicious activity.

386

MCQeasy

A developer wants to test different foundation models quickly without setting up infrastructure. Which AWS service allows interactive prompting and comparison of multiple models?

A.Amazon Comprehend

B.Amazon Bedrock Playground

C.Amazon Lex

D.Amazon SageMaker Studio

AnswerB

Bedrock offers a playground to interactively test and compare foundation models.

Why this answer

Amazon Bedrock provides a playground feature for testing models. Option A (Comprehend) is for text analysis. Option C (Lex) is for chatbots.

Option D (SageMaker Studio) is a full notebook environment.

387

MCQmedium

A data scientist is using Amazon SageMaker to train a model. The training job is taking longer than expected. Which change would most likely reduce training time?

A.Increase the number of training epochs

B.Use a larger batch size

C.Use a smaller instance type

D.Enable spot training

AnswerB

A larger batch size processes more samples per iteration, reducing the number of steps and overall time, provided the hardware supports it.

Why this answer

Using a larger batch size allows the model to process more training samples per iteration, which reduces the number of weight updates needed per epoch and can improve hardware utilization (e.g., GPU parallelism). This often leads to faster training times, provided the batch size fits within memory constraints and does not degrade model convergence.

Exam trap

AWS often tests the misconception that reducing instance size or enabling spot instances directly improves training speed, when in fact these changes primarily affect cost or resource availability, not performance.

How to eliminate wrong answers

Option A is wrong because increasing the number of training epochs increases the total number of passes over the data, which would lengthen training time, not reduce it. Option C is wrong because using a smaller instance type reduces compute capacity (e.g., fewer vCPUs, less memory), which typically slows down training rather than speeding it up. Option D is wrong because enabling spot training (using Amazon EC2 Spot Instances) reduces cost but does not inherently reduce training time; it may even cause interruptions that delay completion.
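
The link between batch size and the number of weight updates per epoch is simple arithmetic. The dataset size and batch sizes below are hypothetical values picked for illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of weight updates needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


num_samples = 50_000  # hypothetical training set size
print(steps_per_epoch(num_samples, batch_size=32))
print(steps_per_epoch(num_samples, batch_size=256))
```

Fewer steps only translate into less wall-clock time when the hardware can process the larger batch in parallel and the batch still fits in memory.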

388

MCQeasy

A startup is building a customer support chatbot using Amazon Bedrock with the Claude foundation model. The chatbot needs to answer questions based on a knowledge base of frequently asked questions (FAQs) stored in an Amazon S3 bucket. The team wants to implement Retrieval Augmented Generation (RAG) to provide accurate and context-aware responses. They are evaluating different approaches to integrate the knowledge base. What is the most efficient way to implement RAG with Bedrock?

A.Use AWS Lambda to fetch documents from S3 and inject them into the prompt.

B.Manually extract all FAQs and include them in the prompt each time the chatbot responds.

C.Fine-tune the Claude model on the FAQs so the model memorizes the knowledge base.

D.Use Amazon Bedrock Knowledge Bases to directly connect the S3 bucket and retrieve relevant documents for the prompt.

AnswerD

Bedrock Knowledge Bases provides a managed RAG solution with automatic indexing and retrieval.

Why this answer

Option D is correct. Amazon Bedrock Knowledge Bases provides a native feature to connect to data sources like S3, automatically chunk and index documents, and retrieve relevant information. This is the most efficient and managed approach.

Option A is a possible custom solution but is less efficient than using the built-in knowledge base feature. Option B is incorrect because manually including all FAQs in the prompt would exceed token limits and be impractical. Option C is incorrect because fine-tuning the model on FAQs is overkill for this use case and does not allow dynamic updates.

389

MCQmedium

A company is using Amazon Bedrock to generate images from text prompts. They need to ensure the generated images do not contain offensive content. Which feature should be enabled?

A.VPC endpoints

B.AWS WAF

C.Content moderation with AI

D.IAM policies

AnswerC

Bedrock's content moderation uses AI to detect and block offensive content.

Why this answer

Amazon Bedrock includes built-in content moderation that can filter harmful content in inputs and outputs. VPC endpoints (A) secure network traffic. WAF (B) protects web applications.

IAM policies (D) control access but not content.

390

Multi-Selectmedium

Which TWO factors are most important when selecting a foundation model in Amazon Bedrock for a text summarization task with strict latency requirements?

Select 2 answers

A.Average response latency per request.

B.Model size in billions of parameters.

C.Maximum input token limit.

D.Output quality and token efficiency for summarization tasks.

E.Availability of fine-tuning capability for domain adaptation.

AnswersA, D

Low latency is critical for real-time summarization.

Why this answer

Options A and D are correct. Response latency directly impacts user experience, so a model with low latency is essential. Output quality and token efficiency ensure the summaries are accurate and concise.

Option B is wrong because model size affects latency, but latency itself is the direct factor. Option C is wrong because the input token limit is relevant but not as critical as latency and quality for this use case. Option E is wrong because fine-tuning capability adds cost and does not address the latency requirement.

391

Multi-Selecteasy

Which TWO actions can help reduce the likelihood of hallucinations in a generative AI model used for question answering?

Select 2 answers

A.Increase the maximum token count to allow more complete answers.

B.Use Retrieval Augmented Generation (RAG) with a trusted knowledge base.

C.Fine-tune the model on the training data used for the application.

D.Set a lower temperature parameter (e.g., 0.1) to reduce randomness.

E.Use a larger foundation model with more parameters.

AnswersB, D

Grounding on real documents reduces hallucinations.

Why this answer

Options B and D are correct. Grounding the model on a knowledge base (RAG) reduces hallucinations by providing factual context. Reducing the temperature parameter makes the model more deterministic, lowering the chance of making up information.

Option A is wrong because increasing max tokens may allow more hallucinated content. Option C is wrong because fine-tuning on the same data that caused hallucinations may not fix the issue. Option E is wrong because a larger model does not by itself prevent hallucinations; without grounding it can still produce fluent but unsupported answers.

392

MCQeasy

A company wants to automatically summarize customer support tickets into a short paragraph. Which AWS service is MOST appropriate for this task?

A.Amazon Bedrock

B.Amazon Rekognition

C.Amazon Polly

D.Amazon Comprehend

AnswerA

Amazon Bedrock provides access to foundation models that can summarize text.

Why this answer

Option A is correct because Amazon Bedrock is a managed service offering pre-trained foundation models for tasks like text summarization. Option B (Amazon Rekognition) is for image/video analysis.

Option C (Amazon Polly) is text-to-speech. Option D (Amazon Comprehend) is for NLP tasks like entity extraction, not summarization.

393

MCQeasy

A retail company uses a recommendation system that occasionally suggests inappropriate products to minors. Which responsible AI practice should be applied?

A.Implement human review of flagged recommendations

B.Rely solely on user feedback to improve

C.Disable the recommendation system entirely

D.Increase the volume of training data

AnswerA

Human-in-the-loop ensures responsible oversight.

Why this answer

The correct practice is to implement human review of flagged recommendations. This aligns with the responsible AI principle of accountability, where automated systems must have oversight mechanisms to catch and correct inappropriate outputs, especially when minors are involved. Human-in-the-loop (HITL) validation ensures that edge cases or subtle context (e.g., age-inappropriate product suggestions) are caught before they reach end users, rather than relying solely on automated filters or feedback loops.

Exam trap

AWS often tests the misconception that more data or automation alone can solve fairness and safety issues, when in fact responsible AI requires explicit governance mechanisms like human oversight for high-stakes or vulnerable-user scenarios.

How to eliminate wrong answers

Option B is wrong because relying solely on user feedback to improve is reactive and can expose minors to harm before any corrective action is taken; feedback loops are slow and may not capture subtle or rare inappropriate recommendations. Option C is wrong because disabling the recommendation system entirely is an extreme, non-scalable response that eliminates business value and does not teach the system to behave responsibly; responsible AI aims to mitigate harm, not abandon functionality. Option D is wrong because increasing the volume of training data does not inherently address the problem of inappropriate recommendations; if the training data itself contains biased or unlabeled age-sensitive content, more data can amplify the issue rather than fix it.

394

MCQmedium

Refer to the exhibit. A user invoked a Claude model using provisioned throughput and received a ThrottlingException. Which is the most likely cause?

A.The model is not available in the region

B.The provisioned throughput request per minute limit was exceeded

C.The prompt was too long

D.The inference type should be ON_DEMAND

AnswerB

Throttling occurs when the request rate exceeds the allowed limit for the provisioned throughput.

Why this answer

Option B is correct. Provisioned throughput has a requests-per-minute limit, and exceeding it causes a ThrottlingException. Option A would produce a different error.

Option C would be a validation error, not throttling. Option D is not the cause because PROVISIONED is valid.

395

MCQmedium

A company uses Amazon SageMaker to train a model. The training job fails with 'InsufficientInstanceCapacity' error. What is the most likely cause?

A.The request rate is too high.

B.The dataset size exceeds the instance storage limit.

C.The requested instance type is not available in the specified region.

D.The training image is not compatible with the instance type.

AnswerC

This error occurs when AWS cannot provision the instance due to capacity constraints.

Why this answer

The 'InsufficientInstanceCapacity' error in Amazon SageMaker indicates that AWS does not currently have enough available capacity for the requested instance type in the specified region or Availability Zone. This is a common transient error when demand for a particular instance type exceeds supply, and it is not related to request rate, dataset size, or image compatibility.

Exam trap

AWS often tests the distinction between capacity errors and throttling errors, so the trap here is confusing 'InsufficientInstanceCapacity' with a rate-limiting or quota error, leading candidates to incorrectly select Option A.

How to eliminate wrong answers

Option A is wrong because 'InsufficientInstanceCapacity' is a capacity error, not a throttling error; throttling (e.g., from high request rate) would return a 'ThrottlingException' or 'RequestLimitExceeded' error. Option B is wrong because dataset size exceeding instance storage limits would cause an 'OutOfMemory' or 'DiskFull' error, not a capacity error. Option D is wrong because image compatibility issues would result in a 'ClientError' or 'ImageNotFoundException', not an instance capacity error.

396

MCQhard

A financial services company uses a machine learning model to approve loan applications. The model is a gradient boosting classifier trained on historical loan data. Recently, the company noticed that the model's approval rate for applicants from a certain demographic group is significantly lower than for other groups, even though the model's overall accuracy remains high. The data science team has been asked to address this potential bias while minimizing the impact on overall model performance. The team has access to the training data and the trained model. They have limited time and budget. Which course of action should the team take first?

A.Remove the sensitive attribute from the training data and retrain the model.

B.Collect more data from the under-represented demographic group and retrain the model.

C.Analyze the training data for bias and retrain the model using bias mitigation techniques such as reweighting.

D.Adjust the model's decision threshold for the affected group after deployment.

AnswerC

This directly addresses the root cause and is resource-efficient.

Why this answer

The most efficient first step is to analyze the training data for bias and then retrain the model with bias mitigation techniques like reweighting. Option A is wrong because removing the sensitive attribute may not help if the bias is in the labels or is carried by correlated proxy features. Option B is wrong because collecting more data is resource-intensive and may not address bias.

Option D is wrong because post-hoc adjustments can introduce other issues and may not be as effective as addressing bias during training.
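
One common form of reweighting can be sketched as follows. The explanation does not say which scheme the team would use; this example assumes the reweighing approach of Kamiran and Calders, where each sample gets the weight P(group) * P(label) / P(group, label). The group names and labels are invented toy data.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each sample by expected / observed frequency of its (group, label) pair."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    weights = []
    for g, y in zip(groups, labels):
        # Frequency the pair would have if group and label were independent.
        expected = (group_counts[g] / n) * (label_counts[y] / n)
        observed = joint_counts[(g, y)] / n
        weights.append(expected / observed)
    return weights


# Toy data: group "a" is approved 3 times out of 4, group "b" only once out of 4.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
print([round(w, 3) for w in weights])
```

Rare pairs, such as an approved applicant from group "b", receive larger weights, so the weighted approval rate becomes equal across groups. The weights can then be supplied as per-sample weights when retraining, provided the training library accepts them.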

Full explanation →

397

Multi-Selecthard

A company is training a deep learning model for image classification. Which THREE practices help reduce overfitting? (Choose three.)

Select 3 answers

A.L2 regularization

B.Increasing model depth

C.Increasing learning rate

D.Dropout

E.Data augmentation

AnswersA, D, E

L2 regularization penalizes large weights, reducing overfitting.

Why this answer

L2 regularization (also known as weight decay) adds a penalty proportional to the square of the weight magnitudes to the loss function. This discourages the model from learning overly complex patterns by forcing weights to stay small, which reduces overfitting by limiting the model's capacity to fit noise in the training data.
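As a rough illustration of the penalty described above, the sketch below adds an L2 term to a made-up data loss. The function name, the `lam` strength, and the weight values are assumptions chosen for the example, not part of any particular framework.

```python
def l2_penalized_loss(data_loss, weights, lam):
    """Return the data loss plus lam times the sum of squared weights."""
    return data_loss + lam * sum(w * w for w in weights)


# Same data loss, different weight magnitudes (illustrative numbers only).
small = l2_penalized_loss(1.0, [0.1, -0.2, 0.3], lam=0.01)
large = l2_penalized_loss(1.0, [3.0, -4.0, 5.0], lam=0.01)

print(small, large)  # the model with larger weights pays a larger penalty
```

Because the penalty grows with the square of each weight, the optimizer is pushed toward smaller weights unless larger ones clearly reduce the data loss.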

Exam trap

AWS often tests the misconception that increasing model complexity (depth) or tuning the learning rate can mitigate overfitting, when in fact these changes either exacerbate the problem or address unrelated training dynamics.

Full explanation →

398

MCQhard

An organization uses SageMaker JumpStart to deploy a foundation model for real-time inference. They observe high latency. What is the most effective way to reduce latency?

A.Compile the model with SageMaker Neo

B.Use a larger instance with more memory

C.Use batch transform instead

D.Enable SageMaker Inference Recommender

AnswerA

Neo compiles models for faster inference on specific hardware.

Why this answer

SageMaker Neo compiles the model to optimize it for the target hardware, reducing inference latency by applying hardware-specific optimizations such as kernel fusion, quantization, and memory layout tuning. This directly addresses the high latency issue for real-time inference without changing the instance type or inference mode.

Exam trap

AWS often tests the misconception that increasing instance size or switching to batch processing is the primary solution for latency, when in fact model compilation with SageMaker Neo is the most direct and cost-effective optimization for real-time inference.

How to eliminate wrong answers

Option B is wrong because using a larger instance with more memory may reduce latency due to increased compute capacity, but it is less effective and more costly than model compilation, which optimizes the model itself for the existing hardware. Option C is wrong because batch transform is designed for offline, asynchronous inference on large datasets, not for real-time inference, and it would not reduce latency for a real-time endpoint. Option D is wrong because SageMaker Inference Recommender helps select the optimal instance type and configuration for a given model, but it does not directly reduce latency; it recommends deployment parameters, whereas compilation actively optimizes the model.

Full explanation →

399

Multi-Selectmedium

Which TWO actions would improve the grounding of responses from a generative AI model using RAG? (Choose 2)

Select 2 answers

A.Fine-tune the model on unrelated data

B.Reduce the context window to save tokens

C.Increase the model's temperature parameter

D.Use RAG with a knowledge base of relevant documents

E.Include source citations in the prompt instructions

AnswersD, E

RAG provides retrieved context, reducing reliance on model's parametric knowledge.

Why this answer

Using RAG with relevant documents directly grounds responses in factual data. Including source citations in prompts encourages the model to base answers on retrieved information. Increasing temperature or reducing context would likely hurt grounding.

Fine-tuning on unrelated data does not help.

Full explanation →

400

Multi-Selecthard

Which THREE are best practices for building a secure and scalable generative AI application using Amazon Bedrock? (Choose 3)

Select 3 answers

A.Implement guardrails to filter harmful content

B.Deploy models on EC2 instances for better control

C.Store API keys in source code for easy access

D.Use AWS KMS to encrypt data and model artifacts

E.Use foundation models from multiple providers via Bedrock

AnswersA, D, E

Guardrails enforce content policies and prevent inappropriate outputs.

Why this answer

Using multiple foundation models through Bedrock's multi-model support allows flexibility and best-of-breed selection. Guardrails provide content safety. KMS encryption protects data and model artifacts at rest.

Storing keys in source code is insecure. EC2 deployment is not applicable to Bedrock's serverless model.

Full explanation →

401

Multi-Selecthard

Which THREE are benefits of using Amazon Bedrock over self-managing foundation models on EC2? (Choose THREE.)

Select 3 answers

A.Built-in integration with AWS services such as Amazon CloudWatch and AWS CloudTrail.

B.Lower data transfer costs between cloud regions.

C.Access to a curated set of foundation models from different providers.

D.Managed infrastructure for model hosting and scaling.

E.Greater control over model fine-tuning and customization.

AnswersA, C, D

Bedrock natively logs to CloudWatch and CloudTrail for monitoring and auditing.

Why this answer

Option A is correct because Amazon Bedrock provides built-in integration with AWS services like CloudWatch for monitoring model invocation metrics and CloudTrail for auditing API calls. This eliminates the need to manually set up logging and monitoring infrastructure when self-managing foundation models on EC2, where you would have to configure these integrations yourself.

Exam trap

The trap here is that candidates may confuse 'managed infrastructure' with 'greater control'—Bedrock simplifies operations but reduces customization flexibility, so option E is a common distractor for those who think managed services offer more control than self-managed solutions.

Full explanation →

402

MCQhard

A company fine-tunes a foundation model using SageMaker to create a domain-specific chatbot. After deployment on Bedrock, the model shows high confidence in incorrect answers. What is the most likely cause and its solution?

A.The model was not pre-trained on enough data; use a larger base model

B.The training data was imbalanced; collect more diverse data

C.The model is overfitting; apply regularization techniques during fine-tuning

D.The inference temperature is too low; increase it

AnswerC

Overfitting leads to overconfidence on training patterns. Regularization helps generalize better.

Why this answer

Overfitting during fine-tuning can cause the model to be overly confident even when wrong. Regularization (e.g., early stopping, dropout) reduces overconfidence.

Full explanation →

403

MCQmedium

A financial services company is deploying a foundation model to analyze customer sentiment from call transcripts. The model outputs must be consistent and deterministic for auditing purposes. Which parameter configuration should the company use?

A.Set temperature to 0.1 and top_p to 0.9.

B.Set temperature to 0.7 and top_p to 1.0.

C.Set temperature to 0.5 and top_p to 0.5.

D.Set temperature to 0 and top_p to 1.

AnswerD

Temperature 0 makes the model deterministic.

Why this answer

Setting temperature to 0 and top_p to 1 forces the model to always select the highest-probability token at each step, producing deterministic and repeatable outputs. This is essential for auditing and compliance in financial services, where consistency is required. Any nonzero temperature introduces randomness, which undermines determinism.

Exam trap

AWS often tests the misconception that low temperature (e.g., 0.1) is 'deterministic enough,' but only temperature exactly 0 guarantees deterministic outputs, and top_p must be 1 to avoid interfering with the argmax selection.

How to eliminate wrong answers

Option A is wrong because temperature 0.1 still introduces slight randomness, making outputs non-deterministic and unsuitable for auditing. Option B is wrong because temperature 0.7 introduces significant randomness, and top_p 1.0 does not constrain it, leading to high variability. Option C is wrong because temperature 0.5 introduces randomness, and top_p 0.5 further restricts token sampling but does not eliminate the stochastic behavior from the nonzero temperature.
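The difference between greedy selection and sampling can be sketched with a toy next-token picker. This is a simplified model of decoding, not how any specific Bedrock model is implemented; the `next_token` helper and the logit values are hypothetical.

```python
import math
import random


def next_token(logits, temperature, rng):
    """Pick a token index: argmax at temperature 0, otherwise sample from softmax(logits / T)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [value / temperature for value in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens
rng = random.Random()

greedy = {next_token(logits, 0, rng) for _ in range(100)}
sampled = {next_token(logits, 0.7, rng) for _ in range(100)}

print(greedy)   # always the single highest-scoring index
print(sampled)  # usually more than one index, varying from run to run
```

In this toy setting, temperature 0 returns the same index on every call, while any nonzero temperature leaves room for a different token to be drawn.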

Full explanation →

404

MCQmedium

A team has built a regression model to predict house prices. The RMSE is 50,000 on the test set. Which action is most appropriate to improve model performance?

A.Remove outliers from training data

B.Apply feature scaling

C.Add more relevant features

D.Use a different evaluation metric

AnswerC

Adding informative features can reduce bias and improve model accuracy.

Why this answer

Option C is correct because adding relevant features can capture more patterns and improve predictive accuracy. Using a different metric (D) does not improve the model. Removing outliers (A) may help if outliers exist, but adding features is generally a more systematic improvement.

Feature scaling (B) helps some algorithms but may not be the primary issue.
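For context on what an RMSE of 50,000 means, here is a minimal sketch of the metric on invented house prices. The `rmse` helper and the numbers are illustrative assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the same units as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


# Hypothetical prices: every prediction is off by exactly 50,000.
actual = [300_000, 450_000, 250_000]
predicted = [350_000, 400_000, 300_000]

error = rmse(actual, predicted)
print(error)
```

Since RMSE is in the target's units, the value reads as a typical prediction error in currency terms. Switching metrics changes how the error is reported, not how well the model predicts.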

Full explanation →

405

MCQeasy

A developer needs to preprocess a dataset consisting of customer reviews for sentiment analysis. Which text preprocessing technique is most likely to improve model accuracy?

A.Stemming

B.All of the above

C.Removing stop words

D.Lowercasing

AnswerB

Combining lowercasing, stop word removal, and stemming is a common and effective preprocessing pipeline.

Why this answer

Option B is correct because all three listed techniques—stemming, removing stop words, and lowercasing—are standard text preprocessing steps that collectively improve model accuracy for sentiment analysis. Stemming reduces words to root forms to consolidate similar meanings, removing stop words eliminates noise from high-frequency but low-information tokens, and lowercasing normalizes case variations. Together, they reduce the feature space and help the model focus on sentiment-bearing terms, leading to better generalization and accuracy.

Exam trap

AWS often tests the misconception that a single preprocessing step is sufficient, when in fact the combination of all three (stemming, stop word removal, and lowercasing) is standard practice for maximizing model accuracy in NLP tasks like sentiment analysis.

How to eliminate wrong answers

Option A is wrong because stemming alone is insufficient; while it helps consolidate word variants, it does not address noise from stop words or case sensitivity, so it is not the single most likely technique to improve accuracy. Option C is wrong because removing stop words alone reduces noise but ignores the benefits of stemming and lowercasing, which are also critical for handling morphological variations and case mismatches. Option D is wrong because lowercasing alone normalizes case but does not handle word root consolidation or removal of irrelevant high-frequency words, leaving significant noise in the feature set.
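The three steps can be chained in a few lines. The sketch below is deliberately crude: the stop word list is a tiny illustrative set and `crude_stem` is a hypothetical suffix stripper, not a real stemming algorithm such as Porter's.

```python
STOP_WORDS = {"the", "is", "a", "and", "was", "it"}  # tiny illustrative list


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, strip basic punctuation, drop stop words, then stem."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


tokens = preprocess("The delivery was delayed and it keeps failing!")
print(tokens)
```

A production pipeline would more likely rely on an NLP library for tokenization and stemming, but the order of operations would look similar.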

Full explanation →

406

Multi-Selectmedium

A company is using Amazon SageMaker to train machine learning models. The security team wants to ensure that the training data is encrypted at rest and that the SageMaker notebook instances cannot access the internet. Which TWO actions should the company take? (Choose TWO.)

Select 2 answers

A.Enable S3 server-side encryption with AWS KMS (SSE-KMS) for the training data bucket

B.Create an AWS CloudTrail trail to log all S3 data events

C.Enable encryption at rest for the SageMaker endpoint using the AWS Management Console

D.Disable internet access for the SageMaker notebook instance by placing it in a VPC without a NAT gateway or internet gateway

E.Use AWS Security Token Service (STS) to generate temporary credentials for the notebook instance

AnswersA, D

SSE-KMS encrypts objects at rest using KMS keys.

Why this answer

Option A is correct because enabling S3 server-side encryption with AWS KMS (SSE-KMS) ensures that the training data stored in the S3 bucket is encrypted at rest. This satisfies the security team's requirement for data encryption at rest, as SSE-KMS provides envelope encryption with a customer-managed or AWS-managed KMS key, giving the company control over the encryption keys and auditability via AWS CloudTrail.

Exam trap

The trap here is that candidates often confuse encryption at rest for the endpoint (Option C) with encryption of the training data in S3, or they mistakenly think that CloudTrail logging (Option B) or STS credentials (Option E) provide encryption, when in fact they address auditing and access control, not data encryption.

Full explanation →

407

MCQhard

Refer to the exhibit. A team is creating an IAM policy for a SageMaker notebook user. The user needs to access training data in an S3 bucket and create models. Which responsible AI concern is most relevant to this policy?

A.The policy does not enforce encryption for the notebook.

B.The policy does not restrict which S3 buckets the user can read.

C.The policy does not include a condition for model explainability.

D.The policy grants overly broad permissions, violating the principle of least privilege.

AnswerD

Allowing CreateModel and CreateNotebookInstance on all resources can lead to misuse.

Why this answer

Option D is correct. The policy grants broad access (sagemaker:CreateModel and sagemaker:CreateNotebookInstance on all resources) without restrictions. This could allow a user to create models using any data or expose the notebook.

The least privilege principle is violated, leading to potential unintended model creation or data exposure. Options A and B are less directly related; C is about explainability.

Full explanation →

408

MCQmedium

A company uses an AI system to screen job applications. The system was trained on resumes from previous hires, which predominantly came from a specific demographic. As a result, the system may unfairly filter out qualified candidates from other backgrounds. Which responsible AI practice should the company implement?

A.Implement bias detection metrics and monitor outcomes by demographic groups

B.Focus solely on improving the model's precision and recall

C.Defer all screening decisions to a human recruiter

D.Increase the size of the training dataset without regard to demographic composition

AnswerA

Bias detection and monitoring help identify and correct unfair outcomes.

Why this answer

To mitigate bias, the company should measure and monitor the system's impact across demographic groups. This aligns with fairness metrics. Using more data without addressing bias may not help.

Relying on human review is good but does not guarantee systematic fairness. Focusing only on performance ignores fairness.
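One simple way to monitor outcomes by group is to compare selection rates. The sketch below computes per-group rates and their ratio (often called a disparate impact ratio). The group labels, counts, and helper names are made up for illustration, and it assumes at least one group has a nonzero selection rate.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the fraction of selected candidates for each group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


def disparate_impact(rates):
    """Ratio of the lowest selection rate to the highest (1.0 means parity)."""
    return min(rates.values()) / max(rates.values())


# Hypothetical screening outcomes: (group, passed_screening)
records = (
    [("group_a", True)] * 8 + [("group_a", False)] * 2
    + [("group_b", True)] * 4 + [("group_b", False)] * 6
)

rates = selection_rates(records)
ratio = disparate_impact(rates)
print(rates, ratio)
```

A ratio well below 1.0 flags a gap worth investigating; it does not by itself explain the cause.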

Full explanation →

409

MCQeasy

A company wants to predict customer churn. They have historical data with features like usage minutes, support tickets, contract length. The target is binary: churn/not churn. Which ML algorithm is best suited?

A.Logistic regression

B.Principal Component Analysis (PCA)

C.Linear regression

D.K-means clustering

AnswerA

Logistic regression models the probability of a binary outcome using a logistic function.

Why this answer

Logistic regression is the best choice because it is specifically designed for binary classification tasks like predicting churn (churn/not churn). It models the probability of the target class using a logistic (sigmoid) function, making it interpretable and efficient for this type of supervised learning problem with a categorical outcome.

Exam trap

AWS often tests the distinction between supervised and unsupervised learning, and the trap here is that candidates may confuse dimensionality reduction (PCA) or clustering (K-means) with classification, or mistakenly apply linear regression to a binary outcome without recognizing the need for a logistic function.

How to eliminate wrong answers

Option B is wrong because Principal Component Analysis (PCA) is an unsupervised dimensionality reduction technique, not a classification algorithm; it reduces feature space but does not predict a binary target. Option C is wrong because linear regression predicts a continuous numeric output, not a binary class; using it for classification would violate the assumption of normally distributed errors and produce unbounded predictions. Option D is wrong because K-means clustering is an unsupervised learning algorithm used for grouping unlabeled data into clusters, not for predicting a known binary target variable.
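To make the contrast with linear regression concrete, the sketch below passes a linear score through the logistic (sigmoid) function so the output stays between 0 and 1. The feature values, weights, and bias are invented for illustration and were not learned from data.

```python
import math


def sigmoid(z):
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(features, weights, bias):
    """Linear score followed by the logistic function."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


# Hypothetical customer: [usage_minutes (hundreds), support_tickets, contract_months]
features = [1.2, 4.0, 6.0]
weights = [-0.5, 0.6, -0.1]  # assumed values, not fitted
bias = -0.3

probability = churn_probability(features, weights, bias)
label = "churn" if probability >= 0.5 else "not churn"
print(probability, label)
```

The raw linear score is unbounded, which is the problem with using linear regression directly; the sigmoid turns it into a probability that can be thresholded into a class.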

Full explanation →

410

MCQeasy

A financial services company uses Amazon Rekognition to verify customer identities. To ensure responsible AI practices, which measure should the company prioritize?

A.Use only black-box models to protect intellectual property

B.Increase model complexity to improve accuracy

C.Minimize the amount of training data collected

D.Regularly audit the model for demographic bias

AnswerD

Bias audits are essential for fairness.

Why this answer

Option D is correct because regularly auditing the model for demographic bias is a core responsible AI practice, especially for identity verification systems where biased outcomes could lead to unfair treatment of certain customer groups. Amazon Rekognition's facial analysis and comparison features must be tested across diverse demographics to ensure equitable performance, as bias can arise from imbalanced training data or algorithmic artifacts.

Exam trap

The trap here is that candidates may confuse 'responsible AI' with generic model optimization (like increasing accuracy or reducing data), but the exam specifically tests the principle of fairness through bias auditing and transparency.

How to eliminate wrong answers

Option A is wrong because using only black-box models contradicts responsible AI principles; explainability and transparency are critical for auditing bias and ensuring fairness, and black-box models obscure how decisions are made, making it harder to detect issues. Option B is wrong because increasing model complexity does not inherently improve accuracy and can amplify bias or reduce interpretability; responsible AI prioritizes balanced performance and fairness over raw accuracy. Option C is wrong because minimizing training data can exacerbate bias by underrepresenting certain demographic groups, leading to poor generalization and unfair outcomes; responsible AI requires diverse, representative datasets.

Full explanation →

411

MCQhard

Refer to the exhibit. A developer sees this error when calling Amazon Bedrock for inference. What is the MOST likely cause and recommended solution?

A.The model ID is incorrect; use a different model

B.The prompt is too long; reduce the number of tokens in the prompt

C.The request rate exceeds the model's throughput limit; implement retries with exponential backoff

D.Increase the max_tokens_to_sample value

AnswerC

Throttling is due to rate limits; exponential backoff handles it.

Why this answer

Option C is correct. The error indicates throttling (rate exceeded). Retries with exponential backoff handle transient throttling.

Option A (change model) is not needed. Option B (shorten the prompt) is unrelated. Option D (increase max_tokens) could exacerbate the issue.
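Retrying with exponential backoff can be sketched without any AWS dependency. In the example below, `ThrottlingError` and `flaky_call` are hypothetical stand-ins for a throttled service call; the AWS SDKs also ship configurable retry behavior that would normally be preferred over a hand-rolled loop.

```python
import random
import time


class ThrottlingError(Exception):
    """Stand-in for a throttling error returned by a service."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call fn, retrying on ThrottlingError with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))  # jitter spreads out retries


calls = {"count": 0}


def flaky_call():
    """Fail twice with throttling, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ThrottlingError("rate exceeded")
    return "ok"


result = call_with_backoff(flaky_call)
print(result, calls["count"])
```

Doubling the delay after each failure reduces pressure on the throttled service, and the random jitter keeps many clients from retrying at the same instant.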

Full explanation →

412

Multi-Selecthard

Which TWO of the following are key components of a responsible AI governance framework?

Select 2 answers

A.Develop and enforce AI ethics policies and standards

B.Focus solely on compliance with legal regulations

C.Minimize human involvement in AI lifecycle decisions

D.Conduct regular bias and fairness impact assessments

E.Deploy AI models as black boxes to avoid scrutiny

AnswersA, D

Policies provide the foundation for governance.

Why this answer

Option A is correct because a responsible AI governance framework must include the development and enforcement of AI ethics policies and standards to ensure alignment with societal values, fairness, and accountability. These policies guide the design, deployment, and monitoring of AI systems, embedding ethical principles such as transparency, privacy, and non-discrimination into the AI lifecycle. Without such policies, organizations risk deploying AI that violates ethical norms or regulatory expectations.

Exam trap

AWS often tests the distinction between mere legal compliance and comprehensive ethical governance, trapping candidates who think that meeting regulatory requirements alone constitutes responsible AI, while ignoring proactive fairness and transparency measures.

Full explanation →

413

MCQeasy

A developer is using Amazon Bedrock's Claude model to summarize long documents. The developer notices that the summaries sometimes miss key points. Which parameter adjustment is most likely to improve summary completeness?

A.Increase the max_tokens parameter.

B.Increase the top_k parameter.

C.Increase the temperature parameter.

D.Increase the top_p parameter.

AnswerA

More tokens allow the model to include more details in the summary.

Why this answer

Increasing max_tokens allows the model to generate longer outputs, which is essential when summarizing long documents because the summary may need more tokens to capture all key points. If max_tokens is too low, the model truncates the response, potentially omitting important details. This directly addresses the issue of missing key points by providing sufficient output length for a complete summary.

Exam trap

AWS often tests the misconception that parameters controlling randomness (temperature, top_k, top_p) affect output length or completeness, when in fact they only influence token selection diversity and creativity.

How to eliminate wrong answers

Option B is wrong because increasing top_k controls the number of highest-probability tokens considered during sampling, which affects randomness and diversity, not the length or completeness of the output. Option C is wrong because increasing temperature increases randomness in token selection, which can lead to more creative but less focused summaries, potentially worsening completeness. Option D is wrong because increasing top_p (nucleus sampling) also controls randomness by selecting tokens with cumulative probability, and does not extend the output length or guarantee inclusion of key points.

Full explanation →

414

MCQeasy

A company is building a chatbot to answer customer queries using Amazon Lex. The development team has created a large dataset of customer interactions and intends to use Amazon SageMaker to train a custom machine learning model for natural language understanding (NLU). The team wants to integrate the trained model with Amazon Lex to handle intents and slots. The team has limited experience with SageMaker and wants to minimize operational overhead. Which solution should the team use?

A.Encapsulate the custom model in a Docker container, push it to Amazon ECR, and create a custom machine learning resource in Amazon Lex to invoke the container directly.

B.Train a custom model in SageMaker using a built-in algorithm like BlazingText, then deploy it to a SageMaker endpoint and integrate with Lex via an AWS Lambda function that calls the endpoint.

C.Use Amazon Comprehend to perform sentiment analysis and entity recognition, then map the results to Lex intents using Lambda.

D.Use SageMaker Autopilot to automatically build and train the best model, then deploy to a SageMaker endpoint and use Lambda to invoke it for Lex integration.

AnswerD

SageMaker Autopilot automates the machine learning process, minimizing manual effort, and the trained model can be deployed to an endpoint and integrated with Lex via Lambda.

Why this answer

Option D is correct because SageMaker Autopilot automates model building, tuning, and deployment, reducing the need for manual intervention and expertise. Option B requires manual algorithm selection and tuning. Option C uses Amazon Comprehend, which provides general-purpose NLP but does not allow for custom NLU model training.

Option A is not supported because Amazon Lex does not directly invoke custom Docker containers; integration is typically done via Lambda.

Full explanation →

415

MCQhard

A research team is using Amazon Bedrock to analyze scientific papers. They want the model to generate answers based only on papers published after 2023. Which approach should they use?

A.Fine-tune the model on a dataset of post-2023 papers and deploy it.

B.Set the maxTokens to a low value to force the model to rely on recent context.

C.Include a system prompt instructing the model to ignore data before 2023.

D.Use Amazon Bedrock Knowledge Bases with a metadata filter to retrieve only papers published after 2023, and generate responses based on retrieved content.

AnswerD

Metadata filtering ensures only relevant recent documents are used, grounding the model in current data.

Why this answer

Option D is correct because Amazon Bedrock Knowledge Bases with a metadata filter allows you to restrict retrieval to only documents that match specific metadata criteria, such as publication year. By filtering the vector search to only include papers published after 2023, the model generates responses based solely on that retrieved content, ensuring it does not rely on pre-2023 data. This approach is the only one that guarantees the model's answers are grounded exclusively in the specified time range.

Exam trap

AWS often tests the misconception that a system prompt or fine-tuning can reliably restrict a model's knowledge to a specific time period, when in fact only a retrieval-based approach with metadata filtering can enforce such temporal constraints.

How to eliminate wrong answers

Option A is wrong because fine-tuning the model on a dataset of post-2023 papers does not prevent the model from using its pre-existing training data (which includes pre-2023 knowledge) during inference; fine-tuning adjusts weights but does not erase prior knowledge, so the model could still generate answers based on older information. Option B is wrong because setting maxTokens to a low value limits the length of the generated response but does not control the temporal scope of the model's knowledge; the model can still draw on pre-2023 training data regardless of token count. Option C is wrong because a system prompt instructing the model to ignore data before 2023 is merely a suggestion and not a technical enforcement; the model has no inherent mechanism to filter its own training data by date, so it may still generate answers based on pre-2023 information, especially if the prompt is not strictly followed.
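The idea of filtering on metadata before retrieval can be shown with a toy in-memory example. This is not the Bedrock Knowledge Bases API; the `retrieve` function, the keyword scoring, and the sample documents are hypothetical and only illustrate why a filter enforces the date constraint while a prompt instruction cannot.

```python
def retrieve(documents, query_terms, min_year):
    """Keep only documents newer than min_year, then rank them by simple keyword overlap."""
    eligible = [d for d in documents if d["year"] > min_year]
    scored = [
        (sum(term in d["text"].lower() for term in query_terms), d)
        for d in eligible
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0]


# Hypothetical corpus with a "year" metadata field.
documents = [
    {"title": "Paper A", "year": 2022, "text": "Protein folding with transformers"},
    {"title": "Paper B", "year": 2024, "text": "Transformers for protein design"},
    {"title": "Paper C", "year": 2025, "text": "Graph methods in chemistry"},
]

results = retrieve(documents, ["protein"], min_year=2023)
titles = [d["title"] for d in results]
print(titles)
```

Paper A matches the query but is excluded before ranking, so it can never reach the model's context, whatever the prompt says.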

Full explanation →

416

Multi-Selecthard

A company is deploying a customer service chatbot using a large language model (LLM) via Amazon Bedrock. The application must meet high accuracy for domain-specific queries, low latency, and be cost-effective. Which TWO strategies should the company adopt to achieve these goals? (Choose two.)

Select 2 answers

A.Store user prompts in a shared cache to reuse common queries.

B.Fine-tune the model on a large corpus of customer service transcripts to improve domain knowledge.

C.Use a Retrieval-Augmented Generation (RAG) architecture with a vector database for domain context.

D.Select a smaller, faster model that trades some accuracy for throughput.

E.Increase the model's maximum token limit to handle longer customer queries.

AnswersA, C

Caching frequent queries reduces latency and cost by avoiding repeated model invocations.

Why this answer

Retrieval-Augmented Generation (RAG) provides domain-specific context without full fine-tuning, reducing cost and latency. Caching responses for common queries reduces latency and cost by avoiding repeated model invocations. Option B is not necessarily cost-effective; fine-tuning is expensive and may be overkill.

Option D is wrong because model choice should be driven by capability, not just throughput, and trading away accuracy conflicts with the high-accuracy requirement. Option E does not improve accuracy, and longer inputs and outputs add cost and latency.
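The caching strategy can be sketched as a thin wrapper around the model call. `CachedModel` and `fake_model` are hypothetical names; a real deployment would more likely use a shared store with expiry and would need to consider whether cached answers may be shared across users.

```python
class CachedModel:
    """Wrap a model-invocation function with a simple in-memory response cache."""

    def __init__(self, invoke):
        self._invoke = invoke
        self._cache = {}
        self.model_calls = 0

    def ask(self, prompt):
        # Normalize case and whitespace so trivially different prompts share an entry.
        key = " ".join(prompt.lower().split())
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._invoke(prompt)
        return self._cache[key]


def fake_model(prompt):
    """Stand-in for a real (slow, billed) model invocation."""
    return f"answer to: {prompt.strip().lower()}"


bot = CachedModel(fake_model)
first = bot.ask("What is your refund policy?")
second = bot.ask("what is your refund   policy?")
print(first == second, bot.model_calls)
```

The second question is served from the cache, so only one model invocation is paid for.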

Full explanation →

417

Multi-Selecthard

An organization is evaluating different foundation models (FMs) on Amazon Bedrock for a legal document analysis task. Which THREE factors should they consider when selecting a model? (Choose 3.)

Select 3 answers

A.The region where the model is hosted

B.Model size (number of parameters)

C.Cost per inference call

D.Support for the specific language of the documents

E.Token limits for input and output

AnswersB, D, E

Larger models often have better understanding but higher cost.

Why this answer

Options B, D, and E are correct. Model size affects capability and cost, language support is critical for legal documents, and token limits determine the length of documents that can be processed. Option A (region) is not a model capability factor.

Option C (cost per inference) is operational but not a primary technical selection factor for this task.

Full explanation →

418

MCQeasy

Which AWS service provides a serverless API for accessing foundation models with per-token pricing?

A.Amazon Bedrock

B.Amazon API Gateway

C.AWS Lambda

D.Amazon SageMaker

AnswerA

Bedrock provides a serverless API with per-token pricing.

Why this answer

Amazon Bedrock is the managed service that offers a serverless API for foundation models, charging per token.

Full explanation →

419

MCQmedium

An IAM policy allows creation of SageMaker training jobs only if they use a specific VPC security group. A user tries to create a training job without specifying that security group. What will happen?

A.The request will succeed but SageMaker will ignore the condition

B.The request will succeed because the condition is optional

C.The request will be denied because the training job resource ARN is invalid

D.The request will be denied with an AccessDenied error

AnswerD

The IAM condition is not satisfied, so the request is denied.

Why this answer

Option D is correct because IAM policies are evaluated before any AWS API action is executed. If the policy's Allow statement includes a condition that requires a specific VPC security group for SageMaker training jobs, and the user's request does not include that security group, the condition is not met and the statement does not apply. With no other statement allowing the action, the request falls under IAM's implicit default deny and fails with an AccessDenied error, regardless of whether the parameter is optional in the API.

Exam trap

The trap here is that candidates assume an optional API parameter means the IAM condition is also optional, but IAM conditions are strictly enforced regardless of whether the parameter is required by the API.

How to eliminate wrong answers

Option A is wrong because IAM policies do not ignore conditions; if a condition is not met, the request is denied, not silently ignored. Option B is wrong because the condition is not optional from an IAM perspective; even if the API parameter is optional, the IAM policy condition must be satisfied for the request to be allowed. Option C is wrong because the training job resource ARN is not invalid; the request is denied due to the policy condition, not due to an ARN format issue.

Full explanation →

420

MCQeasy

A data science team is fine-tuning a Llama 2 7B model on Amazon SageMaker for a text classification task. After the first training run, they notice the loss is not decreasing and the model is overfitting to the small training set. What should the team change to mitigate overfitting?

A.Add dropout layers and reduce the learning rate.

B.Increase the number of epochs to allow the model to learn more patterns.

C.Increase the batch size and use gradient accumulation.

D.Remove dropout layers from the model architecture.

AnswerA

Dropout randomly drops neurons to prevent co-adaptation, and a lower learning rate helps stabilize training, both reducing overfitting.

Why this answer

Option A is correct because adding dropout and reducing the learning rate are standard ways to curb overfitting. Option B is wrong because increasing epochs typically worsens overfitting. Option C is wrong because increasing the batch size with gradient accumulation is not a regularization technique and does not by itself address overfitting.

Option D is wrong because removing dropout reduces regularization, worsening overfitting.

Full explanation →

421

Multi-Selecteasy

Which TWO AWS services can be used to monitor and detect security anomalies in Amazon SageMaker model inference data? (Choose TWO.)

Select 2 answers

A.Amazon Macie

B.AWS CloudTrail

C.Amazon CodeGuru Security

D.Amazon SageMaker Model Monitor

E.Amazon CloudWatch Logs

AnswersD, E

Model Monitor detects data drift and anomalies in inference data.

Why this answer

Amazon SageMaker Model Monitor is specifically designed to detect deviations in model quality, such as data drift and feature attribution drift, by continuously monitoring inference data against a baseline. Amazon CloudWatch Logs can be used to capture and analyze inference request logs, enabling custom anomaly detection through log-based metrics and alarms. Together, they provide a comprehensive approach to monitoring security anomalies in SageMaker model inference data.

Exam trap

The trap here is that candidates often confuse AWS CloudTrail (API auditing) with CloudWatch Logs (log monitoring), or assume Macie can monitor any data flow, when it is restricted to S3 object-level sensitive data discovery.

Full explanation →

422

MCQeasy

Refer to the exhibit. A developer is reviewing CloudWatch Logs for a deployed model and notices the same input appears multiple times with slightly different probabilities. What responsible AI concern does this pattern suggest?

A.The model is overfitting to the training data.

B.The model is not robust; it produces inconsistent predictions for the same input.

C.The model is exhibiting bias against a demographic group.

D.The input data is drifting from the training distribution.

AnswerB

Identical inputs should yield identical outputs; variation indicates instability.

Why this answer

Option B is correct. Repeated identical inputs with different predictions indicate model instability (lack of robustness). This could be due to randomness in the model or adversarial conditions.

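Option A is wrong because overfitting concerns generalization to new data, not varying outputs for the same input. Option C is possible but is not what the pattern shows, because no comparison across demographic groups is involved. Option D is wrong because data drift describes a change in the input distribution, whereas here the input is identical.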

Full explanation →

423

MCQhard

A healthcare company uses Amazon SageMaker to train a model that predicts patient readmission risk based on electronic health records (EHRs) stored in Amazon HealthLake. The training dataset contains 2 million records from the past three years, with a significant gender imbalance: 70% male and 30% female. The model achieved high overall accuracy, but further analysis using SageMaker Clarify revealed that the precision for female patients is 0.65 while for male patients it is 0.88. Additionally, the model's false positive rate for female patients is significantly higher. The company must comply with healthcare regulations that require fairness and non-discrimination. The data science team has already used SageMaker Data Wrangler for initial preprocessing and SageMaker Clarify for bias detection. They need to take immediate action to mitigate the bias before deploying to production. Which course of action should the team take?

A.Use SageMaker Clarify's bias mitigation feature to apply reweighing techniques and retrain the model with adjusted sample weights.

B.Use SageMaker Clarify to generate SHAP values and adjust the model's feature importance by removing biased features.

C.Use SMOTE (Synthetic Minority Oversampling Technique) to balance the training dataset before retraining.

D.Use SageMaker Model Monitor to detect feature drift and automatically retrain the model with updated data.

AnswerA

This directly mitigates bias by reweighting training samples to reduce disparity.

Why this answer

The correct answer is A: use SageMaker Clarify's bias mitigation technique (reweighing), because it directly addresses the disparity by adjusting sample weights during training. Option B is wrong because SHAP values explain predictions but do not change model behavior.

Option C is wrong because SMOTE addresses class imbalance but not the disparity in precision between groups; it may even worsen bias. Option D is wrong because Model Monitor detects drift and does not mitigate bias.
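As a rough illustration of what reweighing does, the sketch below computes one weight per training sample as P(group) × P(label) / P(group, label), the usual reweighing formulation. The toy records and the function name `reweighing_weights` are assumptions made for illustration; this is not the SageMaker Clarify API.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return one weight per sample: P(group) * P(label) / P(group, label)."""
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] / n) * (label_counts[y] / n) / (pair_counts[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data (hypothetical): 7 male and 3 female records, label 1 = readmitted.
groups = ["M"] * 7 + ["F"] * 3
labels = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0]
weights = reweighing_weights(groups, labels)

# Under-represented (group, label) pairs receive weights above 1.
for g, y, w in sorted(set(zip(groups, labels, weights))):
    print(g, y, round(w, 3))
```

The weights can then be passed to any training routine that accepts per-sample weights, so that group and label look statistically independent in the weighted data.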

Full explanation →

424

MCQmedium

A company uses Amazon Bedrock to generate marketing copy. They want to measure the quality of generated text compared to reference text. Which metric is most appropriate?

A.F1 score

B.BLEU

C.RMSE

D.Accuracy

AnswerB

BLEU calculates n-gram overlap between candidate and reference text, suitable for generation evaluation.

Why this answer

BLEU (Bilingual Evaluation Understudy) is the most appropriate metric for evaluating the quality of generated text against reference text in tasks like machine translation and text generation. It measures n-gram precision between the generated and reference texts, making it ideal for assessing marketing copy generated by Amazon Bedrock.
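The n-gram precision at the heart of BLEU can be sketched in a few lines. This is a simplified illustration, not full BLEU: it computes clipped n-gram precision for a single reference and omits the brevity penalty and the geometric mean over n-gram orders. The sentences and function names are made up for the example.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference,
    counting each n-gram at most as often as the reference contains it."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total

reference = "save big on summer shoes this week".split()
candidate = "save big on shoes this week".split()

p1 = clipped_precision(candidate, reference, 1)  # unigram precision
p2 = clipped_precision(candidate, reference, 2)  # bigram precision
print(p1, p2)
```

Here every candidate word appears in the reference, but one of the five candidate bigrams ("on shoes") does not, so the bigram precision is lower than the unigram precision.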

Exam trap

AWS often tests the distinction between classification/regression metrics and text generation metrics, leading candidates to mistakenly apply F1 score or accuracy to evaluate generated text quality instead of using BLEU or similar sequence-based metrics.

How to eliminate wrong answers

Option A is wrong because F1 score is a classification metric that measures harmonic mean of precision and recall, not suitable for evaluating text generation quality against reference text. Option C is wrong because RMSE (Root Mean Square Error) is a regression metric used for continuous numerical predictions, not for text or sequence evaluation. Option D is wrong because Accuracy is a classification metric that measures the proportion of correct predictions, which does not account for the sequential and linguistic nuances of generated text.

Full explanation →

425

MCQmedium

A company is developing a chatbot using Amazon Bedrock and wants to ensure the model's responses do not include toxic or biased language. The company has a labeled dataset of undesirable responses. Which approach should be used to fine-tune the foundation model to reduce harmful outputs?

A.Use reinforcement learning from human feedback (RLHF) with a reward model trained on human preferences.

B.Perform supervised fine-tuning on a curated dataset of safe responses.

C.Use prompt engineering to instruct the model to avoid toxic language.

D.Implement adversarial validation by testing against toxic inputs.

AnswerA

RLHF uses human feedback to train a reward model, which then guides the base model to generate safer outputs.

Why this answer

Reinforcement learning from human feedback (RLHF) is the correct approach because it directly optimizes the model to avoid toxic or biased outputs by training a reward model on human-labeled preferences. The reward model scores the model's responses, and the foundation model is fine-tuned via reinforcement learning to maximize these scores, effectively reducing harmful language. This method is specifically designed to align model behavior with nuanced human values, such as avoiding toxicity, which supervised fine-tuning alone cannot guarantee.

Exam trap

AWS often tests the misconception that supervised fine-tuning or prompt engineering alone can reliably eliminate harmful outputs, when in fact RLHF is required to align the model with nuanced human preferences through iterative feedback.

How to eliminate wrong answers

Option B is wrong because supervised fine-tuning on a curated dataset of safe responses teaches the model to mimic safe patterns but does not explicitly penalize toxic outputs during generation; it lacks a reward signal to discourage harmful language when the model deviates from the training distribution. Option C is wrong because prompt engineering is a static, instruction-based technique that can be easily bypassed by adversarial inputs or subtle variations in phrasing; it does not modify the model's internal weights to reliably avoid toxic language. Option D is wrong because adversarial validation only tests the model's robustness to toxic inputs without fine-tuning the model itself; it identifies vulnerabilities but does not reduce harmful outputs in production.

Full explanation →

426

Multi-Selecthard

A company wants to evaluate the performance of a generative AI model before deployment. Which TWO metrics are most relevant for measuring model quality? (Select two.)

Select 2 answers

A.BLEU score

B.Response time

C.Perplexity

D.Model size

E.CPU utilization

AnswersA, C

BLEU evaluates the quality of generated text by comparing n-grams with reference translations.

Why this answer

Options A (BLEU score) and C (Perplexity) are standard for evaluating text generation quality. BLEU measures similarity to reference text, and perplexity measures how well the model predicts a sample. Option B (Response time) measures latency, not quality.

Option D (Model size) is a design parameter. Option E (CPU utilization) is an operational metric.
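To make perplexity concrete, the sketch below computes it as the exponential of the average negative log-probability the model assigns to each token. The probability values are invented for illustration; in practice they would come from the model being evaluated.

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-probability per token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical per-token probabilities for the same text from two models.
confident = perplexity([0.5, 0.5, 0.5, 0.5])
uncertain = perplexity([0.1, 0.1, 0.1, 0.1])
print(confident, uncertain)
```

Lower perplexity means the model assigned higher probability to the observed tokens, so the first model predicts this sample better than the second.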

Full explanation →

427

MCQhard

A bank uses an AI system to detect fraudulent transactions. The model has high precision but low recall for small transactions, potentially missing fraud. Which approach aligns with responsible AI?

A.Send all flagged transactions to customers for confirmation

B.Focus only on precision to minimize false positives

C.Tune the model to achieve an acceptable balance between recall and precision

D.Increase the detection threshold to reduce false positives

AnswerC

Balancing metrics is a responsible approach.

Why this answer

Option C is correct because responsible AI requires balancing competing objectives like precision and recall to align with ethical principles and business needs. In fraud detection, high precision with low recall means many fraudulent transactions are missed, which can lead to significant financial losses and erode customer trust. Tuning the model to achieve an acceptable trade-off ensures that the system is both effective and fair, minimizing harm while maintaining operational viability.

Exam trap

AWS often tests the misconception that increasing the detection threshold improves model performance overall, when in fact it only reduces false positives at the cost of lowering recall, which can be detrimental in high-stakes applications like fraud detection.

How to eliminate wrong answers

Option A is wrong because sending all flagged transactions to customers for confirmation shifts the burden to users, degrades user experience, and may not be scalable or timely for real-time fraud detection, nor does it address the underlying model imbalance. Option B is wrong because focusing only on precision ignores the critical need to catch actual fraud (recall), which can result in substantial financial losses and violates the responsible AI principle of beneficence. Option D is wrong because increasing the detection threshold reduces false positives but further lowers recall, worsening the problem of missed fraud and contradicting the goal of responsible AI.
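The trade-off between precision and recall can be seen by sweeping the decision threshold over a set of model scores. The scores and labels below are a toy example invented for illustration, and `precision_recall` is a hypothetical helper, not part of any AWS service.

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) when scores >= threshold are flagged as fraud."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy fraud scores (hypothetical); label 1 = fraudulent transaction.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

strict = precision_recall(scores, labels, 0.75)    # high threshold
balanced = precision_recall(scores, labels, 0.35)  # lower threshold
print(strict, balanced)
```

With the high threshold every flagged transaction is fraud, but half of the fraud is missed; lowering the threshold catches all of it at the cost of one false positive.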

Full explanation →

428

MCQeasy

A company wants to use a foundation model to automatically summarize lengthy documents. Which capability of foundation models is being utilized?

A.Text generation

B.Sentiment analysis

C.Text classification

D.Machine translation

AnswerA

Summarization is a form of text generation where the model produces concise output.

Why this answer

Summarization is a text generation task where the model produces a concise version of the original content. Foundation models (e.g., GPT, Claude) are pre-trained on vast corpora and can generate coherent summaries by predicting the next tokens conditioned on the input document. This directly utilizes the text generation capability, not classification or translation.

Exam trap

AWS often tests the distinction between text generation and text classification. The trap here is that candidates may confuse summarization (a generative task) with classification or analysis tasks, especially when the question emphasizes 'understanding' the document rather than 'producing' new text.

How to eliminate wrong answers

Option B (Sentiment analysis) is wrong because it involves classifying the emotional tone of text (positive, negative, neutral), not generating a summary. Option C (Text classification) is wrong because it assigns predefined labels or categories to text, whereas summarization requires generating new text. Option D (Machine translation) is wrong because it converts text from one language to another, not condensing content within the same language.

Full explanation →

429

MCQmedium

Refer to the exhibit. An AWS CloudTrail log shows the creation of an IAM policy for a SageMaker execution role. Which responsible AI concern does this configuration raise?

A.Insufficient training data

B.Lack of least privilege access control

C.Violation of data residency requirements

D.Absence of model monitoring

AnswerB

The wildcard resource exposes all endpoints to potential misuse.

Why this answer

The policy allows sagemaker:InvokeEndpoint on all resources (*), violating the principle of least privilege. This could allow the role to invoke any SageMaker endpoint, potentially leading to unauthorized inferences. Model monitoring, training data, and data residency are not addressed by this log entry.
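For comparison, the sketch below shows a wildcard policy of the kind described next to a scoped alternative, plus a small check that flags wildcard resources. The exhibit itself is not reproduced here, so the policy shape is an assumption; the region, account ID, and endpoint name are hypothetical placeholders.

```python
import json

# Overly broad: the wildcard resource lets the role invoke every endpoint.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sagemaker:InvokeEndpoint", "Resource": "*"}
    ],
}

# Scoped alternative. Region, account ID, and endpoint name are hypothetical.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:111122223333:endpoint/example-endpoint",
        }
    ],
}

def wildcard_allows(policy):
    """Return Allow statements whose Resource is, or includes, the bare wildcard."""
    found = []
    for statement in policy["Statement"]:
        resources = statement["Resource"]
        if isinstance(resources, str):
            resources = [resources]
        if statement["Effect"] == "Allow" and "*" in resources:
            found.append(statement)
    return found

print(json.dumps(scoped_policy, indent=2))
print(len(wildcard_allows(broad_policy)), len(wildcard_allows(scoped_policy)))
```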

Full explanation →

430

MCQmedium

A company uses Amazon Bedrock to generate code snippets for internal tools. They notice that the generated code often contains security vulnerabilities such as SQL injection and cross-site scripting. The security team has compiled a comprehensive list of secure coding guidelines and examples of vulnerable patterns. The development team wants to reduce vulnerabilities without significantly slowing down the code generation process. They have tried adding the guidelines to the system prompt, but the model still produces insecure code occasionally. The team is considering additional measures. Which action should they take to most effectively eliminate security vulnerabilities in the generated code?

A.Implement a post-processing step using Amazon CodeGuru or a similar static analysis tool to scan the generated code for vulnerabilities and reject or fix insecure code.

B.Use a larger, more expensive foundation model that specializes in code generation.

C.Include the complete secure coding guidelines in every prompt.

D.Increase the temperature parameter of the foundation model to promote more diverse outputs.

AnswerA

Correct: Post-processing with static analysis reliably catches vulnerabilities and can be automated without slowing down generation significantly.

Why this answer

Option A is correct because it introduces a deterministic, post-generation validation layer that catches vulnerabilities the model might miss. Amazon CodeGuru Reviewer or similar static analysis tools can scan generated code for patterns like SQL injection and XSS, then reject or fix insecure code without modifying the generation process itself. This approach directly addresses the security team's guidelines while maintaining generation speed, as the model's inference latency is unaffected.

Exam trap

AWS often tests the misconception that prompt engineering alone can fully control model output, when in reality, deterministic post-processing steps are required to enforce strict security or compliance requirements.

How to eliminate wrong answers

Option B is wrong because using a larger, more expensive foundation model does not guarantee elimination of security vulnerabilities; all models can produce insecure code, and size does not correlate with adherence to specific security guidelines. Option C is wrong because including the complete secure coding guidelines in every prompt increases token usage and may cause the model to ignore or truncate the guidelines, leading to inconsistent results and slower generation due to longer prompts. Option D is wrong because increasing the temperature parameter promotes more diverse and random outputs, which would likely increase the probability of generating insecure code rather than reducing it.

Full explanation →

431

MCQeasy

A startup needs to predict customer churn based on historical data containing labels (churned or not). Which type of machine learning should they use?

A.Reinforcement learning

B.Unsupervised learning

C.Supervised learning

D.Semi-supervised learning

AnswerC

Since the data has labels, supervised learning is appropriate for classification.

Why this answer

The startup has labeled historical data (churned or not), which is the defining characteristic of supervised learning. The goal is to learn a mapping from input features to the known output labels to predict churn for new customers. This is a classic classification problem, making supervised learning the correct choice.

Exam trap

AWS often tests the distinction between supervised and unsupervised learning by presenting a scenario with labeled data, where candidates might mistakenly choose unsupervised learning if they overlook the presence of labels.

How to eliminate wrong answers

Option A is wrong because reinforcement learning involves an agent learning through trial-and-error interactions with an environment to maximize cumulative reward, not from labeled historical data. Option B is wrong because unsupervised learning finds hidden patterns or structures in unlabeled data, but here the labels (churned/not) are explicitly provided. Option D is wrong because semi-supervised learning uses a small amount of labeled data with a large amount of unlabeled data, but the problem states the historical data contains labels, implying fully labeled data is available.

Full explanation →

432

MCQhard

A deployed model on an Amazon SageMaker endpoint is experiencing high inference latency (average 500ms) during peak hours. The model is a deep neural network with 10 million parameters. The endpoint uses a single ml.c5.xlarge instance. The company wants to reduce latency to under 200ms without retraining or changing the model architecture. Which action should they take?

A.Enable automatic scaling to add more instances

B.Switch to a GPU-based instance type like ml.p2.xlarge

C.Deploy the model on a multi-model endpoint

D.Use SageMaker Neo to compile and optimize the model

AnswerD

SageMaker Neo optimizes models for target hardware, significantly reducing inference latency without changing the model.

Why this answer

SageMaker Neo compiles trained models into an optimized format for the target hardware, reducing inference latency without altering the model architecture. For a deep neural network with 10 million parameters on a CPU instance, Neo applies hardware-specific optimizations like operator fusion and memory layout tuning, which can significantly lower latency. This directly addresses the requirement to reduce latency from 500ms to under 200ms without retraining or changing the model.

Exam trap

AWS often tests the misconception that scaling or switching to GPU is the default solution for latency issues, but the trap here is that the question explicitly prohibits retraining or architecture changes, making model compilation via SageMaker Neo the only viable option that directly optimizes inference speed on the existing hardware.

How to eliminate wrong answers

Option A is wrong because automatic scaling adds more instances to handle increased request volume, but it does not reduce per-request latency; it distributes load but each request still processes on a single instance with the same inference time. Option B is wrong because switching to a GPU instance like ml.p2.xlarge may accelerate certain model types but does not guarantee latency reduction for a deep neural network with 10 million parameters, and it introduces higher cost and potential overhead from GPU initialization; the requirement is to reduce latency without retraining or architecture changes, and GPU acceleration often requires model adaptation. Option C is wrong because deploying on a multi-model endpoint is designed to host multiple models on a single endpoint to improve resource utilization, not to reduce inference latency for a single model; it adds container management overhead that could increase latency.

Full explanation →

433

MCQeasy

A company wants to monitor for malicious activity in their machine learning pipelines, such as unauthorized access to training data or model artifacts. Which AWS service can provide automated threat detection and continuous monitoring?

A.AWS Config

B.Amazon GuardDuty

C.AWS Shield

D.Amazon Inspector

AnswerB

GuardDuty continuously monitors for malicious activity across AWS accounts and workloads.

Why this answer

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior across AWS workloads, including machine learning pipelines. It uses machine learning, anomaly detection, and integrated threat intelligence to identify threats such as unauthorized access to S3 buckets containing training data or model artifacts, without requiring manual intervention.

Exam trap

AWS often tests the distinction between services that monitor for security threats (GuardDuty) versus services that manage compliance (AWS Config), protect against DDoS (AWS Shield), or scan for vulnerabilities (Amazon Inspector), leading candidates to confuse configuration auditing with active threat detection.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for evaluating and auditing resource configurations against compliance rules, not for continuous threat detection or monitoring for malicious activity. Option C is wrong because AWS Shield is a managed Distributed Denial of Service (DDoS) protection service, designed to safeguard against network and transport layer attacks, not for detecting unauthorized access or malicious behavior in ML pipelines. Option D is wrong because Amazon Inspector is a vulnerability management service that scans for software vulnerabilities and unintended network exposure, not for real-time threat detection or monitoring of malicious activity.

Full explanation →

434

MCQeasy

An e-commerce company uses a foundation model to generate personalized email subject lines. The marketing team notices that the subject lines sometimes contain product recommendations that are out of stock. Which action would best reduce the generation of out-of-stock recommendations without retraining the model?

A.Implement a post-processing step to replace out-of-stock recommendations with in-stock alternatives.

B.Fine-tune the model on a dataset of past successful subject lines that only include in-stock products.

C.Add a system prompt that explicitly instructs the model to only recommend products that are in stock.

D.Use a retrieval-augmented generation (RAG) approach to retrieve a list of in-stock products and include it in the prompt.

AnswerC

A system prompt can constrain the model's output to follow the instruction, reducing unwanted recommendations.

Why this answer

Option C is correct because adding a system prompt that explicitly instructs the model to only recommend in-stock products directly constrains the model's output at inference time without requiring retraining. This leverages the model's instruction-following capability to filter its generated content based on the provided context, which is a lightweight and immediate solution.

Exam trap

AWS often tests the distinction between inference-time interventions (like prompt engineering) and training-time interventions (like fine-tuning), and the trap here is that candidates may confuse RAG (which retrieves external data but does not enforce constraints) with a system prompt that directly instructs the model, leading them to select D instead of C.

How to eliminate wrong answers

Option A is wrong because post-processing replacement of out-of-stock recommendations with in-stock alternatives is reactive and may introduce irrelevant or incorrect substitutions, failing to prevent the model from generating out-of-stock items in the first place. Option B is wrong because fine-tuning the model requires retraining on a new dataset, which contradicts the question's constraint of 'without retraining the model.' Option D is wrong because while RAG can retrieve a list of in-stock products, including it in the prompt does not guarantee the model will exclusively recommend those items; the model may still generate out-of-stock recommendations from its parametric knowledge, especially if the prompt is not strictly enforced.

Full explanation →

435

MCQeasy

A social media company needs to automatically detect and flag toxic comments in multiple languages. They have a large stream of user comments and require real-time moderation. Which AWS service is best suited for this task?

A.Amazon Lex

B.Amazon Comprehend

C.Amazon Rekognition

D.Amazon Translate

AnswerB

Amazon Comprehend provides built-in sentiment analysis and toxic content detection in multiple languages, suitable for real-time text analysis.

Why this answer

Amazon Comprehend is the correct choice because it is a natural language processing (NLP) service that can perform real-time toxicity detection across multiple languages using its built-in content moderation and custom classification capabilities. It analyzes text streams to identify toxic comments (e.g., hate speech, threats) and integrates with AWS streaming services like Amazon Kinesis for real-time processing.

Exam trap

The trap here is that candidates may confuse Amazon Comprehend's NLP capabilities with Amazon Lex's conversational AI or Amazon Translate's language translation, assuming any language-related service can detect toxicity, but only Comprehend provides the specific text analysis APIs for content moderation.

How to eliminate wrong answers

Option A is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using automatic speech recognition (ASR) and natural language understanding (NLU), not for analyzing text for toxicity. Option C is wrong because Amazon Rekognition is designed for image and video analysis (e.g., object detection, facial recognition), not for processing text comments. Option D is wrong because Amazon Translate is a machine translation service that converts text between languages but does not perform toxicity detection or content moderation.

Full explanation →

436

Multi-Selecteasy

A data science team is building a resume screening model and wants to ensure it does not exhibit gender bias. Which TWO actions are most effective for mitigating bias? (Choose TWO.)

Select 2 answers

A.Apply adversarial debiasing techniques during training.

B.Use a more complex deep learning model.

C.Remove the gender attribute and all correlated features from the dataset.

D.Regularly audit model predictions for disparate impact across genders.

E.Ensure the training dataset has equal numbers of male and female candidates.

AnswersA, D

Adversarial debiasing reduces sensitivity to protected attributes.

Why this answer

Regularly auditing predictions for disparate impact and applying adversarial debiasing are proven techniques. Simply removing attributes may not eliminate bias due to correlated proxies. Balancing datasets is helpful but not sufficient alone.

Complex models do not guarantee fairness.
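A disparate impact audit can be as simple as comparing selection rates between groups. The sketch below is a minimal example with invented predictions; the function names are hypothetical, and the 0.8 cut-off mentioned afterwards is the commonly cited four-fifths rule of thumb, not a requirement stated in the question.

```python
def selection_rate(predictions, groups, group):
    """Share of candidates in `group` that the model selected (prediction == 1)."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)

def disparate_impact(predictions, groups, unprivileged, privileged):
    """Ratio of the unprivileged group's selection rate to the privileged group's."""
    return (selection_rate(predictions, groups, unprivileged)
            / selection_rate(predictions, groups, privileged))

# Toy screening results (hypothetical): 1 = advanced to interview.
predictions = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = ["M", "M", "M", "M", "M", "F", "F", "F", "F", "F"]

ratio = disparate_impact(predictions, groups, unprivileged="F", privileged="M")
print(round(ratio, 2))
```

A ratio well below 0.8, as in this toy data, would be a signal to investigate the model before relying on it.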

Full explanation →

437

MCQhard

A financial services company uses a machine learning model to automatically reject credit card transactions suspected of fraud. The model was trained on transaction data from the past two years. Over the last three months, the model's false positive rate has increased significantly, causing legitimate transactions to be declined and leading to customer complaints. The company needs to restore the model's accuracy quickly. Initial analysis shows that the distribution of transaction amounts and locations has shifted compared to the training period. The data science team is under pressure to deploy an update within a week. Which approach should they take to most effectively address the issue while adhering to responsible AI guidelines?

A.Deploy a rule-based system with fixed rules for fraud detection

B.Adjust the decision threshold to reduce false positives without retraining

C.Retrain the model using only the most recent three months of transaction data and evaluate on current distribution

D.Build an ensemble model that combines predictions from the old model and a new model trained on recent data

AnswerC

Retraining on recent data adapts to drift and is straightforward.

Why this answer

The most effective approach is to retrain the model using recent data (last three months) to adapt to the distribution shift, and carefully evaluate for any new biases that may emerge. This directly addresses the drift. Simply adjusting the threshold may not capture new fraud patterns.

Using an ensemble of old and recent models could be complex and may not fully adapt. Deploying a simple rule-based system would be a step backward in capability.
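Before and after retraining, the size of a distribution shift can be quantified. One common measure is the population stability index (PSI), the sum over bins of (actual − expected) × ln(actual / expected). The sketch below assumes the transaction amounts have already been bucketed into four bins; the bin fractions are invented for illustration.

```python
import math

def psi(expected_fracs, actual_fracs, eps=1e-6):
    """Population stability index over matching bins of two distributions."""
    total = 0.0
    for e, a in zip(expected_fracs, actual_fracs):
        e, a = max(e, eps), max(a, eps)  # guard against empty bins
        total += (a - e) * math.log(a / e)
    return total

# Hypothetical share of transactions per amount bucket.
training = [0.50, 0.30, 0.15, 0.05]  # two-year training period
recent = [0.30, 0.30, 0.25, 0.15]    # last three months

no_shift = psi(training, training)
shift = psi(training, recent)
print(no_shift, round(shift, 3))
```

A PSI of zero means the two distributions match bin for bin; larger values indicate a larger shift, which supports the case for retraining on recent data.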

Full explanation →

438

Multi-Selecthard

A data scientist is fine-tuning a foundation model on Amazon Bedrock for a custom summarization task. Which THREE practices should they follow to optimize the fine-tuning process?

Select 3 answers

A.Start with a base model that is already strong in the domain.

B.Use the default hyperparameters without tuning.

C.Use a representative dataset that reflects the target task.

D.Monitor training loss and validation loss to avoid overfitting.

E.Train for as many epochs as possible.

AnswersA, C, D

A good base model reduces training time and improves results.

Why this answer

Starting with a base model that is already strong in the domain (Option A) is correct because it reduces the amount of fine-tuning data and compute required. Amazon Bedrock provides access to various foundation models (e.g., Anthropic Claude, Amazon Titan) that have been pre-trained on diverse corpora; selecting one that is already proficient in the target domain (e.g., legal or medical summarization) means the model's existing knowledge can be adapted with fewer training steps, leading to better performance and lower risk of catastrophic forgetting.

Exam trap

AWS often tests the misconception that more epochs always improve model performance, when in fact excessive training leads to overfitting, and expects candidates to recognize that monitoring loss curves and using early stopping are critical practices.

Full explanation →

439

MCQmedium

A healthcare company is deploying a machine learning model on Amazon SageMaker to analyze patient records. The model requires access to a DynamoDB table containing patient data. Which combination of AWS services and features should the company use to restrict access to only the necessary resources?

A.Attach a DynamoDB resource-based policy to the table allowing access from the SageMaker notebook

B.Create an IAM role with a policy granting read-only access to the specific DynamoDB table and attach it to the SageMaker notebook instance

C.Store AWS access keys in the notebook and use those credentials to access DynamoDB

D.Launch the SageMaker notebook in a VPC with a security group that allows access to DynamoDB

AnswerB

This follows least-privilege principle and uses temporary credentials via IAM roles.

Why this answer

Option B is correct because it follows the AWS principle of least privilege by creating an IAM role with a policy that grants read-only access to the specific DynamoDB table, then attaching that role to the SageMaker notebook instance. This ensures the notebook can only perform read operations on the required table without exposing long-term credentials or granting broader permissions.

Exam trap

AWS often tests whether candidates treat a resource-based policy or a network control as a substitute for a scoped IAM role. DynamoDB does support resource-based policies on tables, but the notebook still needs an IAM role to act as the principal, and access should be limited to the read actions the model needs.

How to eliminate wrong answers

Option A is wrong because a resource-based policy grants access to IAM principals, not to a notebook as such; the notebook still needs an execution role, and the option does not limit access to read-only operations on the table. Option C is wrong because storing AWS access keys in the notebook violates security best practices by introducing long-term credentials that can be leaked or misused, and SageMaker notebooks should use IAM roles for temporary credentials. Option D is wrong because a VPC with a security group controls network-level traffic but does not authenticate or authorize the SageMaker notebook to access DynamoDB; DynamoDB access requires IAM permissions regardless of network configuration.
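A least-privilege policy for the notebook's role might look like the sketch below: read-only DynamoDB actions scoped to a single table. The region, account ID, and table name are hypothetical placeholders, and the exact set of actions would depend on how the model reads the data.

```python
import json

# Hypothetical region, account ID, and table name.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:111122223333:table/PatientRecords"

read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Read actions only; no PutItem, UpdateItem, or DeleteItem.
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:BatchGetItem",
                "dynamodb:Query",
                "dynamodb:Scan",
            ],
            "Resource": TABLE_ARN,
        }
    ],
}

# The JSON document is what would be attached to the notebook's IAM role.
policy_document = json.dumps(read_only_policy, indent=2)
print(policy_document)
```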

440

MCQmedium

A data scientist is evaluating foundation models for a text summarization task and wants to use a standard metric. Which metric is commonly used to assess the quality of generated summaries?

A.F1 score

B.ROUGE

C.BLEU

D.Accuracy

AnswerB

ROUGE measures recall-based overlap for summaries.

Why this answer

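ROUGE is the standard metric for summarization, measuring the overlap of n-grams between a generated summary and a reference summary. Option A (F1 score) is a classification metric. Option C (BLEU) is a precision-oriented metric used mainly for machine translation. Option D (Accuracy) is also a classification metric.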
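To make "overlap of n-grams" concrete, here is a minimal sketch of ROUGE-1 recall (unigram overlap) in plain Python. It assumes whitespace tokenization and lowercasing, which is simpler than what standard ROUGE tooling does, and the two sentences are invented for illustration.

```python
from collections import Counter


def rouge1_recall(reference, candidate):
    """Fraction of reference unigrams that also appear in the candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clip each token's overlap at the number of times the candidate has it.
    overlap = sum(min(count, cand_counts[tok]) for tok, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0


reference = "the patient was discharged after two days"
candidate = "the patient was discharged quickly"
score = rouge1_recall(reference, candidate)
print(f"ROUGE-1 recall: {score:.3f}")  # 4 of 7 reference tokens matched
```

Because the denominator is the reference length, the score rewards a summary for covering the reference content, which is why ROUGE is described as recall-oriented.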

441

MCQhard

A healthcare company is using Amazon SageMaker to train and deploy a model that predicts patient readmission risk. The model uses sensitive protected health information (PHI). The company must ensure that data is encrypted at rest and in transit, and that access to the model endpoint is restricted to authorized applications only. The security team has configured AWS KMS customer managed keys for encryption, and IAM roles for SageMaker execution. However, during a security audit, it was discovered that the model endpoint is accessible from the internet and that the data used for training was stored in an S3 bucket with default encryption enabled. The compliance team requires that all PHI data be encrypted with a key that is rotated annually, and that no public access is allowed to the endpoint or training data. Which combination of actions should the ML engineer take to remediate these issues?

A.Use a SageMaker notebook instance with a lifecycle configuration to encrypt data with a customer managed KMS key, and restrict endpoint access using an IAM policy.

B.Enable S3 bucket encryption with SSE-S3, attach a bucket policy denying public access, and use an AWS Lambda function to rotate the S3 bucket key every year.

C.Apply SSE-KMS with an AWS managed key to the S3 bucket, and use a Lambda function to rotate the key every year. Disable public access to the endpoint using a VPC endpoint.

D.Enable S3 bucket encryption with a customer managed KMS key, disable public access on the SageMaker endpoint by deploying it in a VPC, and configure the KMS key to rotate annually.

AnswerD

Correct: Addresses all requirements with a customer managed key, deployment of the endpoint in a VPC, and annual key rotation.

Why this answer

Option D is correct because it addresses all compliance requirements: enabling S3 bucket encryption with a customer managed KMS key ensures PHI is encrypted at rest with a key that can be rotated annually, deploying the SageMaker endpoint in a VPC removes public internet access, and configuring annual KMS key rotation satisfies the rotation policy. This combination keeps PHI encrypted at rest under a customer-controlled key, keeps traffic to the endpoint off the public internet, restricts endpoint access to authorized applications only, and meets the key rotation requirement.

Exam trap

The trap here is that candidates confuse 'disabling public access' with 'using a VPC endpoint'—a VPC endpoint only allows private access to the endpoint from within the VPC, but the endpoint itself remains publicly accessible unless it is deployed inside a VPC with no internet gateway.

How to eliminate wrong answers

Option A is wrong because a SageMaker notebook instance with a lifecycle configuration does not encrypt data at rest in S3 or the endpoint, and restricting endpoint access via an IAM policy alone does not prevent public internet access—network-level controls like VPC are required. Option B is wrong because SSE-S3 uses AWS-managed keys that cannot be rotated annually by the customer, and a Lambda function cannot rotate an SSE-S3 key (S3 manages it automatically); also, it does not address endpoint public access. Option C is wrong because using an AWS managed key (SSE-KMS with AWS managed key) does not allow customer-controlled annual rotation—only customer managed KMS keys support customer-initiated rotation; additionally, disabling public access to the endpoint via a VPC endpoint is insufficient—the endpoint itself must be deployed in a VPC to remove internet exposure.
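The storage side of option D can be sketched as configuration. The dictionaries below follow the request shapes used by the boto3 S3 calls `put_bucket_encryption` and `put_public_access_block`; the key ARN is a hypothetical placeholder, and the actual API calls are only named in comments, not executed.

```python
import json

# Hypothetical customer managed key ARN; substitute a real key.
CMK_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"


def bucket_encryption_config(kms_key_arn):
    """Default bucket encryption using SSE-KMS with a customer managed key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",  # SSE-KMS rather than SSE-S3 (AES256)
                    "KMSMasterKeyID": kms_key_arn,
                },
                "BucketKeyEnabled": True,
            }
        ]
    }


# All four settings on: no public ACLs or public bucket policies.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

encryption = bucket_encryption_config(CMK_ARN)
print(json.dumps(encryption, indent=2))

# With boto3 these would be passed as:
#   s3.put_bucket_encryption(Bucket=..., ServerSideEncryptionConfiguration=encryption)
#   s3.put_public_access_block(Bucket=..., PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK)
# Automatic rotation of the customer managed key is switched on with:
#   kms.enable_key_rotation(KeyId=CMK_ARN)
```

Network isolation of the endpoint is a separate step and is not covered by this sketch.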

442

Multi-Selectmedium

A company is using Amazon Bedrock to generate marketing copy. They want to ensure the output is safe and appropriate. Which TWO actions should they take? (Choose 2.)

Select 2 answers

A.Enable content filtering with guardrails

B.Set temperature to 0 for deterministic output

C.Use model fine-tuning with unsafe examples

D.Use a private endpoint for Bedrock

E.Implement human review of all generated content

AnswersA, E

Guardrails can block harmful or inappropriate content automatically.

Why this answer

Options A and E are correct. Guardrails filter content in real time, and human review catches subtle issues that automated filters miss. Option B (temperature 0) makes output more deterministic and less creative but does not ensure safety. Option C (fine-tuning with unsafe examples) could introduce bias or teach the model unsafe behavior. Option D (private endpoint) addresses networking, not content safety.

443

MCQeasy

A developer wants to generate product description images using Amazon Bedrock. They need to ensure the generated images match a specific brand style. Which feature should they primarily use?

A.Prompt engineering with detailed style descriptions.

B.Output grounding to verify brand compliance.

C.Data augmentation to increase dataset diversity.

D.Fine-tuning the image generation model on brand assets.

AnswerA

Prompt engineering is the simplest way to steer image generation toward a desired style.

Why this answer

Option A is correct because prompt engineering allows the developer to specify style guidelines in the text prompt, influencing the output. Option B is wrong because grounding is for text, not images. Option C is wrong because data augmentation is not directly relevant. Option D is wrong because fine-tuning for image style is time-consuming.

444

MCQhard

An organization wants to use Amazon Rekognition to analyze images of people for a security application. They must comply with GDPR. What is the best practice?

A.Store images indefinitely for audit

B.Use celebrity recognition

C.Ensure all images are anonymized before analysis

D.Use face detection only

AnswerC

Anonymizing images (e.g., blurring faces) helps comply with privacy regulations like GDPR.

Why this answer

Option C is correct because GDPR requires that personal data, including facial images, be processed lawfully and with appropriate safeguards. Anonymizing images before analysis with Amazon Rekognition ensures that the data cannot be linked back to an identifiable person, thereby reducing GDPR compliance risk. This aligns with the principle of data minimization and privacy by design.

Exam trap

AWS often tests the misconception that using a specific feature like celebrity recognition or face detection alone automatically satisfies compliance requirements, when in fact GDPR mandates data anonymization or pseudonymization as a best practice for processing biometric data.

How to eliminate wrong answers

Option A is wrong because storing images indefinitely violates GDPR's data retention limitation principle, which mandates that personal data be kept no longer than necessary for the processing purpose. Option B is wrong because celebrity recognition is designed to identify known public figures and does not address GDPR compliance for general image analysis; it may still process personal data without anonymization. Option D is wrong because face detection alone still processes biometric data that can be used to identify individuals, and without anonymization, it does not meet GDPR requirements for lawful processing.

445

MCQmedium

A developer invoked an Amazon Bedrock model and received the following error: 'ValidationException: 1 validation error detected: Value 'claude-instant-v1' at 'modelId' failed to satisfy constraint: Member must satisfy enum value set: [ai21.j2-mid-v1, amazon.titan-text-lite-v1, anthropic.claude-v2, ...]'. What is the likely cause?

A.The Lambda function does not have the necessary IAM permissions

B.The modelId is not available in the current AWS region

C.The modelId is not part of the allowed enum of models for the account

D.The modelId is deprecated and has been renamed

AnswerC

The error explicitly states the value must satisfy the enum set, meaning the model ID is invalid or not in the allowed list.

Why this answer

Option C is correct because the error indicates the modelId 'claude-instant-v1' is not in the allowed enum set. This usually means the model ID is misspelled or incomplete (here it lacks the provider prefix that the listed IDs carry, such as `anthropic.`) or is not available in this region/account. Option A (permissions) would be a different error type. Option B (region availability) would mention the region. Option D (deprecated) would give a different message.

446

MCQhard

A team is deploying a generative AI model for medical report generation. They must ensure patient data privacy and comply with HIPAA. Which AWS service feature is essential for de-identifying protected health information (PHI) before sending data to a foundation model?

A.AWS CloudHSM

B.Amazon Comprehend Medical

C.Amazon Macie

D.AWS Key Management Service (AWS KMS)

AnswerB

Comprehend Medical provides PHI detection and de-identification.

Why this answer

Amazon Comprehend Medical is the correct service because it is specifically designed to detect protected health information (PHI) in unstructured medical text using natural language processing (NLP). It identifies entities such as patient names, dates, and medical record numbers along with their positions in the text, so the application can redact or replace them before the data is sent to a foundation model, which supports HIPAA compliance.
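The redaction step can be sketched as follows. The entity dictionaries mimic the `BeginOffset`, `EndOffset`, and `Type` fields that Comprehend Medical's `DetectPHI` returns for each entity, but the sample note and entities here are hand-built for illustration; no AWS call is made, and `span` is a hypothetical helper.

```python
def redact_phi(text, entities):
    """Replace each detected PHI span with a placeholder such as [NAME]."""
    redacted = text
    # Work from the end of the text so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        placeholder = "[" + ent["Type"] + "]"
        redacted = redacted[: ent["BeginOffset"]] + placeholder + redacted[ent["EndOffset"]:]
    return redacted


def span(text, phrase, phi_type):
    """Hypothetical helper: build an entity dict the way a detector might."""
    start = text.index(phrase)
    return {"BeginOffset": start, "EndOffset": start + len(phrase), "Type": phi_type}


note = "John Smith was admitted on 2024-01-05."
# Stand-in for the "Entities" list of a DetectPHI response.
entities = [span(note, "John Smith", "NAME"), span(note, "2024-01-05", "DATE")]

clean_note = redact_phi(note, entities)
print(clean_note)  # [NAME] was admitted on [DATE].
```

Only `clean_note` would then be passed to the foundation model.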

Exam trap

The trap here is that candidates confuse general data protection services like Macie or encryption services like KMS with the specialized PHI de-identification capability of Amazon Comprehend Medical, assuming any security service can handle HIPAA compliance for generative AI workflows.

How to eliminate wrong answers

Option A is wrong because AWS CloudHSM provides hardware security modules (HSMs) for cryptographic key storage and operations, but it does not perform data de-identification or PHI detection. Option C is wrong because Amazon Macie is a data security service that discovers and protects sensitive data using machine learning and pattern matching, but it is designed for data classification and access control, not for de-identifying PHI in unstructured text for downstream AI processing. Option D is wrong because AWS Key Management Service (AWS KMS) manages encryption keys for data at rest and in transit, but it does not have the capability to identify or remove PHI from text content.

447

MCQeasy

A company uses Amazon Rekognition for facial analysis. They want to ensure the model doesn't exhibit bias based on skin tone. What should they do?

A.Ensure the training dataset includes diverse skin tones

B.Apply data augmentation to increase dataset size

C.Use a larger neural network

D.Use a pre-trained model from AWS Marketplace

AnswerA

Balanced representation mitigates bias.

Why this answer

Option A is correct: Training on diverse data reduces bias. Option B is wrong: Data augmentation does not guarantee diversity. Option C is wrong: Network size does not address bias. Option D is wrong: Pre-trained models may have inherent bias.
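One simple way to check whether a dataset "includes diverse skin tones" is to measure each group's share of the samples. The sketch below assumes every training image already carries a group label; the group names, counts, and the 20% threshold are invented for illustration and are not a recommended standard.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's fraction of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(labels, expected_groups, min_share):
    """List expected groups whose share falls below min_share (including absent ones)."""
    shares = group_shares(labels) if labels else {}
    return sorted(g for g in expected_groups if shares.get(g, 0.0) < min_share)


# Hypothetical skin-tone buckets and counts.
labels = ["tone-1"] * 70 + ["tone-2"] * 25 + ["tone-3"] * 5
expected = ["tone-1", "tone-2", "tone-3", "tone-4"]

flagged = underrepresented(labels, expected, min_share=0.20)
print(flagged)  # ['tone-3', 'tone-4']
```

Groups that are flagged would need more samples collected before training, rather than relying on a larger network or augmentation alone.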

448

MCQmedium

A developer encounters the error shown above when using Amazon Bedrock. What is the most likely cause?

A.The model is not available in the region

B.The IAM role lacks the required permission

C.The request is throttled

D.The model is out of service

AnswerB

The error explicitly states the role is not authorized for the action.

Why this answer

The error indicates an access denied or authorization failure when invoking the Amazon Bedrock model. The most likely cause is that the IAM role used by the developer does not have the required permission, such as `bedrock:InvokeModel`, attached to its policy. Without this permission, the API call to Bedrock is rejected regardless of model availability or service status.

Exam trap

AWS often tests the distinction between service availability errors and authorization errors, so the trap here is that candidates may confuse a permissions failure with a model unavailability or throttling issue, especially when the error message is generic.

How to eliminate wrong answers

Option A is wrong because if the model were not available in the region, the error would typically be a `ResourceNotFoundException` or `ValidationException`, not an access denied error. Option C is wrong because throttling errors return a `ThrottlingException` with HTTP 429 status code, not an authorization error. Option D is wrong because if the model were out of service, the error would be a `ServiceUnavailableException` or `ModelNotReadyException`, not a permissions-related error.
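The elimination logic above amounts to a lookup from error code to likely cause. The sketch below encodes that mapping in plain Python; the wording of each cause is a summary of this explanation rather than official AWS text. With boto3, the code would typically be read from `err.response["Error"]["Code"]` on a caught `ClientError`.

```python
# Likely cause for common Amazon Bedrock runtime error codes (summary, not exhaustive).
LIKELY_CAUSE = {
    "AccessDeniedException": "IAM role lacks a required permission such as bedrock:InvokeModel",
    "ValidationException": "request is malformed, for example an invalid modelId",
    "ResourceNotFoundException": "model or resource not found in this region or account",
    "ThrottlingException": "request rate exceeded a quota (HTTP 429)",
    "ServiceUnavailableException": "service is temporarily unavailable",
    "ModelNotReadyException": "model is not ready to serve requests yet",
}


def likely_cause(error_code):
    """Map an error code to a short diagnosis, with a fallback for unknown codes."""
    return LIKELY_CAUSE.get(error_code, "unrecognized error code; check the full message")


for code in ("AccessDeniedException", "ThrottlingException", "SomethingElse"):
    print(f"{code}: {likely_cause(code)}")
```

An authorization failure maps to the IAM entry, which is why option B is the best fit for an "is not authorized" message.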

449

MCQhard

A developer deployed this guardrail to block sensitive topics and sexual content. However, the model still generates responses about a specific sensitive topic that is not in the TopicPolicy. What should the developer do to prevent this?

A.Add a SensitiveInformationPolicy to filter PII

B.Increase the InputStrength of the content filter to MAX

C.Change the TopicPolicy Type from DENY to ALLOW

D.Add the specific topic to the TopicPolicy list

AnswerD

Adding the topic to the TopicPolicy with Type DENY will block it.

Why this answer

The guardrail's TopicPolicy only blocks the defined topic 'sensitive-topic'. To block additional topics, add them to the list. Option A (SensitiveInformationPolicy) is for PII. Option B (increase strength) does not add topics. Option C (change type) would allow the topic instead of blocking it.
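Adding a topic to the list might look like the sketch below. The dictionary follows the `topicPolicyConfig` shape accepted by the boto3 Bedrock `create_guardrail` and `update_guardrail` calls (the exhibit itself is a CloudFormation template, which uses capitalized property names). The second topic's name and definition are hypothetical, and no API call is made.

```python
import copy


def add_denied_topic(topic_policy, name, definition):
    """Return a copy of the topic policy with one more DENY topic appended."""
    updated = copy.deepcopy(topic_policy)
    updated["topicsConfig"].append(
        {"name": name, "definition": definition, "type": "DENY"}
    )
    return updated


# Existing policy with the single topic named in the explanation.
current_policy = {
    "topicsConfig": [
        {
            "name": "sensitive-topic",
            "definition": "Placeholder definition for the existing denied topic.",
            "type": "DENY",
        }
    ]
}

# Hypothetical additional topic that the model was still answering about.
new_policy = add_denied_topic(
    current_policy,
    name="another-sensitive-topic",
    definition="Placeholder definition describing the topic to block.",
)

print([t["name"] for t in new_policy["topicsConfig"]])
```

The updated policy would then be supplied when updating the guardrail, leaving the content filter strengths unchanged.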

450

MCQhard

A company is deploying a machine learning model for real-time fraud detection. The model must have latency under 100ms. Which infrastructure choice is most appropriate?

A.Amazon SageMaker real-time endpoints

B.Amazon EC2 with Deep Learning AMI

C.Amazon SageMaker batch transform

D.Amazon SageMaker notebook instance

AnswerA

Real-time endpoints provide low-latency inference with automatic scaling.

Why this answer

Amazon SageMaker real-time endpoints are designed for low-latency inference, typically in the tens of milliseconds, making them suitable for real-time fraud detection where latency must be under 100ms. They deploy a model behind a persistent HTTPS endpoint that auto-scales to handle incoming requests with minimal delay.

Exam trap

The trap here is that candidates often confuse batch transform with real-time inference, assuming that any SageMaker inference capability can serve low-latency requests, but batch transform is explicitly asynchronous and designed for high-throughput, not low-latency.

How to eliminate wrong answers

Option B is wrong because Amazon EC2 with Deep Learning AMI requires manual setup of the inference server, scaling, and load balancing, which introduces operational overhead and cannot guarantee sub-100ms latency without significant custom engineering. Option C is wrong because Amazon SageMaker batch transform is designed for asynchronous, offline inference on large datasets, not for real-time, low-latency predictions. Option D is wrong because Amazon SageMaker notebook instance is an interactive development environment for building and testing models, not a production inference endpoint.
