Knowledge + Practice

AWS Certified AI Practitioner AIF-C01 (AIF-C01) — Questions 151–225

500 questions total · 7pages · All types, answers revealed

Take a mock exam Exam hub

Page 3 of 7

151

Multi-Selectmedium

A company is building a chatbot using Amazon Bedrock. They want to provide up-to-date information from a continuously changing database. Which TWO services can be used as a data source for a Bedrock knowledge base? (Select TWO.)

Select 2 answers

A.Amazon Kendra

B.Amazon S3

C.Amazon RDS for MySQL

D.Amazon DynamoDB

E.AWS Glue

AnswersA, B

A Kendra index can be used as a knowledge base source in Bedrock.

Why this answer

Amazon Bedrock knowledge bases can directly ingest data from Amazon S3, which is a supported data source for indexing documents. Amazon Kendra is also a supported data source, allowing Bedrock to leverage existing Kendra indexes for retrieval-augmented generation (RAG). Both services integrate natively with Bedrock knowledge bases to provide up-to-date information from continuously changing data.

Exam trap

AWS often tests the misconception that any AWS database or data processing service can be a direct data source for Bedrock knowledge bases, but only S3, Kendra, and Salesforce are supported.

Full explanation →

152

Multi-Selecteasy

Which TWO actions are best practices for reducing hallucinations in generative AI models? (Choose 2)

Select 2 answers

A.Increase the model size

B.Fine-tune the model on proprietary data

C.Use retrieval-augmented generation (RAG)

D.Use a smaller model to limit complexity

E.Apply prompt engineering with clear instructions and constraints

AnswersC, E

RAG grounds responses in retrieved documents, improving factual accuracy.

Why this answer

RAG provides factual grounding, and prompt engineering with clear constraints reduces hallucinations. Increasing model size (A) may not reduce hallucinations. Fine-tuning (C) can help but is not a direct best practice for all cases; RAG and prompt engineering are more effective.

Using a smaller model (E) may increase hallucinations.

Full explanation →

153

MCQhard

A marketing firm uses Amazon Bedrock to generate ad copy. They notice that the generated text often includes factual inaccuracies about their products. Which technique would most effectively reduce these inaccuracies?

A.Implement Retrieval-Augmented Generation (RAG) with a product knowledge base.

B.Use longer, more detailed prompts.

C.Increase the temperature parameter to 0.9.

D.Fine-tune the model on a dataset of previous ad copies.

AnswerA

RAG enables the model to retrieve and cite authoritative information, reducing hallucinations.

Why this answer

Retrieval-Augmented Generation (RAG) grounds the model's output in a trusted, external knowledge base by retrieving relevant product documents before generating text. This directly addresses factual inaccuracies because the model references authoritative data rather than relying solely on its parametric memory, which may contain outdated or incorrect information.

Exam trap

Cisco often tests the misconception that fine-tuning or prompt engineering alone can fix factual accuracy issues, when in reality RAG is the standard solution for grounding model outputs in external, verifiable data.

How to eliminate wrong answers

Option B is wrong because longer prompts do not fix the underlying knowledge gap; they only provide more context but cannot inject new, accurate facts that the model lacks. Option C is wrong because increasing temperature to 0.9 increases randomness and creativity, which would likely worsen factual inaccuracies by encouraging more hallucinated or divergent outputs. Option D is wrong because fine-tuning on previous ad copies would reinforce existing patterns and biases, including any inaccuracies present in the training data, rather than introducing a reliable source of truth.

Full explanation →

154

MCQhard

A company operates a customer support chatbot that uses Amazon Bedrock with a knowledge base sourced from an S3 bucket containing frequently updated product documentation. The knowledge base uses OpenSearch Serverless as the vector store and is configured to sync daily. The chatbot uses the RetrieveAndGenerate API with a custom Lambda function that applies a system prompt instructing the model to base answers solely on the retrieved context. After a major update to the product documentation, the IT team verifies that the data source sync completed successfully and the new chunks are present in the OpenSearch index. However, the chatbot continues to respond with outdated information. Further investigation reveals that the Lambda function includes a response caching mechanism using Amazon ElastiCache for Redis with a Time-To-Live (TTL) of 24 hours. The cache key is based on the user query. The team notes that no cache invalidation is performed after documentation updates. What is the most likely cause of the outdated responses?

A.The ElastiCache cache is returning stale cached responses that contain the old information.

B.The 'maximum results' parameter in the RetrieveAndGenerate API is set to a value too low to retrieve the new chunks.

C.The embedding model used by the knowledge base has not been retrained on the new documentation.

D.The IAM role for the Lambda function lacks permissions to access the new S3 objects.

AnswerA

Correct. The cache is not invalidated on document updates, so identical queries return cached old responses.

Why this answer

Since the data source sync succeeded and the index contains new chunks, the retrieval should be able to access the latest data. However, the Lambda function caches responses keyed by query. With a 24-hour TTL and no invalidation, the cache returns stale responses containing the old safety information.

Clearing the cache or reducing TTL would resolve the issue. The other options are less likely: low max results might cause missing new chunks but would not consistently return old info; the embedding model is not retrained per sync; IAM permissions would affect sync, not retrieval.

Full explanation →

155

Multi-Selecteasy

A data science team is using Amazon SageMaker to build a model. They want to ensure that only authorized users can deploy models to production. Which TWO methods can they use to enforce this?

Select 2 answers

A.Use SageMaker Model Registry to require approval before deployment.

B.Enable multi-factor authentication (MFA) for all AWS accounts.

C.Use IAM policies to restrict the sagemaker:CreateEndpoint action to specific users.

D.Use AWS CloudTrail to audit deployment actions.

E.Use Amazon GuardDuty to monitor for unauthorized deployment.

AnswersA, C

Model Registry can enforce an approval workflow before a model is deployed.

Why this answer

Option A is correct because SageMaker Model Registry allows you to set up an approval workflow for model versions. By requiring explicit approval before a model can be deployed to production, you enforce a governance gate that prevents unauthorized or unverified models from being used in production endpoints.

Exam trap

The trap here is that candidates often confuse auditing or monitoring services (like CloudTrail or GuardDuty) with preventive controls, failing to recognize that only IAM policies and registry approval workflows can actively block unauthorized deployment actions.

Full explanation →

156

MCQhard

Refer to the exhibit. A data scientist is training a neural network model on SageMaker. The training log shows the loss values per epoch. Which issue is most likely occurring?

A.The number of epochs is insufficient

B.The model is overfitting

C.The dataset is too small

D.The learning rate is too low

AnswerB

Overfitting causes training loss to increase after a point.

Why this answer

The training log shows loss values decreasing on the training set but increasing or plateauing on the validation set, which is a classic sign of overfitting. Overfitting occurs when the model learns noise and specific patterns in the training data too well, failing to generalize to unseen data. In SageMaker, monitoring both training and validation loss curves is critical to detect this issue early.

Exam trap

Cisco often tests the distinction between overfitting and underfitting by showing loss curves where training loss decreases but validation loss increases, leading candidates to mistakenly attribute the issue to insufficient epochs or a low learning rate.

How to eliminate wrong answers

Option A is wrong because insufficient epochs would typically show both training and validation loss still decreasing at the end of training, not a divergence. Option C is wrong because a dataset that is too small can contribute to overfitting, but the direct symptom shown in the loss curves (training loss decreasing while validation loss increases) is specifically overfitting, not merely a small dataset. Option D is wrong because a learning rate that is too low would cause both training and validation loss to decrease very slowly or plateau at a high value, not diverge.

Full explanation →

157

MCQhard

A media company is using Amazon Bedrock to generate marketing copy with a foundation model. They want to ensure the output adheres to brand voice guidelines (e.g., friendly, professional). Which prompt engineering strategy is most effective for this requirement?

A.Provide five example outputs in the prompt that match the desired tone.

B.Include instructions like 'Do not use technical jargon' in every user prompt.

C.Set the temperature parameter to a low value (e.g., 0.1) to reduce randomness.

D.Use a system prompt that explicitly describes the brand voice and expectations.

AnswerD

System prompts set the role and tone, effectively guiding the model's style for all subsequent interactions.

Why this answer

Option D is correct because Amazon Bedrock supports system prompts that set overarching context and behavioral guidelines for the model. By explicitly describing the brand voice (e.g., 'friendly, professional') in the system prompt, the model consistently applies these constraints across all user interactions, which is more effective than per-instruction tuning.

Exam trap

AWS often tests the misconception that parameter tuning (like temperature) or few-shot examples are sufficient for style control, when in fact system prompts provide the most direct and scalable mechanism for enforcing behavioral constraints in foundation models.

How to eliminate wrong answers

Option A is wrong because providing example outputs (few-shot prompting) can guide tone but is less reliable than a system prompt for consistent adherence across diverse inputs, and it consumes prompt token budget without guaranteeing the model internalizes the rule. Option B is wrong because including instructions like 'Do not use technical jargon' in every user prompt is redundant, inefficient, and can be overridden by the model's tendency to follow the most recent instruction, whereas a system prompt sets a persistent baseline. Option C is wrong because lowering the temperature parameter reduces randomness but does not enforce specific brand voice constraints; it only makes outputs more deterministic, which may still produce off-tone content if the model's training data lacks the desired style.

Full explanation →

158

MCQhard

A data scientist is unable to invoke the Claude v2 model from an EC2 instance with IP 10.0.1.5. What is the most likely reason?

A.The condition is too restrictive

B.The policy does not allow the ec2:AssociateAddress action

C.The resource ARN is incorrect because it should not include the account ID

D.The model is not available in us-east-1

AnswerC

Bedrock model ARNs omit the account ID; having it results in an invalid resource.

Why this answer

Option B is correct because the resource ARN is malformed. Amazon Bedrock model ARNs do not include the account ID; the correct format is arn:aws:bedrock:region::foundation-model/anthropic.claude-v2. Option A is not an action.

Option C is incorrect because the IP is within the allowed range. Option D is incorrect because the model is available in us-east-1.

Full explanation →

159

MCQeasy

A data scientist at a retail company is tasked with building a model to predict customer churn. The dataset contains 100,000 records with features such as age, purchase history, customer support interactions, and a binary label indicating whether the customer churned in the past. The team needs a model that can be deployed for real-time inference with low latency. They have limited time and want to use a built-in algorithm from Amazon SageMaker that is optimized for classification tasks. Which approach should they take?

A.Use Amazon SageMaker PCA algorithm

B.Use Amazon SageMaker XGBoost algorithm

C.Use Amazon SageMaker K-Means algorithm

D.Use Amazon SageMaker BlazingText algorithm

AnswerB

XGBoost is a built-in algorithm for classification and works well with tabular data.

Why this answer

Amazon SageMaker's built-in XGBoost algorithm is optimized for classification tasks like binary churn prediction, supports real-time inference with low latency via SageMaker endpoints, and can handle the dataset size of 100,000 records efficiently. It is a supervised learning algorithm that directly uses the binary label for training, making it the correct choice for this scenario.

Exam trap

The trap here is that candidates may confuse unsupervised algorithms (PCA, K-Means) or domain-specific algorithms (BlazingText for text) with general-purpose supervised classification algorithms, overlooking that XGBoost is the only built-in SageMaker algorithm among the options designed for tabular classification with real-time inference needs.

How to eliminate wrong answers

Option A is wrong because PCA (Principal Component Analysis) is an unsupervised dimensionality reduction algorithm, not a classification algorithm, and cannot predict churn from a binary label. Option C is wrong because K-Means is an unsupervised clustering algorithm used for grouping data, not for supervised classification tasks like churn prediction. Option D is wrong because BlazingText is optimized for text classification and word embeddings, not for tabular data with features like age and purchase history.

Full explanation →

160

Multi-Selecthard

A financial services company uses Amazon SageMaker Feature Store to manage features for machine learning models. The compliance auditor requires that all changes to feature definitions are logged and that feature data is immutable once written. Which TWO approaches should the team implement? (Choose two.)

Select 2 answers

A.Enable AWS CloudTrail for SageMaker Feature Store API calls.

B.Use SageMaker Feature Store offline store with record identifier and event time.

C.Enable feature group versioning to track changes to feature definitions.

D.Implement feature store online store with TTL to automatically expire data.

E.Use AWS Config to track changes to Feature Store resources.

AnswersA, C

CloudTrail logs all API calls, providing an audit trail for changes.

Why this answer

Option A is correct because enabling AWS CloudTrail for SageMaker Feature Store API calls provides a detailed audit log of all operations, including changes to feature definitions (e.g., CreateFeatureGroup, UpdateFeatureGroup). This satisfies the compliance requirement for logging all changes. Option C is correct because enabling feature group versioning in SageMaker Feature Store allows you to track and manage changes to feature definitions over time, ensuring a historical record of modifications.

Exam trap

The trap here is that candidates often confuse AWS Config (which tracks resource configuration changes) with AWS CloudTrail (which logs API calls), or they mistakenly think the offline store's point-in-time query capability inherently enforces data immutability, when in fact immutability requires explicit design choices.

Full explanation →

161

Multi-Selectmedium

A company is using Amazon Bedrock to deploy a chatbot. They want to ensure that the chatbot does not produce harmful or biased content. Which TWO AWS services or features can be used together to implement content moderation and monitoring?

Select 2 answers

A.Amazon SageMaker Model Monitor for drift detection

B.Amazon Rekognition for image moderation

C.Amazon Comprehend for sentiment analysis

D.Amazon Bedrock Guardrails for content filtering

E.Amazon CloudWatch Logs for logging and analyzing model outputs

AnswersD, E

Guardrails can filter harmful, toxic, or biased content in model responses.

Why this answer

Amazon Bedrock Guardrails (Option D) is correct because it provides built-in content filtering capabilities specifically designed for foundation models, allowing you to define denied topics, filter harmful content, and enforce safety policies directly within the Bedrock chatbot workflow. Amazon CloudWatch Logs (Option E) is correct because it enables you to log model inputs and outputs, which can be monitored for compliance, audited for bias, and used to trigger alerts when harmful content is detected. Together, they form a comprehensive content moderation and monitoring solution that addresses both proactive filtering and reactive analysis.

Exam trap

Cisco often tests the distinction between monitoring (CloudWatch Logs) and analysis (Comprehend, Rekognition) versus enforcement (Guardrails), leading candidates to mistakenly choose services that only analyze or detect content without the ability to block or filter it in real time.

Full explanation →

162

MCQmedium

A healthcare company uses Amazon Bedrock to generate patient summaries. They need to ensure no protected health information (PHI) is leaked in the output. Which AWS service can they use to detect and mask PHI in text?

A.Amazon Comprehend Medical

B.Amazon Macie

C.AWS Glue

D.Amazon Rekognition

AnswerA

Comprehend Medical can identify and mask PHI such as patient names and dates.

Why this answer

Amazon Comprehend Medical is specifically designed to extract and identify protected health information (PHI) from unstructured medical text using natural language processing (NLP). It can detect entities such as patient names, dates, medical conditions, and medications, and provides APIs to mask or redact that PHI before output. This makes it the correct choice for the healthcare company's requirement to prevent PHI leakage in patient summaries generated by Amazon Bedrock.

Exam trap

AWS often tests the distinction between general-purpose data protection services (like Macie) and domain-specific medical NLP services (like Comprehend Medical), leading candidates to choose Macie because it is associated with sensitive data discovery, even though it cannot perform inline text masking.

How to eliminate wrong answers

Option B (Amazon Macie) is wrong because Macie is a data security service that discovers and protects sensitive data stored in Amazon S3 using machine learning and pattern matching, but it does not provide real-time PHI detection or masking in text streams or API outputs. Option C (AWS Glue) is wrong because Glue is a serverless data integration service for ETL (extract, transform, load) jobs, not a text analysis or PHI detection service. Option D (Amazon Rekognition) is wrong because Rekognition is an image and video analysis service that can detect objects, faces, and text in media, but it is not designed to identify or mask PHI in textual data.

Full explanation →

163

MCQmedium

A healthcare startup deploys a model to predict patient readmission risk using Amazon SageMaker. After deployment, the model shows higher false-positive rates for a specific age group. What is the most responsible first step?

A.Increase the prediction threshold for the affected group

B.Use Amazon SageMaker Clarify to detect bias in predictions

C.Retrain the model with more data from the affected group

D.Immediately retire the model to prevent harm

AnswerB

Clarify provides bias metrics to inform next steps.

Why this answer

Amazon SageMaker Clarify is purpose-built for detecting bias in ML models and data. It provides bias metrics (e.g., Difference in Positive Proportions in Predicted Labels, Disparate Impact) that can quantify whether the model's predictions are systematically skewed against a specific age group. This is the most responsible first step because it objectively measures the bias before any corrective action is taken.

Exam trap

AWS often tests the misconception that the first step to address bias is to immediately retrain or adjust thresholds, rather than using a dedicated bias detection tool like SageMaker Clarify to first diagnose the nature and extent of the bias.

How to eliminate wrong answers

Option A is wrong because increasing the prediction threshold for the affected group is a post-hoc adjustment that does not address the root cause of bias and can introduce new fairness issues or degrade overall model performance. Option C is wrong because retraining with more data from the affected group assumes the bias stems from data imbalance, but without first using SageMaker Clarify to confirm the bias source, this could be ineffective or even harmful (e.g., if bias is due to feature encoding or labeling). Option D is wrong because immediately retiring the model is an overreaction that ignores the possibility of mitigation; responsible AI practices require diagnosis before drastic action.

Full explanation →

164

MCQeasy

A startup needs to generate product descriptions from bullet points using a foundation model. They want a fully managed serverless experience. Which AWS service should they use?

A.Amazon Comprehend

B.Amazon Bedrock

C.Amazon Polly

D.Amazon Lex

AnswerB

Bedrock offers serverless foundation models for generation tasks.

Why this answer

Amazon Bedrock is a fully managed serverless service that provides access to foundation models (FMs) from leading AI providers via an API, making it ideal for generating product descriptions from bullet points. It eliminates infrastructure management while allowing you to invoke models like Anthropic Claude or Amazon Titan for text generation tasks.

Exam trap

The trap here is that candidates confuse Amazon Comprehend (a text analysis service) with a generative AI service, or assume Polly or Lex can generate text descriptions when they are specialized for speech and conversation, respectively.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is a natural language processing (NLP) service for extracting insights (e.g., sentiment, entities) from text, not for generating new content from bullet points. Option C is wrong because Amazon Polly is a text-to-speech service that converts text into lifelike speech, not a foundation model for text generation. Option D is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using automatic speech recognition and natural language understanding, not for generating product descriptions from bullet points.

Full explanation →

165

MCQmedium

A healthcare company processes patient records using a foundation model on Amazon Bedrock. They must ensure no patient data is used to improve the base model. What is the MOST effective configuration?

A.Create a VPC endpoint for Bedrock

B.Set a data retention policy to store logs for only 30 days

C.Disable model improvement logging in the Bedrock service settings

D.Enable data encryption at rest and in transit

AnswerC

Disabling model improvement prevents AWS from using the data to improve the base model.

Why this answer

Option C is correct because disabling model improvement logging ensures data is not used for training. Option A (data encryption) does not prevent usage for training. Option B (private endpoints) secures traffic but not data usage.

Option D (retain logs) could still allow AWS to use data for model improvement.

Full explanation →

166

Multi-Selectmedium

Which TWO of the following are examples of supervised learning tasks that can be performed using Amazon SageMaker built-in algorithms?

Select 2 answers

A.Principal Component Analysis (PCA)

B.XGBoost

C.Linear Learner

D.Latent Dirichlet Allocation (LDA)

E.K-Means

AnswersB, C

XGBoost is a supervised gradient boosting algorithm.

Why this answer

XGBoost is a supervised learning algorithm that uses gradient-boosted decision trees for regression, classification, and ranking tasks. Amazon SageMaker's built-in XGBoost algorithm is optimized for distributed training and directly supports labeled training data, making it a correct example of a supervised learning task.

Exam trap

Cisco often tests the distinction between supervised and unsupervised learning by listing algorithms like PCA, LDA, and K-Means alongside supervised ones, trapping candidates who recognize the algorithm names but forget their learning paradigm.

Full explanation →

167

MCQmedium

A data science team needs to grant a SageMaker notebook instance access to an S3 bucket containing training data. Which IAM policy should be attached to the notebook instance's execution role?

A.An IAM policy that allows the sagemaker:InvokeEndpoint action

B.An IAM policy that allows kms:Decrypt on the S3 bucket's KMS key

C.An IAM policy that allows ec2:DescribeVpcs

D.An IAM policy that allows s3:GetObject and s3:ListBucket on the specific bucket

AnswerD

The notebook instance's execution role needs these permissions to read training data.

Why this answer

The correct answer is D because a SageMaker notebook instance needs read access to the S3 bucket containing training data. The execution role must have an IAM policy that explicitly allows s3:GetObject (to read objects) and s3:ListBucket (to list the bucket contents) on the specific bucket. Without these permissions, the notebook cannot retrieve the training data.

Exam trap

Cisco often tests the distinction between the permissions needed for data access (S3 actions) versus inference (SageMaker actions) or encryption (KMS actions), leading candidates to overcomplicate by adding unnecessary permissions like kms:Decrypt when the question does not mention encryption.

How to eliminate wrong answers

Option A is wrong because sagemaker:InvokeEndpoint is used to invoke a deployed SageMaker endpoint for inference, not to access S3 training data. Option B is wrong because while kms:Decrypt may be needed if the S3 bucket uses KMS encryption, it is not the primary permission required; the core need is s3:GetObject and s3:ListBucket. Option C is wrong because ec2:DescribeVpcs is used for VPC networking operations, not for S3 data access.

Full explanation →

168

MCQmedium

A healthcare organization uses an AI model to predict patient readmission risks. The model's predictions are used by doctors to allocate follow-up care. The organization wants to ensure compliance with responsible AI guidelines. Which practice best supports explainability?

A.Ensuring the model's overall accuracy exceeds 95%

B.Using a black-box ensemble model that achieves highest accuracy

C.Automating decisions without human review to reduce bias

D.Providing feature importance scores for each prediction

AnswerD

Feature importance helps stakeholders understand model reasoning, supporting explainability.

Why this answer

Providing feature importance scores helps doctors understand why the model made a specific prediction, aligning with explainability. Removing humans oversimplifies accountability. Using a black-box model reduces explainability.

Relying solely on accuracy ignores interpretability.

Full explanation →

169

MCQeasy

A company wants to use a pre-trained generative AI model to analyze customer feedback. They need to adjust the model for their specific domain without retraining from scratch. Which approach is MOST suitable?

A.Fine-tuning the model on domain-specific data

B.Reinforcement Learning from Human Feedback (RLHF)

C.Training a new model from scratch on the domain data

D.Using prompt engineering to provide context

AnswerA

Fine-tuning is efficient for domain adaptation using pre-trained models.

Why this answer

Fine-tuning is the most suitable approach because it takes a pre-trained generative AI model and updates its weights using a smaller, domain-specific dataset (e.g., customer feedback transcripts). This allows the model to adapt to the company's specific terminology, sentiment patterns, and context without the massive computational cost and data requirements of training from scratch. It preserves the general language understanding from pre-training while specializing the model for the target domain.

Exam trap

Cisco often tests the distinction between prompt engineering (a zero-shot or few-shot method that does not modify the model) and fine-tuning (which updates model weights), leading candidates to mistakenly choose prompt engineering as a simpler but insufficient solution for deep domain adaptation.

How to eliminate wrong answers

Option B (RLHF) is wrong because RLHF is a technique used to align model outputs with human preferences through reward modeling, not primarily for domain adaptation; it requires a separate reward model and human feedback loop, making it overkill and less direct for simply specializing on domain-specific data. Option C (training a new model from scratch) is wrong because it discards the benefits of pre-training, requiring enormous amounts of domain data and compute resources, which contradicts the requirement to avoid retraining from scratch. Option D (prompt engineering) is wrong because while it can provide context at inference time, it does not adjust the model's internal weights or permanently adapt it to the domain; it relies on the model's existing knowledge and may fail for nuanced or rare domain-specific terms.

Full explanation →

170

Multi-Selectmedium

A company is using Amazon Bedrock to generate images. They want to ensure the outputs comply with content policies. Which TWO AWS services can help? (Choose two.)

Select 2 answers

A.Amazon Augmented AI (A2I)

B.Amazon Rekognition

C.Amazon GuardDuty

D.AWS WAF

E.Amazon Comprehend

AnswersA, B

A2I enables human review of flagged images to enforce content policies.

Why this answer

Amazon Augmented AI (A2I) is correct because it enables human review of model predictions to ensure compliance with content policies. For image generation on Bedrock, A2I can route outputs that fall below a confidence threshold to human reviewers, providing a safety net for policy adherence.

Exam trap

AWS often tests the distinction between services that analyze images (Rekognition) versus text (Comprehend), and candidates may mistakenly choose Comprehend for image content moderation without recognizing it is NLP-only.

Full explanation →

171

MCQeasy

A data scientist wants to detect potential bias in a binary classification model before deployment. Which AWS service can analyze the model's predictions across different demographic groups?

A.Amazon SageMaker Ground Truth

B.Amazon CloudWatch Logs Insights

C.Amazon SageMaker Clarify

D.Amazon SageMaker Model Monitor

AnswerC

SageMaker Clarify is specifically designed for bias detection and explainability.

Why this answer

Option A is correct because Amazon SageMaker Clarify provides built-in bias metrics and explainability analysis for machine learning models. Options B, C, D are incorrect: SageMaker Model Monitor is for post-deployment monitoring; SageMaker Ground Truth is for labeling; CloudWatch Logs is for logging.

Full explanation →

172

MCQhard

A company wants to deploy a real-time inference endpoint for a custom model on SageMaker. The model has high latency (100ms) and they need to handle variable traffic with spikes. Which deployment strategy is most cost-effective?

A.Deploy on a SageMaker multi-model endpoint

B.Use batch transform

C.Deploy on a single SageMaker endpoint with automatic scaling

D.Use a single large instance type

AnswerC

Automatic scaling adds instances based on load, providing cost-effective handling of variable traffic.

Why this answer

Option C is correct because a single SageMaker endpoint with automatic scaling allows the endpoint to dynamically adjust the number of instances based on traffic patterns, handling variable traffic and spikes cost-effectively. For a model with 100ms latency, automatic scaling can add instances during spikes and remove them during low traffic, ensuring you only pay for the compute resources you use while maintaining low inference latency.

Exam trap

The trap here is that candidates often confuse multi-model endpoints with cost-effective scaling for a single model, not realizing that multi-model endpoints are designed for hosting many models, not for handling variable traffic for one model with high latency.

How to eliminate wrong answers

Option A is wrong because a multi-model endpoint is designed to host multiple models on a shared instance to reduce costs, but it does not inherently handle high-latency models (100ms) well under variable traffic spikes, as it can lead to resource contention and increased latency. Option B is wrong because batch transform is an asynchronous, offline inference method for processing large datasets in batches, not suitable for real-time inference endpoints that require immediate responses. Option D is wrong because using a single large instance type is not cost-effective for variable traffic with spikes; you would over-provision for peak traffic and pay for idle capacity during low traffic, whereas automatic scaling adjusts resources dynamically.

Full explanation →

173

MCQeasy

A company is building a customer service chatbot using Amazon Bedrock. Which component of a foundation model determines the creativity and randomness of the generated responses?

A.Prompt template

B.Temperature

C.Max tokens

D.Top-p

AnswerB

Temperature scales the logits before softmax, controlling randomness. Lower values make outputs more deterministic.

Why this answer

The temperature parameter controls randomness. Higher values (e.g., >1) produce more creative but less focused outputs, while lower values (e.g., near 0) produce more deterministic responses.

Full explanation →

174

MCQhard

A healthcare startup is using Amazon Bedrock to generate clinical notes. They must prevent the model from outputting any personally identifiable information (PII) such as patient names. What is the most effective approach?

A.Fine-tune the model on de-identified data only

B.Configure a guardrail in Amazon Bedrock to deny PII topics

C.Use a prompt engineering technique to instruct the model to avoid PII

D.Post-process the output with a regex filter

AnswerB

Guardrails provide robust content filtering that can detect and block PII, making this the most effective approach.

Why this answer

Option B is correct because Amazon Bedrock guardrails provide content filtering that can deny PII topics and block sensitive information. Option A (prompt engineering) can be bypassed. Option C (fine-tuning on de-identified data) is costly and not guaranteed.

Option D (regex post-processing) is brittle and incomplete.

Full explanation →

175

MCQmedium

During model training, the loss decreases rapidly for the first few epochs and then plateaus. The validation loss starts increasing after some epochs. What should the team do to improve generalization?

A.Increase learning rate

B.Early stopping

C.Increase training epochs

D.Add more layers

AnswerB

Early stopping stops training before overfitting occurs.

Why this answer

The validation loss increasing while training loss continues to decrease is a classic sign of overfitting. Early stopping (Option B) halts training when validation performance stops improving, preventing the model from memorizing noise in the training data and thereby improving generalization.

Exam trap

Cisco often tests the misconception that overfitting is solved by increasing model complexity or training longer, when in fact the opposite is true—early stopping or regularization techniques are required to curb overfitting.

How to eliminate wrong answers

Option A is wrong because increasing the learning rate would cause larger weight updates, potentially overshooting minima and destabilizing training, which does not address overfitting. Option C is wrong because increasing training epochs would allow the model to continue fitting the training data even more closely, exacerbating overfitting rather than improving generalization. Option D is wrong because adding more layers increases model capacity, making it more prone to overfitting on the training data, not less.

Full explanation →

176

Multi-Selectmedium

Which TWO of the following are best practices for data preprocessing in machine learning? (Select TWO.)

Select 2 answers

A.Use cross-validation to evaluate model performance

B.Always split data 80/20 for training and testing

C.One-hot encoding for ordinal categories

D.Feature scaling for gradient-based algorithms

E.Drop duplicate records only if they are manual entry errors

AnswersA, D

Cross-validation provides a more reliable estimate of model generalization.

Why this answer

Cross-validation is a best practice for evaluating model performance because it provides a more robust estimate of how the model will generalize to unseen data by partitioning the data into multiple training and validation sets. This reduces the variance associated with a single train-test split and helps detect overfitting, making it a standard technique in machine learning workflows.

Exam trap

AWS often tests the misconception that one-hot encoding is universally applicable to all categorical data, but the trap here is that candidates forget ordinal categories have a natural order that one-hot encoding discards, leading to loss of information and potentially worse model performance.

Full explanation →

177

MCQmedium

A multinational corporation uses a foundation model via Amazon Bedrock to translate internal communication documents from English to multiple languages. They notice that the translations often miss company-specific jargon and acronyms, leading to confusion. The company has a glossary of approved translations for terms like 'Project Atlas' and 'Operation Synergy.' They want to improve translation accuracy quickly and with minimal effort. What approach should they take?

A.Use prompt engineering to include the glossary in each translation request.

B.Use a larger foundation model that has better language understanding.

C.Fine-tune the foundation model on a corpus of bilingual company documents.

D.Switch to Amazon Translate with custom terminology.

AnswerA

Including the glossary in the prompt directly informs the model of the correct translations.

Why this answer

Option B is correct because including the glossary in the prompt is a simple and effective method: the model can use the provided translations for specific terms. Option A (fine-tuning) requires data preparation and training. Option C (Amazon Translate with custom terminology) is a different service not using the FM.

Option D (larger model) may not address specific jargon.

Full explanation →

178

MCQmedium

A developer is using Amazon Bedrock to generate text summaries. The output sometimes includes irrelevant information. What is the most effective prompt engineering technique to improve relevance?

A.Add a negative prompt specifying what to avoid

B.Use few-shot examples with summaries

C.Increase max tokens

D.Decrease temperature

AnswerB

Few-shot examples show the model desired output patterns, improving relevance.

Why this answer

Few-shot examples provide the model with explicit patterns of desired output, directly guiding it to produce summaries that match the format and content of the examples. This technique is the most effective for improving relevance because it gives the model concrete reference points, reducing the likelihood of including irrelevant information.

Exam trap

AWS often tests the misconception that adjusting generation parameters (like temperature or token limits) can substitute for explicit prompt structure, when in fact few-shot examples directly teach the model the expected output format and content relevance.

How to eliminate wrong answers

Option A is wrong because negative prompts (e.g., 'avoid irrelevant details') are less reliable in foundation models; they can be ignored or misinterpreted, and they do not provide the structured guidance that few-shot examples offer. Option C is wrong because increasing max tokens only expands the output length, which can actually increase the chance of including irrelevant information rather than improving relevance. Option D is wrong because decreasing temperature reduces randomness but does not teach the model what relevant content looks like; it may still produce irrelevant information if the prompt lacks clear examples.

Full explanation →

179

MCQmedium

A machine learning team needs to share a trained model with multiple teams across different AWS accounts. The model artifacts are stored in an S3 bucket in the central account. What is the most secure way to grant cross-account read access to the model artifacts?

A.Use an S3 bucket policy that grants access to the root user of each target account.

B.Make the S3 bucket public.

C.Use S3 cross-region replication to copy the artifacts to each target account's bucket.

D.Use AWS KMS to encrypt the artifacts and share the KMS key with the target accounts, then use bucket policies and IAM roles in the target accounts.

AnswerD

This ensures least privilege and encryption in transit and at rest.

Why this answer

Option D is correct because it implements a defense-in-depth approach: AWS KMS encrypts the model artifacts at rest, and cross-account access is granted by combining an S3 bucket policy that allows the target accounts' IAM roles to read the objects, with those roles assuming the necessary permissions. This ensures that only authenticated and authorized IAM principals in the target accounts can decrypt and access the artifacts, preventing unauthorized access even if the bucket policy is misconfigured.

Exam trap

The trap here is that candidates often assume S3 bucket policies alone are sufficient for cross-account access, overlooking the need for KMS key policies and IAM role permissions when encryption is involved, which is a common real-world requirement for securing sensitive ML artifacts.

How to eliminate wrong answers

Option A is wrong because granting access to the root user of each target account is overly permissive and violates the principle of least privilege; root users have unrestricted access and cannot be audited per specific IAM role or user. Option B is wrong because making the S3 bucket public exposes the model artifacts to anyone on the internet, completely bypassing authentication and authorization, which is insecure for sensitive ML artifacts. Option C is wrong because S3 cross-region replication only copies objects to another bucket; it does not grant cross-account read access to the original bucket, and the replicated objects would still require separate permissions in the target account, adding complexity without solving the access control problem.

Full explanation →

180

MCQhard

Refer to the exhibit. A data scientist runs SageMaker Clarify on a training dataset and receives the above JSON output. Which bias metric exceeds its threshold?

A.DPL

B.Label Imbalance

C.Class Imbalance

D.All metrics exceed thresholds

AnswerA

DPL (Difference in Positive Proportions in Predicted Labels) exceeds threshold.

Why this answer

The DPL value of 0.15 is greater than the threshold of 0.1, indicating significant bias. Class imbalance (0.3) and label imbalance (0.4) are below their respective thresholds.

Full explanation →

181

MCQmedium

A retail company uses a machine learning model to forecast daily product demand. The model is a time series model that uses historical sales data. The model has been performing well, but recently the forecasts have been consistently too low, leading to stockouts. The data scientist notices that the model was trained on data up to last year, and the company has since launched a successful marketing campaign that increased sales by 20%. The data scientist needs to update the model to reflect the new sales patterns. Which approach should the data scientist take?

A.Add a feature for the marketing campaign and continue using the old model.

B.Switch to a different model type, such as ARIMA, without retraining.

C.Multiply the model's predictions by 1.2 to account for the marketing campaign.

D.Retrain the model on the most recent data that includes the sales from the marketing campaign.

AnswerD

Retraining on recent data allows the model to learn the new sales pattern and improve forecasts.

Why this answer

Option D is correct because retraining the model on the most recent data that includes the sales from the marketing campaign allows the model to learn the new underlying pattern in the time series. Since the model is a time series model, it relies on historical patterns to make forecasts; retraining on data that captures the 20% sales lift ensures the model adapts to the new demand level, reducing the persistent underforecasting.

Exam trap

The trap here is that candidates may think a simple multiplicative adjustment (Option C) is sufficient, but Cisco tests the understanding that time series models must be retrained on the new distribution to maintain forecast accuracy, as static adjustments ignore changes in the underlying data-generating process.

How to eliminate wrong answers

Option A is wrong because simply adding a feature for the marketing campaign without retraining the model does not update the model's learned parameters; the old model's weights are still based on pre-campaign data, so it cannot properly incorporate the new pattern. Option B is wrong because switching to a different model type like ARIMA without retraining means the new model has no learned parameters from the data, and ARIMA requires fitting to the specific time series to estimate its autoregressive and moving average components. Option C is wrong because multiplying predictions by a constant factor (1.2) assumes a uniform multiplicative effect that may not hold across all products or time periods, and it does not address potential changes in seasonality, trend, or other dynamics introduced by the campaign.

Full explanation →

182

Multi-Selecthard

A data scientist is fine-tuning a foundation model on SageMaker. They want to prevent overfitting. Which THREE actions can help? (Select THREE.)

Select 3 answers

A.Apply dropout

B.Increase training data size

C.Increase the number of epochs

D.Use early stopping

E.Use a smaller learning rate

AnswersA, B, D

Dropout prevents co-adaptation of neurons.

Why this answer

Option A is correct because dropout is a regularization technique that randomly drops a fraction of neurons during training, which prevents the model from relying too heavily on specific features and reduces overfitting. In SageMaker, dropout can be applied via framework-specific APIs (e.g., `tf.keras.layers.Dropout` in TensorFlow) or by configuring the model architecture in the training script.

Exam trap

AWS often tests the misconception that increasing epochs or using a smaller learning rate directly prevents overfitting, when in fact these are hyperparameter tuning strategies that can exacerbate or fail to address overfitting without explicit regularization.

Full explanation →

183

MCQhard

Refer to the exhibit. A team is configuring a SageMaker Model Bias job. The baseline job has been completed. However, the bias job fails with a resource not found error. What is the most likely cause?

A.The StoppingCondition is too short

B.The BaseliningJobName is incorrect

C.The instance type ml.m5.large is not supported

D.The IAM role lacks permissions to DescribeBaselineJob

AnswerB

Typo or mismatch in the baseline job name.

Why this answer

Option C is correct: The BaseliningJobName must exactly match the name of the completed baseline job. Option A is wrong: IAM permissions would give a different error. Option B is wrong: ml.m5.large is supported.

Option D is wrong: Runtime is sufficient.

Full explanation →

184

MCQhard

A financial services company uses a foundation model for document analysis. They need to ensure the model does not output sensitive customer information from its training data. What is the most effective mitigation?

A.Implement output filtering using an external service

B.Choose a model that has been fine-tuned on financial data

C.Apply data masking before sending input

D.Use a private endpoint

AnswerA

Output filtering can scan and block responses containing sensitive data.

Why this answer

Output filtering using an external service is the most effective mitigation because it acts as a post-processing layer that can detect and redact sensitive customer information (e.g., PII, account numbers) before the model's response is returned to the user. This approach does not rely on the model's internal training or input modifications, which can be bypassed or incomplete. It provides a robust, policy-driven control that can be updated independently of the model.

Exam trap

The trap here is that candidates often confuse input-side controls (like data masking or fine-tuning) with output-side controls, assuming that protecting the input or training the model on domain data is sufficient to prevent leakage of memorized sensitive information.

How to eliminate wrong answers

Option B is wrong because fine-tuning on financial data does not guarantee the model will not memorize or regurgitate sensitive customer information from its original training data; fine-tuning adjusts the model's behavior but does not erase existing memorized data. Option C is wrong because data masking before sending input only protects the input data, not the model's outputs; the model could still output sensitive information from its training data that was never masked. Option D is wrong because using a private endpoint secures the network connection and access control but does not prevent the model from generating outputs containing sensitive training data; it addresses data-in-transit security, not output content safety.

Full explanation →

185

MCQhard

A company is building a multi-step AI agent using Amazon Bedrock Agents to automate a complex business process that requires memory across interactions. The agent needs to remember user preferences and previous steps. Which approach best maintains state across sessions?

A.Store state in AWS Lambda environment variables

B.Use Amazon DynamoDB tables managed by Lambda functions

C.Implement state management using AWS Step Functions

D.Leverage Amazon Bedrock Agents' built-in memory feature

AnswerD

Bedrock Agents provide session memory out-of-the-box, automatically persisting context across turns.

Why this answer

Amazon Bedrock Agents have built-in memory management (session memory) that automatically persists context across interactions, using a backend like DynamoDB. Lambda would require custom state management. Step Functions orchestrate but don't inherently store memory.

DynamoDB alone lacks the agent logic.

Full explanation →

186

MCQeasy

A financial institution uses a machine learning model to approve loan applications. The model is trained on historical data that includes biased lending practices. What is the most effective first step to address potential bias?

A.Immediately deploy the model and monitor for biased outcomes

B.Retrain the model with synthetic data generated from the original dataset

C.Remove all demographic features from the model

D.Audit the training data for bias and review feature selection

AnswerD

Auditing data and features is the foundational step to identify and mitigate bias.

Why this answer

Before deploying, it is critical to evaluate the training data for bias. Auditing the data for representation and fairness helps identify and mitigate bias early. Post-deployment monitoring is secondary, and simply retraining without review may perpetuate bias.

Excluding demographic data might ignore important fairness dimensions.

Full explanation →

187

MCQmedium

A company is building a customer support chatbot using Amazon Bedrock. They have a large corpus of internal documentation and want to provide accurate answers without retraining the model. Which approach should they use?

A.Use advanced prompt engineering with a generic foundation model.

B.Use Retrieval Augmented Generation (RAG) with Amazon Bedrock Knowledge Base.

C.Use a pre-trained foundation model without customization.

D.Fine-tune the model on their documentation using Amazon SageMaker.

AnswerB

RAG retrieves relevant documents at inference time, providing accurate answers from internal data without retraining.

Why this answer

Option C is correct because Retrieval Augmented Generation (RAG) with Bedrock Knowledge Base allows the model to retrieve relevant documents from internal sources and generate grounded answers, avoiding the need for fine-tuning. Option A is wrong because a pre-trained model alone lacks domain knowledge. Option B is wrong because fine-tuning requires labeled data and is more costly.

Option D is wrong because prompt engineering alone cannot incorporate proprietary data effectively.

Full explanation →

188

MCQeasy

A company is using Amazon Bedrock to deploy a foundation model. To comply with GDPR, they need to ensure that the model does not generate outputs containing personally identifiable information (PII). Which AWS service can best help detect and redact PII from the model's responses?

A.Amazon Comprehend

B.Amazon GuardDuty

C.Amazon Rekognition

D.AWS WAF

AnswerA

Amazon Comprehend has PII detection and redaction features.

Why this answer

Amazon Comprehend is the correct service because it provides a built-in PII detection and redaction capability that can be integrated with Amazon Bedrock. Using the `DetectPiiEntities` API, you can scan model responses for PII such as names, addresses, and credit card numbers, and then redact or mask those entities before returning the output to the user. This directly addresses the GDPR requirement to prevent PII leakage from generative AI outputs.

Exam trap

Cisco often tests the distinction between security monitoring services (GuardDuty, WAF) and content analysis services (Comprehend), leading candidates to mistakenly choose a network-level security tool for a data content problem.

How to eliminate wrong answers

Option B is wrong because Amazon GuardDuty is a threat detection service that monitors for malicious activity and unauthorized behavior in AWS accounts and workloads, not a service for detecting or redacting PII in text content. Option C is wrong because Amazon Rekognition is an image and video analysis service that can detect faces, objects, and text in media, but it does not provide PII detection or redaction for text-based model responses. Option D is wrong because AWS WAF is a web application firewall that protects against common web exploits like SQL injection and cross-site scripting, and it has no capability to analyze or redact PII from AI model outputs.

Full explanation →

189

MCQhard

A data scientist wants to perform automatic model tuning (hyperparameter optimization) on SageMaker. They need to find the best hyperparameters for a gradient boosting model. Which strategy is BEST for this task?

A.Random search

B.Grid search

C.Exhaustive search

D.Bayesian optimization

AnswerD

Uses a probabilistic model to select hyperparameters, achieving better results with fewer iterations.

Why this answer

Bayesian optimization is the best strategy for automatic model tuning on SageMaker because it builds a probabilistic model of the objective function and uses it to select the most promising hyperparameters to evaluate next. This approach is far more sample-efficient than random or grid search, making it ideal for expensive-to-evaluate models like gradient boosting, where each training run consumes significant time and compute resources.

Exam trap

Cisco often tests the misconception that exhaustive or grid search is the most thorough and therefore best approach, but the trap is that they ignore the practical constraints of compute cost and time, making Bayesian optimization the superior choice for automatic model tuning in SageMaker.

How to eliminate wrong answers

Option A is wrong because random search, while better than grid search for high-dimensional spaces, does not use past evaluation results to inform future trials, making it less efficient than Bayesian optimization for finding optimal hyperparameters. Option B is wrong because grid search exhaustively evaluates all combinations of a predefined set of hyperparameter values, which is computationally prohibitive for gradient boosting models with many continuous hyperparameters and does not scale well. Option C is wrong because exhaustive search is essentially a synonym for grid search and suffers from the same curse of dimensionality, making it impractical for hyperparameter optimization in SageMaker's automatic model tuning context.

Full explanation →

190

MCQhard

A startup is fine-tuning a large language model (LLM) for code generation using Amazon SageMaker. They are using a p4d.24xlarge instance with a single GPU. The training process is extremely slow, taking over 48 hours for one epoch. The dataset is 10GB of code snippets. The company needs to iterate quickly. Which action would most significantly reduce training time without sacrificing model quality?

A.Enable distributed training using SageMaker’s data parallelism library across multiple GPUs

B.Switch to spot instances to reduce cost, not time

C.Increase the batch size to use GPU memory more efficiently

D.Use a smaller foundation model to reduce compute per step

AnswerA

Distributed training scales across GPUs/nodes, significantly speeding up training while preserving model size.

Why this answer

Distributed training across multiple GPUs and instances dramatically reduces time by parallelizing the workload. Increasing instance count or using a smaller model helps but may not be optimal. Spot instances could be unstable.

Data parallelism is a standard technique for large models.

Full explanation →

191

Multi-Selecthard

Which THREE are best practices for ensuring generated content complies with corporate brand guidelines when using Amazon Bedrock?

Select 3 answers

A.Implement guardrails to restrict tone, topics, and language

B.Use prompt engineering to specify brand voice and style

C.Increase the temperature for more creative outputs

D.Use random prompts to test variability

E.Fine-tune the model on a dataset of brand-compliant content

AnswersA, B, E

Guardrails enforce content policies at inference time.

Why this answer

Option A is correct because Amazon Bedrock Guardrails allow you to define policies that restrict the model's output to specific tones, topics, and language, ensuring alignment with corporate brand guidelines. By configuring denied topics and content filters, you can prevent the model from generating off-brand or inappropriate content, directly enforcing compliance at the inference layer.

Exam trap

AWS often tests the misconception that increasing temperature or using random prompts can help enforce brand guidelines, when in fact these actions increase variability and reduce control, directly opposing the goal of compliance.

Full explanation →

192

Multi-Selecteasy

Which TWO of the following are key advantages of using Amazon Bedrock for building generative AI applications?

Select 2 answers

A.Automatic optimization of prompts for all models without user intervention.

B.Ability to fine-tune models using your own data without managing underlying infrastructure.

C.Eliminates the need for any data preprocessing before model invocation.

D.Guaranteed identical outputs from all models for the same prompt.

E.Access to multiple foundation models from different providers via a single API.

AnswersB, E

Correct. Bedrock provides managed fine-tuning capabilities, abstracting infrastructure.

Why this answer

Amazon Bedrock provides access to multiple foundation models from different providers via a single API, and it allows you to fine-tune models using your own data without managing infrastructure. Automatic prompt optimization is not a built-in feature; model outputs are not guaranteed to be identical; and data preprocessing is still required.

Full explanation →

193

MCQmedium

A company fine-tunes a foundation model on SageMaker using a custom dataset. They notice the training job takes too long. Which optimization technique is specifically designed to reduce training time for foundation models?

A.Distributed training using SageMaker Data Parallelism

B.Using a smaller instance type

C.Using Spot Instances

D.Reducing batch size

AnswerA

Data parallelism partitions the data and trains across multiple devices, reducing wall-clock time.

Why this answer

SageMaker Data Parallelism distributes the training workload across multiple GPUs or instances, splitting the data and synchronizing gradients using optimized all-reduce algorithms. This specifically reduces training time for large foundation models by enabling parallel computation, which is the most direct technique for accelerating training at scale.

Exam trap

AWS often tests the misconception that cost-saving techniques like Spot Instances or smaller instances also improve performance, but the question specifically asks for optimization to reduce training time, not cost.

How to eliminate wrong answers

Option B is wrong because using a smaller instance type reduces computational capacity, which would increase training time rather than reduce it. Option C is wrong because Spot Instances reduce cost by using spare AWS capacity, but they do not inherently speed up training; they may even cause interruptions that prolong total time. Option D is wrong because reducing batch size can actually slow convergence and increase the number of training steps, potentially increasing overall training time.

Full explanation →

194

Multi-Selectmedium

A company is using Amazon Bedrock to generate marketing content. They want to evaluate the quality of the generated text. Which TWO metrics are most appropriate for evaluating text quality?

Select 2 answers

A.Precision

B.Perplexity

C.Accuracy

D.F1 score

E.BLEU (Bilingual Evaluation Understudy)

AnswersB, E

Perplexity measures how well the model predicts the text.

Why this answer

Perplexity measures how well a language model predicts a sample, with lower values indicating higher confidence and coherence in generated text. BLEU evaluates the overlap between generated text and reference text, making it suitable for assessing fluency and relevance in content generation tasks like marketing copy.

Exam trap

AWS often tests the distinction between classification metrics (precision, accuracy, F1) and generation evaluation metrics (perplexity, BLEU), leading candidates to mistakenly apply classification concepts to text quality assessment.

Full explanation →

195

Multi-Selectmedium

A data science team is evaluating foundation models for a code generation task. They need a model that is fine-tuned for code and can be deployed on Amazon SageMaker. Which THREE criteria are important to consider when selecting a model?

Select 3 answers

A.Licensing and usage terms

B.Cost per token for inference

C.Context window length

D.The training algorithm used

E.Model size and architecture

AnswersA, C, E

Must be compatible with the deployment and business use.

Why this answer

Option A is correct because licensing and usage terms directly impact whether a foundation model can be legally used for commercial code generation. Models like Code Llama or StarCoder have specific licenses (e.g., Llama 2 Community License, OpenRAIL-M) that may restrict fine-tuning, redistribution, or use in proprietary products. Ignoring these terms could lead to compliance violations when deploying on Amazon SageMaker.

Exam trap

Cisco often tests the distinction between model selection criteria (licensing, context window, architecture) and operational concerns (cost, training internals), tempting candidates to pick cost per token or training algorithm as relevant when they are not primary factors for selecting a fine-tuned model for deployment.

Full explanation →

196

MCQhard

A financial services company needs to deploy a real-time fraud detection model with sub-100ms inference latency. The model is a large ensemble requiring 8 GB of memory per request. The workload has bursty traffic. Which Amazon SageMaker deployment strategy best meets these requirements?

A.Deploy behind an Application Load Balancer with multiple ml.m5.xlarge EC2 instances running the model

B.Use a single ml.r5.2xlarge instance with an auto-scaling policy based on CPU utilization

C.Use a SageMaker multi-model endpoint with ml.m5.large instances to cache multiple models

D.Use SageMaker asynchronous inference with a large batch size

AnswerB

A real-time endpoint with a large instance and auto-scaling handles bursty traffic and meets latency requirements.

Why this answer

Option B is correct because a single ml.r5.2xlarge instance provides 16 GB of memory, which can handle the 8 GB per request requirement, and SageMaker real-time endpoints with auto-scaling based on CPU utilization can dynamically adjust to bursty traffic while maintaining sub-100ms inference latency. This approach avoids the overhead of load balancers or multi-model caching that could introduce latency.

Exam trap

The trap here is that candidates may assume multi-model endpoints (Option C) are suitable for large models, but they are designed for many small models sharing memory, not for a single large ensemble requiring 8 GB per request.

How to eliminate wrong answers

Option A is wrong because deploying behind an Application Load Balancer with multiple ml.m5.xlarge instances adds network hop latency and does not leverage SageMaker's native endpoint routing, potentially exceeding the sub-100ms requirement; also, m5.xlarge instances have only 8 GB of memory, which may not handle the 8 GB per request without memory contention. Option C is wrong because SageMaker multi-model endpoints are designed for serving multiple smaller models from a shared instance, not for a single large ensemble requiring 8 GB per request, and ml.m5.large instances have only 4 GB of memory, insufficient for the workload. Option D is wrong because SageMaker asynchronous inference is intended for non-real-time, large payloads with minutes of latency, not sub-100ms real-time fraud detection, and batching would increase latency beyond the requirement.

Full explanation →

197

MCQhard

A developer sends the above request to Amazon Bedrock with Anthropic Claude. The model returns a response that stops before reaching 500 tokens. What is the most likely reason?

A.The temperature is set too high

B.The model is not trained on this topic

C.The model reached a stop sequence

D.The token limit is exceeded

AnswerC

The model can stop early when it identifies a natural endpoint.

Why this answer

The model stopped before reaching 500 tokens because the request likely included a stop sequence (e.g., `\n\nHuman:` or a custom stop token) that matched the generated output. When a stop sequence is encountered, Bedrock immediately halts generation, even if the token limit has not been reached. This is the most direct explanation for a premature stop.

Exam trap

AWS often tests the distinction between a stop sequence and a token limit; the trap here is that candidates confuse a premature stop with exceeding the token limit, but a stop sequence causes an early halt while a token limit would cause truncation at the limit.

How to eliminate wrong answers

Option A is wrong because a high temperature increases randomness and can cause the model to generate more tokens or diverge, not stop early. Option B is wrong because Bedrock's Claude models are trained on a broad corpus and can generate responses on any topic; lack of training would produce low-quality or repetitive text, not a stop before the token limit. Option D is wrong because if the token limit were exceeded, the model would truncate the response at the limit, not stop before reaching it.

Full explanation →

198

MCQhard

Refer to the exhibit. A user tries to create a training job that reads data from a bucket named 'my-bucket'. The job fails with an access denied error. What is the most likely cause?

A.The s3:GetObject action is restricted to a specific bucket but the bucket ARN is incorrect

B.The sagemaker:CreateTrainingJob action is not allowed for the specific instance type

C.The s3:GetObject permission is missing the bucket ARN for listing objects

D.The IAM role does not have permission to write to the bucket

AnswerC

Without s3:ListBucket permission, the training job may fail when trying to list objects at the bucket level.

Why this answer

Option C is correct because the error 'access denied' when reading data from S3 typically indicates that the IAM role used by SageMaker lacks the necessary permissions. Specifically, the s3:GetObject action requires the bucket ARN in the resource element of the policy to allow listing and retrieving objects. Without the bucket ARN, the permission is incomplete, leading to the access denied error.

Exam trap

Cisco often tests the distinction between missing permissions (s3:GetObject) and incorrect ARN formatting, leading candidates to mistakenly choose Option A when the real issue is a missing bucket ARN in the resource element.

How to eliminate wrong answers

Option A is wrong because the s3:GetObject action being restricted to a specific bucket with an incorrect ARN would cause a different error (e.g., invalid ARN), not an access denied error; the issue is missing permissions, not incorrect ARN. Option B is wrong because the sagemaker:CreateTrainingJob action is not restricted by instance type in IAM policies; instance type restrictions are handled by service quotas or account limits, not IAM permissions. Option D is wrong because the error is about reading data (s3:GetObject), not writing; the IAM role does not need write permission to read from the bucket.

Full explanation →

199

MCQhard

A company deploys a deep learning model for image classification using Amazon SageMaker. They are concerned about adversarial attacks that could misclassify images with small perturbations. Which of the following is the most effective approach to improve model robustness?

A.Reduce training data size

B.Use early stopping during training

C.Apply adversarial training

D.Increase model complexity

AnswerC

Adversarial training includes perturbed examples in the training set, teaching the model to resist small changes.

Why this answer

Adversarial training incorporates adversarial examples during training, which is the most proven method to improve robustness against such attacks. Increasing model complexity, early stopping, or reducing data are not effective or may harm performance.

Full explanation →

200

MCQhard

A developer is using Amazon Bedrock's Converse API to build a multi-turn conversation. They notice the model forgets earlier context after a few exchanges. What is the most likely cause?

A.The API has a rate limit that truncates history

B.The model's context window is too small for the conversation

C.The model's maximum output length is set too low

D.The developer is not sending the previous messages in each request

AnswerD

The Converse API requires the client to maintain and send conversation history.

Why this answer

The API requires passing the message history explicitly; if not, context is lost. Option A (Rate limits) affect throughput. Option B (Model capacity) is not typical.

Option D (Output length) truncates responses.

Full explanation →

201

Multi-Selecteasy

Which TWO techniques provide interpretability for machine learning models at a local (per-prediction) level? (Choose two.)

Select 2 answers

A.SHAP values

B.Partial dependence plots

C.Confusion matrix

D.LIME

E.Permutation feature importance

AnswersA, D

SHAP provides local explanations based on Shapley values.

Why this answer

Options A and C are correct. LIME and SHAP are local interpretability methods. Option B (Permutation importance) is global.

Option D (Partial dependence) is global. Option E (Confusion matrix) is a performance metric, not interpretability.

Full explanation →

202

MCQmedium

A company wants to build a customer service chatbot using a foundation model. The chatbot must respond in under 2 seconds and handle high throughput. Which model deployment option should they choose?

A.Amazon Bedrock on-demand

B.Amazon SageMaker real-time endpoint hosting a foundation model

C.Amazon Lex with a pre-built foundation model

D.Amazon Bedrock provisioned throughput

AnswerD

Provisioned throughput reserves capacity and ensures consistent low latency.

Why this answer

Option B is correct because Amazon Bedrock provisioned throughput guarantees a set number of tokens per minute with low latency, meeting the response time requirement. Option A (on-demand) may have cold starts. Option C (SageMaker real-time) may not be optimized for FMs and requires more management.

Option D (Lex with pre-built FM) may not have the required flexibility.

Full explanation →

203

MCQhard

A company uses Amazon SageMaker for model training. To comply with data residency requirements, they must ensure that the training data never leaves a specific AWS region. However, during training, the SageMaker service might use resources in other regions for auto-scaling. Which configuration should they use to enforce data residency?

A.Configure the training job to use only local spot instances and enable network isolation.

B.Use Amazon SageMaker's inter-container traffic encryption and disable cross-region data transfer.

C.Use AWS Organizations to create an SCP that denies access to SageMaker resources in other regions.

D.Use a VPC with a VPC endpoint for SageMaker and restrict the training job to use only local resources.

AnswerC

SCPs can explicitly deny SageMaker actions in non-compliant regions, enforcing data residency.

Why this answer

Option C is correct because AWS Organizations Service Control Policies (SCPs) can explicitly deny access to SageMaker resources in any region outside the allowed one. By attaching an SCP that denies `sagemaker:*` actions when the `aws:RequestedRegion` condition key does not match the permitted region, the company can enforce data residency at the account level, preventing SageMaker from provisioning resources in other regions even if auto-scaling would otherwise trigger cross-region activity.

Exam trap

The trap here is that candidates often assume VPC endpoints or network isolation are sufficient to enforce regional boundaries, but they do not control the SageMaker control plane's ability to launch resources in other regions; only an SCP or IAM policy with a region condition can enforce that restriction at the API level.

How to eliminate wrong answers

Option A is wrong because using local spot instances and network isolation only restricts the instance type and network access; it does not prevent SageMaker from launching training resources in other regions for auto-scaling or data processing. Option B is wrong because inter-container traffic encryption secures data in transit between containers but does not control the geographic location of the compute resources; disabling cross-region data transfer is not a configurable SageMaker setting. Option D is wrong because a VPC with a VPC endpoint for SageMaker restricts network traffic to the service endpoint within the VPC, but SageMaker can still launch training jobs in other regions if the training job configuration or service backend decides to use resources outside the local region; the VPC endpoint does not enforce regional boundaries on SageMaker's resource provisioning.

Full explanation →

204

MCQmedium

A company is using Amazon Textract to extract text from scanned documents stored in an S3 bucket. The security team requires that all access to the documents be logged and that the documents be encrypted at rest using a customer-managed key. What should the company do to meet these requirements?

A.Use S3 default encryption and enable Textract logging

B.Enable S3 server-side encryption with AWS KMS (SSE-KMS) and enable CloudTrail data events for the S3 bucket

C.Enable S3 server access logs and use S3 SSE-KMS

D.Use S3 server-side encryption with S3-managed keys (SSE-S3) and enable S3 access logs

AnswerB

SSE-KMS provides encryption with customer-managed keys; CloudTrail data events log access to objects.

Why this answer

Option B is correct because enabling S3 server-side encryption with AWS KMS (SSE-KMS) satisfies the requirement for encryption at rest using a customer-managed key, and enabling CloudTrail data events for the S3 bucket captures all access to the documents (including GetObject, PutObject, etc.) for logging. This combination meets both security requirements precisely.

Exam trap

The trap here is that candidates often confuse S3 server access logs with CloudTrail data events, assuming both provide equivalent logging, but only CloudTrail data events offer reliable, real-time, and comprehensive object-level access logging required for security audits.

How to eliminate wrong answers

Option A is wrong because S3 default encryption uses either SSE-S3 or SSE-KMS, but it does not specify customer-managed keys, and enabling Textract logging only logs Textract API calls, not S3 data access events. Option C is wrong because S3 server access logs provide access logs but are delivered on a best-effort basis with potential delays and do not capture all API-level details like CloudTrail data events; also, while SSE-KMS is used, the logging mechanism is insufficient for comprehensive audit requirements. Option D is wrong because SSE-S3 uses AWS-managed keys, not customer-managed keys, and S3 access logs are not as granular or reliable as CloudTrail data events for logging all access.

Full explanation →

205

Multi-Selectmedium

Which THREE are key capabilities of Amazon Bedrock? (Choose 3)

Select 3 answers

A.Automatic model selection based on use case

B.Model customization through fine-tuning

C.Guardrails to filter harmful content

D.Serverless inference for foundation models

E.Built-in vector database for knowledge bases

AnswersB, C, D

Bedrock supports fine-tuning for Amazon Titan and other models.

Why this answer

Bedrock offers serverless inference (A), model customization (B), and guardrails for content filtering (D). Built-in vector database (C) is not a Bedrock capability; Bedrock integrates with external vector stores. Auto model selection (E) is not available; users choose models explicitly.

Full explanation →

206

Multi-Selecthard

A data science team is fine-tuning a foundation model on Amazon SageMaker. Which THREE steps are part of the best practice? (Choose three.)

Select 3 answers

A.Increase model size to improve performance.

B.Monitor for catastrophic forgetting during fine-tuning.

C.Use early stopping to prevent overfitting.

D.Deploy the model to production immediately after fine-tuning.

E.Use a diverse dataset representing various scenarios.

AnswersB, C, E

Catastrophic forgetting can cause loss of original capabilities; monitoring helps adjust training.

Why this answer

Option B is correct because catastrophic forgetting is a known risk when fine-tuning foundation models, where the model loses previously learned knowledge while adapting to new data. Monitoring for this during fine-tuning on SageMaker allows the team to detect performance degradation on the original task and adjust training accordingly, ensuring the model retains its general capabilities.

Exam trap

AWS often tests the misconception that fine-tuning always requires a larger model or immediate deployment, while the real best practices focus on validation, monitoring, and data diversity to maintain model robustness.

Full explanation →

207

MCQmedium

An e-commerce company uses Amazon Bedrock to generate product descriptions. They notice the descriptions are too long and contain repetitive phrases. Which parameter adjustment can help?

A.Increase frequency penalty

B.Increase temperature

C.Increase top_p

D.Decrease presence penalty

AnswerA

Frequency penalty reduces the likelihood of repeating tokens.

Why this answer

Increasing the frequency penalty reduces the likelihood of the model repeating the same phrases or tokens, directly addressing the issue of repetitive language in generated product descriptions. This parameter penalizes tokens that have already appeared in the text, encouraging more diverse output and naturally shortening overly long descriptions by avoiding redundant loops.

Exam trap

AWS often tests the distinction between frequency penalty and presence penalty, where candidates confuse 'penalizing repetition' with 'reducing randomness' and incorrectly choose temperature or top_p adjustments.

How to eliminate wrong answers

Option B is wrong because increasing temperature makes the model more random and creative, which could actually worsen verbosity and introduce more irrelevant phrases rather than reducing repetition. Option C is wrong because increasing top_p (nucleus sampling) expands the set of possible next tokens, which may increase diversity but does not specifically penalize repeated tokens and can still produce long, repetitive text. Option D is wrong because decreasing presence penalty would reduce the penalty for tokens that have already appeared, making the model more likely to repeat itself, which is the opposite of what is needed.

Full explanation →

208

MCQmedium

A research organization is using Amazon SageMaker Studio to collaborate on building machine learning models. The security policy requires that all data and code remain within a VPC and cannot be accessed from the public internet. Additionally, the organization wants to enforce that only approved base images are used for the Studio environment. How should the organization configure SageMaker Studio to meet these requirements?

A.Use an AWS Transit Gateway to connect the VPC, and enforce HTTPS for all traffic.

B.Configure the Studio domain in VPC Only mode and use a SageMaker Studio lifecycle configuration to restrict the list of available base images to those in a private Amazon ECR repository.

C.Configure the Studio domain to disable direct internet access, and let users choose any base image from the SageMaker public registry.

D.Use AWS CloudFormation to create a Studio domain in a VPC and rely on individuals to use approved images.

AnswerB

VPC Only ensures no public internet access, and lifecycle configurations can restrict images to approved ones from a private ECR.

Why this answer

Option B is correct because configuring the SageMaker Studio domain in VPC Only mode ensures that all data and code remain within the VPC and cannot be accessed from the public internet. Additionally, using a lifecycle configuration script to restrict available base images to those in a private Amazon ECR repository enforces the policy that only approved base images are used, as the lifecycle configuration can modify the Jupyter Server settings to limit the image registry.

Exam trap

The trap here is that candidates often confuse disabling direct internet access with fully restricting image choices, not realizing that without a lifecycle configuration to filter the image registry, users can still select any public SageMaker image that is available within the VPC's network scope.

How to eliminate wrong answers

Option A is wrong because AWS Transit Gateway is used to connect multiple VPCs or on-premises networks, not to enforce VPC-only access for SageMaker Studio, and enforcing HTTPS for all traffic does not restrict base images or prevent public internet access. Option C is wrong because disabling direct internet access alone does not restrict users from choosing any base image from the SageMaker public registry; it only prevents outbound internet traffic, but users could still select public images that are cached or accessible within the VPC. Option D is wrong because using AWS CloudFormation to create a Studio domain in a VPC does not enforce the use of approved images; relying on individuals to use approved images is a manual process that does not meet the security policy requirement.

Full explanation →

209

MCQmedium

A company runs a chatbot using a large language model on Amazon Bedrock. They notice high latency during peak hours. Which action would be MOST effective to reduce latency without degrading response quality?

A.Increase the number of concurrent invocations

B.Switch to a smaller model

C.Decrease the maxTokens parameter

D.Use Provisioned Throughput for model inference

AnswerD

Provisioned Throughput ensures reserved capacity, reducing latency variability.

Why this answer

Provisioned Throughput on Amazon Bedrock reserves dedicated capacity for model inference, ensuring consistent low latency even during peak hours. This eliminates the variability caused by resource contention in the on-demand tier, directly addressing high latency without altering model size or output quality.

Exam trap

AWS often tests the misconception that reducing model size or output length is the primary way to reduce latency, but the real bottleneck in peak-hour scenarios is often infrastructure contention, which Provisioned Throughput resolves without sacrificing quality.

How to eliminate wrong answers

Option A is wrong because increasing concurrent invocations without dedicated capacity can exacerbate resource contention, leading to throttling and higher latency. Option B is wrong because switching to a smaller model reduces response quality (e.g., lower accuracy or coherence), which degrades the chatbot's performance. Option C is wrong because decreasing maxTokens truncates responses, degrading output quality by cutting off reasoning or context, and does not address the root cause of latency from infrastructure contention.

Full explanation →

210

MCQhard

An insurance company uses a machine learning model to adjust premiums. During a review, the model is found to be penalizing customers based on zip codes correlated with racial demographics, leading to potential discrimination. Which combination of actions best addresses this fairness issue while maintaining business value?

A.Remove the zip code feature from the model and retrain

B.Re-engineer features to avoid proxies for protected attributes and rebalance training data

C.Continue using the model but add a disclaimer about potential bias

D.Replace the model with a simpler linear model

AnswerB

This addresses both the features and the data representation, mitigating bias comprehensively.

Why this answer

The most effective approach is to re-engineer features to remove proxy variables for protected attributes (like race) AND to rebalance the training data to ensure fair representation. Simply removing zip code may not eliminate proxies, and ignoring the issue is not acceptable. Using a different model without data changes may not solve the problem.

Full explanation →

211

MCQeasy

Which AWS service provides a fully managed experience for building generative AI applications with a variety of foundation models through a unified API?

A.AWS Lambda

B.Amazon SageMaker

C.Amazon Rekognition

D.Amazon Bedrock

AnswerD

Amazon Bedrock provides a unified API to access and manage foundation models from various providers.

Why this answer

Option D is correct because Amazon Bedrock is a fully managed service that offers a choice of foundation models from Amazon and other providers via a single API. Option A (SageMaker) is a broader ML platform. Option B (Lambda) is serverless compute.

Option C (Rekognition) is for image/video analysis.

Full explanation →

212

MCQeasy

A developer is using Amazon Bedrock to build a chatbot that answers customer queries. The chatbot must only respond based on the provided company documentation. Which approach best meets this requirement?

A.Use prompt engineering to instruct the model to only use documentation.

B.Use a RAG architecture with the company documentation as the knowledge base.

C.Fine-tune a foundation model on the company documentation.

D.Use a text classification model to filter responses.

AnswerB

RAG ensures responses are based on retrieved documents.

Why this answer

Option B is correct because Retrieval-Augmented Generation (RAG) architecture retrieves relevant chunks from the company documentation at query time and injects them into the prompt, ensuring the model's response is grounded solely in the provided documents. This approach prevents the model from relying on its internal training data or generating information outside the documentation, which is critical for a closed-domain chatbot.

Exam trap

Cisco often tests the distinction between prompt engineering and RAG, where candidates mistakenly believe a well-crafted prompt can fully control model behavior without a retrieval mechanism, overlooking the fact that foundation models inherently generate responses from their training data unless explicitly grounded via external knowledge retrieval.

How to eliminate wrong answers

Option A is wrong because prompt engineering alone cannot guarantee the model will ignore its pre-trained knowledge; the model may still hallucinate or use information not present in the documentation, as it has no mechanism to enforce retrieval of specific content. Option C is wrong because fine-tuning a foundation model on the company documentation embeds the data into the model's weights, which can lead to outdated or incomplete responses and does not allow dynamic retrieval of the latest documentation; it also risks overfitting and does not scale well with changing content. Option D is wrong because a text classification model filters responses after generation, but it cannot ensure the response is based on the documentation; it only labels or rejects outputs, which is insufficient for generating accurate, document-grounded answers.

Full explanation →

213

MCQhard

A team is using Amazon Bedrock with a Claude model and wants to ensure responses adhere to a specific output format such as JSON. Which technique should be applied?

A.Use a retrieval-augmented generation (RAG) approach

B.Attach a guardrail with a JSON schema

C.Include a system prompt with explicit formatting instructions

D.Customize the model with a JSON training dataset

AnswerC

A system prompt can instruct the model to output JSON or other formats.

Why this answer

System prompt instructions can enforce output format. Option A (Guardrails) is for safety. Option B (Retrieval augmentation) is for knowledge.

Option D (Model customization) is for fine-tuning.

Full explanation →

214

MCQeasy

A company wants to generate product descriptions from a few keywords without managing infrastructure. Which AWS service provides a serverless API for accessing foundation models?

A.Amazon Lex

B.Amazon SageMaker

C.Amazon Bedrock

D.Amazon Comprehend

AnswerC

Bedrock provides a serverless API to invoke foundation models directly.

Why this answer

Amazon Bedrock offers a serverless API to invoke foundation models. Option A (SageMaker) is for training and deployment, not serverless API. Option C (Comprehend) is for NLP analysis, not generative.

Option D (Lex) is for conversational interfaces.

Full explanation →

215

MCQhard

A company is building a chatbot that must provide accurate answers based on internal documents without retraining the model. Which approach should they use?

A.Reinforcement learning from human feedback (RLHF)

B.Fine-tuning the model on internal documents

C.Model distillation to a smaller model

D.Prompt engineering with retrieval-augmented generation (RAG)

AnswerD

RAG retrieves relevant documents at inference time, providing up-to-date answers.

Why this answer

RAG (Retrieval-Augmented Generation) retrieves relevant documents and augments the prompt, avoiding retraining. Option A (Fine-tuning) requires retraining. Option C (Distillation) reduces model size.

Option D (RLHF) is for aligning model behavior.

Full explanation →

216

MCQhard

A research lab uses Amazon SageMaker to train a deep learning model for medical diagnosis. They need to ensure the model's decisions are interpretable to clinicians. Which SageMaker feature provides local and global feature importance?

A.SageMaker Model Monitor

B.SageMaker Experiments

C.SageMaker Clarify

D.SageMaker Debugger

AnswerC

Clarify provides explainability metrics.

Why this answer

Option D is correct: SageMaker Clarify provides SHAP-based explanations for feature importance. Option A is wrong: Debugger is for debugging training. Option B is wrong: Model Monitor is for monitoring.

Option C is wrong: Experiments is for managing trials.

Full explanation →

217

MCQhard

A company operates a customer service platform that uses Amazon Bedrock with a foundation model to generate automated responses. The system has been in production for three months. Recently, customers have reported that responses are becoming repetitive and less relevant over time. The development team notices that the model's performance has degraded, especially for queries about newer products that were added after the initial deployment. The team currently uses a static prompt with a fixed knowledge base that was set up at launch. The model is invoked via the Bedrock API with standard settings. The team wants to improve response quality without incurring high costs or extensive re-engineering. What should the team do?

A.Increase the temperature parameter to 0.9 to introduce more randomness and reduce repetition.

B.Fine-tune the model every week on the latest customer interactions using Amazon SageMaker.

C.Switch to a larger foundation model to handle the increased complexity of new products.

D.Implement a feedback loop to periodically update the knowledge base with new product information and use a dynamic prompt that includes recent interactions.

AnswerD

Continuously updating the knowledge base and prompt keeps responses accurate and fresh.

Why this answer

Implementing a feedback loop to update the knowledge base with new product information and using a dynamic prompt that includes recent interactions will keep the model relevant and reduce repetition. This RAG approach is cost-effective.

Full explanation →

218

MCQmedium

A data engineer needs to ensure that all data uploaded to an S3 bucket for SageMaker training is automatically encrypted with a customer-managed key. Which S3 feature should they enable?

A.Object lock with compliance mode.

B.Default encryption with SSE-KMS using an AWS managed key.

C.Default encryption with SSE-KMS using a customer managed key (CMK).

D.Default encryption with SSE-S3.

AnswerC

CMK provides customer control over encryption keys.

Why this answer

Option C is correct because the requirement specifies a customer-managed key (CMK). Default encryption with SSE-KMS allows you to specify a CMK, ensuring all objects uploaded to the S3 bucket are automatically encrypted using that key. This satisfies the data engineer's need for control over the encryption key.

Exam trap

Cisco often tests the distinction between AWS managed keys and customer managed keys (CMKs) in SSE-KMS, where candidates mistakenly select an AWS managed key option when the question explicitly requires a customer-managed key.

How to eliminate wrong answers

Option A is wrong because Object Lock with compliance mode is designed to prevent object deletion or overwrites for a fixed retention period, not to enforce encryption. Option B is wrong because it uses an AWS managed key, not a customer-managed key as required. Option D is wrong because SSE-S3 uses Amazon S3-managed keys, which do not provide the customer control required by the scenario.

Full explanation →

219

MCQhard

A company uses Amazon Bedrock to generate product descriptions. They want to ensure the outputs consistently follow a specific brand tone (professional yet friendly). They have a small set of example descriptions (few-shot examples) but do not want to fine-tune the model. Which strategy best achieves consistent tone without modifying the base model?

A.Fine-tune the model on a dataset of product descriptions that exemplify the desired tone.

B.Use a system prompt that defines the brand tone and include few-shot examples in the prompt.

C.Implement prompt chaining by breaking the task into multiple steps, each with its own prompt.

D.Use retrieval-augmented generation (RAG) to pull example descriptions from a database and prepend them to the prompt.

AnswerB

A system prompt sets the context and few-shot examples demonstrate the desired output style, guiding the model at inference time.

Why this answer

Using a system prompt and few-shot examples in the prompt template (Option D) provides explicit guidance to the model at inference time, shaping the tone without any model updates. Option A (retrieval-augmented generation) is for incorporating external knowledge, not tone. Option B (prompt chaining) adds complexity and may not directly enforce tone.

Option C (fine-tuning) requires modifying model weights, which is not desired.

Full explanation →

220

MCQhard

A healthcare company is developing a machine learning model using Amazon SageMaker to process protected health information (PHI). They have strict security requirements to comply with HIPAA. The company has implemented the following measures: All training data is stored in an S3 bucket with server-side encryption using AWS KMS. The data is accessed exclusively through Amazon SageMaker notebooks running in a private VPC with no internet access. The VPC has a VPC endpoint for S3 to ensure traffic stays within the AWS network. Despite these measures, a recent security audit discovered that the S3 bucket containing the training data is accessible from an IP address outside the company's network. Upon investigation, it was found that the bucket policy allowed access from any IP address due to a misconfigured bucket policy. Which corrective action should the company take to prevent this issue from recurring?

A.Add a condition in the bucket policy to restrict access to the VPC endpoint ID.

B.Move the training data to an EBS volume attached to the SageMaker notebook.

C.Remove the bucket policy and rely solely on IAM policies to control access.

D.Disable S3 Block Public Access settings to allow more granular control.

AnswerA

This ensures that only requests from the VPC endpoint can access the bucket, blocking external IPs.

Why this answer

Option B is correct because adding a condition to the bucket policy that restricts access to the VPC endpoint ID (aws:SourceVpce) ensures that only traffic from the VPC endpoint can access the bucket, preventing external IP access. Option A is incorrect because removing the bucket policy entirely would still allow access if IAM roles are not properly restricted. Option C is incorrect because disabling Block Public Access would increase risk.

Option D is incorrect because moving data to EBS is not scalable and does not address the underlying policy issue.

Full explanation →

221

MCQeasy

A company needs to summarize thousands of customer reviews daily using a foundation model. The solution must minimize latency and cost while handling variable traffic. Which AWS service should they use?

A.Amazon SageMaker real-time endpoint

B.Amazon Comprehend

C.Amazon Lex

D.Amazon Bedrock with on-demand mode

AnswerD

Bedrock on-demand is serverless and cost-effective for variable workloads.

Why this answer

Option C is correct because Amazon Bedrock with on-demand mode provides serverless access to foundation models, paying only per token used, which minimizes cost and latency for variable traffic. Option A (Amazon Comprehend) is not built for custom summarization with FMs. Option B (Amazon SageMaker real-time endpoint) requires managing infrastructure and is not cost-effective for variable loads.

Option D (Amazon Lex) is for chatbots, not summarization.

Full explanation →

222

MCQmedium

A company uses Amazon Bedrock to generate marketing copy. The summaries are too verbose. Which parameter should be decreased to directly limit the length of the output?

A.max_tokens

B.temperature

C.top_p

D.frequency_penalty

AnswerA

max_tokens sets the maximum number of tokens in the response.

Why this answer

The `max_tokens` parameter directly controls the maximum number of tokens (words or subwords) in the generated output. By decreasing this value, you explicitly cap the length of the marketing copy, making it less verbose. This is the most direct way to limit output length in Amazon Bedrock and other LLM APIs.

Exam trap

The trap here is that candidates confuse parameters that affect output style (temperature, top_p, frequency_penalty) with the one that directly controls output length (max_tokens), leading them to pick a parameter that changes how the model writes rather than how much it writes.

How to eliminate wrong answers

Option B (temperature) is wrong because it controls the randomness of token selection, not the length of the output; lowering temperature makes output more deterministic but does not shorten it. Option C (top_p) is wrong because it sets a cumulative probability threshold for nucleus sampling, affecting diversity of token choices, not the total number of tokens generated. Option D (frequency_penalty) is wrong because it penalizes tokens based on their frequency in the generated text, reducing repetition but not directly limiting the overall length of the response.

Full explanation →

223

MCQmedium

A healthcare company is using Amazon Bedrock to summarize patient notes. The compliance team requires that no patient data is used to improve the underlying foundation model. Which configuration should the team choose?

A.Enable data encryption in transit and at rest.

B.Use a different foundation model from a different provider.

C.Disable model training data logging in the AWS console.

D.Configure a VPC endpoint for Amazon Bedrock.

AnswerC

This setting prevents prompts and completions from being used for model improvement.

Why this answer

Option C is correct because disabling model training data logging in the AWS console prevents Amazon Bedrock from using customer inference data to improve the underlying foundation model. This setting ensures compliance with the requirement that no patient data is used for model training, as Bedrock offers a specific toggle to opt out of data sharing for model improvement.

Exam trap

AWS often tests the misconception that encryption or network controls (like VPC endpoints) are sufficient for data privacy compliance, when the actual requirement is about preventing data usage for model improvement, which is a separate policy control.

How to eliminate wrong answers

Option A is wrong because enabling data encryption in transit and at rest protects data confidentiality but does not prevent the foundation model provider from using the data for training or improvement. Option B is wrong because using a different foundation model from a different provider does not inherently guarantee that patient data will not be used for model improvement; the compliance requirement is about data usage policy, not model origin. Option D is wrong because configuring a VPC endpoint for Amazon Bedrock controls network access and data exfiltration but does not affect whether inference data is logged or used for model training.

Full explanation →

224

MCQeasy

A team is developing an AI system and wants to document key information such as intended use, performance benchmarks, and limitations. According to AWS best practices for responsible AI, what should they create?

A.A whitepaper

B.A business requirement document

C.A technical blog

D.Model cards

AnswerD

Model cards provide a structured summary of model characteristics, intended use, fairness, and limitations.

Why this answer

Model cards are a standardized format for documenting model details, promoting transparency. Other options are not specific to responsible AI documentation.

Full explanation →

225

MCQhard

An administrator reviews a CloudTrail log entry for a CreateModel API call. Which security concern should they investigate?

A.The model name is not encrypted in the log

B.The model data URL uses HTTP instead of HTTPS

C.The source IP address is an external IP

D.The execution role ARN is visible in the log

AnswerB

HTTPS should be used to encrypt data in transit; using HTTP is a security risk.

Why this answer

Option B is correct because using HTTP instead of HTTPS for the model data URL exposes the data in transit to potential interception or tampering. CloudTrail logs record the URL as provided, and if it uses HTTP, the data transferred from that URL to SageMaker is unencrypted, violating security best practices for data in transit. This is a direct security concern that should be investigated and remediated by using HTTPS.

Exam trap

Cisco often tests the distinction between data in transit encryption (HTTPS vs HTTP) and data at rest encryption, leading candidates to incorrectly focus on log encryption or ARN visibility instead of the actual security risk in the API call parameters.

How to eliminate wrong answers

Option A is wrong because model names are not sensitive data that require encryption in CloudTrail logs; CloudTrail logs are encrypted at rest by default using AWS KMS, but the content of the log entry (including the model name) is not individually encrypted. Option C is wrong because an external source IP address is not inherently a security concern; CloudTrail logs all API calls regardless of source, and external IPs are expected for calls made from outside AWS. Option D is wrong because the execution role ARN is a standard part of the CreateModel API call and is not sensitive; it is necessary for auditing who performed the action and does not expose credentials.

Full explanation →

Page 3 of 7

All pages

Practice AIF-C01 by domain

Target a specific domain to shore up weak areas.

Applications of Foundation Models Fundamentals of AI and ML Fundamentals of Generative AI Guidelines for Responsible AI Security, Compliance and Governance for AI Solutions

See all domains with question counts →