AWS Certified AI Practitioner AIF-C01 (AIF-C01) — Questions 175

500 questions total · 7pages · All types, answers revealed

Page 1 of 7

Page 2
1
MCQhard

A healthcare company must train a model on sensitive patient data while complying with privacy regulations. They want to add noise to the training process to prevent re-identification. Which technique should they implement?

A.Differential privacy
B.k-anonymity
C.Federated learning
D.Homomorphic encryption
AnswerA

Differential privacy adds calibrated noise to training to protect individual data points.

Why this answer

Option B is correct because differential privacy injects controlled noise into the training algorithm to protect individual records. k-anonymity (A) focuses on generalization, not noise injection. Homomorphic encryption (C) allows computation on ciphertext but is not typically used during training. Federated learning (D) trains on decentralized data but does not inherently provide differential privacy guarantees.

2
MCQeasy

A company that uses Amazon Bedrock for generating product descriptions wants to ensure that the output does not contain any confidential information from its proprietary database that is used as context. The company uses a knowledge base in Bedrock to augment the model. The security team is concerned that the model might inadvertently regurgitate exact strings from the knowledge base. The company wants to adopt a solution that prevents this while still allowing the model to use the knowledge base for generating relevant descriptions. What should the company do?

A.Configure Bedrock Guardrails with a 'Prompt/Response Output' filter to block responses that match exact phrases from the knowledge base.
B.Remove the knowledge base and rely solely on the model's pre-trained knowledge.
C.Reduce the maximum token limit for model responses so that it cannot generate long strings.
D.Encrypt the knowledge base data using AWS KMS with a unique key.
AnswerA

Guardrails can filter out responses that contain specific strings, preventing regurgitation.

Why this answer

Option A is correct because Bedrock Guardrails can be configured with a 'Prompt/Response Output' filter that uses a deny list of exact phrases or patterns. This allows the model to use the knowledge base for context while blocking any generated responses that contain verbatim strings from the proprietary database, directly addressing the security team's concern about regurgitation.

Exam trap

The trap here is that candidates may confuse data-at-rest protection (encryption) with output filtering, or assume that limiting response length prevents data leakage, when in fact only a guardrail-based output filter can block exact string matches from the generated content.

How to eliminate wrong answers

Option B is wrong because removing the knowledge base eliminates the context needed for generating relevant product descriptions, defeating the purpose of augmentation. Option C is wrong because reducing the maximum token limit does not prevent regurgitation of exact strings; it only truncates responses, and short strings can still contain confidential data. Option D is wrong because encrypting the knowledge base data with AWS KMS protects data at rest and in transit but does not control or filter the model's output, so it cannot prevent the model from generating exact matches from the decrypted context.

3
MCQhard

A data scientist is deploying a fine-tuned Mistral model on Amazon Bedrock. After deployment, inference latency is too high for real-time applications. Which configuration change can reduce latency without significantly impacting output quality?

A.Reduce max tokens from 1024 to 256
B.Switch to a larger model variant
C.Decrease the top-p to 0.5
D.Increase the temperature to 0.9
AnswerA

Generating fewer tokens speeds up inference, and for many use cases 256 tokens is sufficient.

Why this answer

Reducing the max tokens limit decreases the number of generated tokens, directly reducing latency. Lowering temperature or using a larger model may not help or may degrade quality.

4
Multi-Selecthard

Which THREE practices are recommended for responsible AI when deploying foundation models? (Choose three.)

Select 3 answers
A.Avoid collecting user feedback to reduce bias
B.Include human review for high-stakes decisions
C.Implement guardrails to filter harmful content
D.Continuously monitor model outputs for drift
E.Use a black box approach to keep model internals secret
AnswersB, C, D

Human oversight is crucial for critical applications.

Why this answer

Options A, B, and D are correct. Implementing guardrails helps prevent harmful outputs. Monitoring for drift ensures the model remains safe over time.

Human review for critical decisions adds oversight. Option C (avoiding feedback) is irresponsible. Option E (black box approach) is not recommended.

5
MCQhard

A company uses Amazon Bedrock with a custom model deployed via Amazon SageMaker. They want to monitor for data drift in input prompts over time. Which AWS service is best suited for this?

A.Amazon CloudWatch
B.Amazon SageMaker Model Monitor
C.AWS CloudTrail
D.Amazon Athena
AnswerB

Model Monitor can be configured to capture input data and detect drift using statistical methods.

Why this answer

Amazon SageMaker Model Monitor is the correct choice because it is specifically designed to detect data drift in machine learning models, including input prompts for custom models deployed via SageMaker. It continuously monitors the distribution of input data against a baseline and alerts when drift occurs, which aligns with the requirement to monitor input prompts over time.

Exam trap

The trap here is that candidates often confuse general monitoring services like CloudWatch with specialized ML monitoring tools, assuming CloudWatch can handle data drift detection when it actually lacks the statistical analysis required for such tasks.

How to eliminate wrong answers

Option A is wrong because Amazon CloudWatch is a monitoring service for AWS resources and applications (e.g., metrics, logs, alarms), but it does not have built-in capabilities to detect data drift in ML model inputs. Option C is wrong because AWS CloudTrail records API activity for auditing and governance, not for monitoring data drift in model inputs. Option D is wrong because Amazon Athena is an interactive query service for analyzing data in S3 using SQL, not a monitoring tool for data drift.

6
Multi-Selectmedium

Which THREE steps are typically involved in fine-tuning a foundation model? (Select THREE.)

Select 3 answers
A.Deploy the model immediately without additional training
B.Prepare a labeled dataset specific to the target domain
C.Train the model on the domain dataset with a lower learning rate
D.Select a pre-trained foundation model as the starting point
E.Choose a model architecture with more parameters than the base model
AnswersB, C, D

Fine-tuning requires annotated data that reflects the desired task.

Why this answer

Fine-tuning involves preparing a labeled dataset, selecting a pre-trained base model, and training it further on the domain data. Deploying without tuning is not fine-tuning, and selecting a model with more parameters is not a step.

7
MCQhard

A company is building a multi-modal application that processes images and text to answer questions about product defects. Which foundation model approach is BEST?

A.Use an image captioning model and then analyze the caption text
B.Use a text-to-image generation model and analyze the generated image
C.Use a multi-modal foundation model that processes both images and text
D.Use a separate image analysis model and a text model, then combine outputs
AnswerC

Multi-modal models are designed for joint understanding of images and text.

Why this answer

Option C is correct because multi-modal foundation models (e.g., CLIP, Flamingo, GPT-4V) are specifically designed to jointly process and reason over images and text in a unified architecture. This allows the model to directly correlate visual defects with textual descriptions without intermediate lossy transformations, making it the most effective approach for a multi-modal QA task.

Exam trap

AWS often tests the misconception that combining two separate single-modal models (Option D) is equivalent to a true multi-modal model, but the trap is that late fusion lacks the joint embedding and cross-attention mechanisms needed for coherent multi-modal reasoning.

How to eliminate wrong answers

Option A is wrong because image captioning models convert the entire image into a single text caption, losing fine-grained spatial and defect-specific details that are critical for accurate defect analysis. Option B is wrong because text-to-image generation models create new images from text, which is the inverse of the required task and cannot analyze existing product images for defects. Option D is wrong because using separate models and combining outputs introduces a late-fusion bottleneck, where alignment between visual features and text is not learned end-to-end, leading to poorer performance on tasks requiring joint reasoning.

8
MCQhard

Refer to the exhibit. You are trying to invoke a foundation model via Amazon Bedrock but receive this error. What should you do to resolve it?

A.Increase the service quota for Bedrock
B.Request model access in the Bedrock console
C.Attach the AmazonBedrockFullAccess IAM policy
D.Use a different AWS Region
AnswerB

Model access must be requested and approved before use.

Why this answer

The error indicates that the user has not been granted access to the specific foundation model in Amazon Bedrock. Even with valid IAM permissions, each model requires explicit access approval via the Bedrock console's 'Model access' section. Option B is correct because requesting model access there provisions the necessary service-level authorization.

Exam trap

AWS often tests the distinction between IAM permissions and service-level model access, trapping candidates who assume that a full-access IAM policy automatically grants access to all foundation models.

How to eliminate wrong answers

Option A is wrong because increasing the service quota addresses limits on concurrent invocations or throughput, not the 'access denied' error for a foundation model. Option C is wrong because the AmazonBedrockFullAccess IAM policy grants permissions to use Bedrock APIs, but it does not grant access to specific foundation models; model access is a separate approval process. Option D is wrong because the error is not region-specific; model access must be requested in each region where you intend to use the model, and changing regions without requesting access would produce the same error.

9
MCQhard

Refer to the exhibit. A developer runs the CLI command to summarize text using Claude v2 in Bedrock. The output is shorter than expected. Which change should the developer make to allow a longer response?

A.Increase 'max_tokens_to_sample' to 1000
B.Change the prompt to include 'Write a long summary'
C.Set 'stop_reason' to 'none'
D.Use a different region like us-west-2
AnswerA

This parameter directly controls the maximum number of tokens in the generated response.

Why this answer

The 'max_tokens_to_sample' parameter is set to 200, which limits output length. Increasing it allows longer responses. The prompt or region change does not affect length.

10
MCQmedium

A developer is using the Amazon Bedrock API to generate text. They notice that the model sometimes returns harmful content despite setting safety parameters. What is the BEST way to add an additional layer of content filtering?

A.Fine-tune the model on a curated safe dataset
B.Configure content filters in Amazon Bedrock Guardrails
C.Improve prompt engineering with more specific instructions
D.Use AWS WAF to filter API responses
AnswerB

Guardrails provide configurable content filters to block harmful content.

Why this answer

Amazon Bedrock Guardrails provides a dedicated, configurable content filtering layer that can block harmful content at inference time, independent of the model's built-in safety parameters. This allows developers to enforce custom policies (e.g., hate speech, violence) without modifying the model itself, making it the best additional safeguard.

Exam trap

Cisco often tests the misconception that fine-tuning or prompt engineering alone can fully prevent harmful outputs, when in fact a separate, configurable guardrail layer is the recommended approach for production-grade content filtering in Amazon Bedrock.

How to eliminate wrong answers

Option A is wrong because fine-tuning the model on a curated safe dataset adjusts the model's weights to reduce harmful outputs, but it does not guarantee filtering of all harmful content at inference and requires significant retraining effort; it is not an 'additional layer' but a model modification. Option C is wrong because improving prompt engineering with more specific instructions can guide the model's behavior but cannot reliably block harmful content that the model might generate despite instructions, as it lacks enforcement at the API response level. Option D is wrong because AWS WAF is a web application firewall designed to filter HTTP requests to web applications, not to inspect or filter the content of API responses from Bedrock; it operates at the network layer, not the application content layer.

11
MCQmedium

A media company uses a foundation model on Amazon Bedrock to generate article summaries. The model occasionally omits important details. Which prompt engineering technique is most likely to improve completeness?

A.Use a lower temperature setting
B.Increase the max tokens limit
C.Include a list of required key points in the prompt
D.Add 'Be concise' to the prompt
AnswerC

Specifying required content guides the model to include those details, improving completeness.

Why this answer

Providing a structured output format (e.g., bullet points, required sections) helps the model cover all aspects, reducing omissions.

12
MCQeasy

A company wants to build a chatbot that responds to customer queries using a foundation model. They need low latency and want to avoid managing infrastructure. Which AWS service should they use?

A.Amazon EC2
B.AWS Lambda
C.Amazon Bedrock
D.Amazon SageMaker
AnswerC

Correct: Bedrock is serverless and provides API access to foundation models.

Why this answer

Amazon Bedrock is a fully managed service that provides access to foundation models (FMs) from leading AI providers via a simple API, eliminating the need to manage underlying infrastructure. It is designed for building generative AI applications like chatbots with low latency, as it handles model hosting, scaling, and inference optimization automatically. This makes it the ideal choice for the company's requirement of low-latency responses without infrastructure management.

Exam trap

AWS often tests the misconception that AWS Lambda can handle any serverless workload, but candidates must recognize that Lambda is unsuitable for large model inference due to its execution time, memory, and GPU limitations, whereas Bedrock is purpose-built for foundation model access.

How to eliminate wrong answers

Option A is wrong because Amazon EC2 requires you to provision, configure, and manage virtual servers, including installing and maintaining the foundation model and its dependencies, which contradicts the requirement to avoid managing infrastructure. Option B is wrong because AWS Lambda is a serverless compute service for running short-duration code (up to 15 minutes) and is not designed to host large foundation models; it lacks the GPU support and memory capacity needed for model inference. Option D is wrong because Amazon SageMaker is a machine learning platform that requires you to manage endpoints, instances, and scaling for model deployment, which still involves infrastructure management and does not provide the fully managed, API-based access to foundation models that Bedrock offers.

13
MCQmedium

A company is building a chatbot using Amazon Bedrock. They want to ensure the model's responses are grounded in their internal knowledge base and avoid generating information outside that scope. Which feature should they use?

A.Amazon Bedrock Knowledge Bases
B.Agents for Amazon Bedrock
C.Model Evaluation on Amazon Bedrock
D.Guardrails for Amazon Bedrock
AnswerA

Knowledge Bases enable RAG by connecting FM to private data, grounding responses.

Why this answer

Amazon Bedrock Knowledge Bases is the correct feature because it allows you to connect a foundation model (FM) to your internal data sources, such as documents or databases, and use Retrieval Augmented Generation (RAG) to ground responses in that specific knowledge. This ensures the chatbot only generates information from the provided knowledge base, preventing hallucinations or out-of-scope content.

Exam trap

Cisco often tests the distinction between features that control content (Guardrails) versus features that provide source data (Knowledge Bases), leading candidates to mistakenly choose Guardrails when the question is about grounding responses in internal data.

How to eliminate wrong answers

Option B is wrong because Agents for Amazon Bedrock are designed to orchestrate multi-step tasks and interact with external APIs, not to restrict the model's responses to a specific knowledge base. Option C is wrong because Model Evaluation on Amazon Bedrock is used to assess model performance and safety, not to control the source of information for responses. Option D is wrong because Guardrails for Amazon Bedrock enforce content policies (e.g., filtering harmful or off-topic content) but do not ground responses in a specific internal knowledge base.

14
MCQhard

A company operates in a region where Amazon Bedrock is not available. They want to use generative AI but must keep data within the country. Which solution should they consider?

A.Use Amazon SageMaker to host an open-source model in the local region.
B.Wait for Bedrock to become available in their region; there is no alternative.
C.Use Amazon Bedrock in the nearest available region with cross-region inference.
D.Use an API from a third-party generative AI provider with AWS PrivateLink.
AnswerA

SageMaker is available in all regions and allows full control over data residency.

Why this answer

Option A is correct because Amazon SageMaker can deploy models in any AWS region, including those without Bedrock, and can use custom models with data staying in-region. Option B is wrong because cross-region inference sends data outside the country. Option C is wrong because using a third-party model outside AWS may not comply.

Option D is wrong because there is no region-constrained Bedrock offering.

15
MCQeasy

A data scientist is working with a dataset that contains both numerical and categorical features. Which algorithm is commonly used for regression tasks in AWS SageMaker?

A.K-Means
B.Linear Learner
C.BlazingText
D.Linear Learner
AnswerB

Linear Learner supports regression and classification on numerical and categorical features.

Why this answer

Linear Learner is the correct choice because it is a supervised learning algorithm in AWS SageMaker specifically designed for both regression and classification tasks. It can handle datasets with mixed numerical and categorical features (after appropriate encoding) and provides built-in mechanisms for training linear models, including automatic model tuning and distributed training.

Exam trap

The trap here is that candidates may confuse unsupervised clustering algorithms (like K-Means) with supervised regression algorithms, or mistakenly think that NLP-focused algorithms (like BlazingText) are appropriate for general regression tasks with mixed data types.

How to eliminate wrong answers

Option A is wrong because K-Means is an unsupervised clustering algorithm, not a regression algorithm, and it cannot predict continuous target values. Option C is wrong because BlazingText is optimized for natural language processing tasks such as word embeddings and text classification, not for general regression on mixed numerical/categorical datasets. Option D is wrong because it is a duplicate of the correct answer (B) and does not represent a distinct algorithm; the question lists two identical options, but only one is correct.

16
MCQhard

A company wants to adapt a foundation model for a custom domain with very limited labeled data and minimal cost. Which approach is most suitable?

A.Pre-training from scratch
B.Prompt engineering with few-shot examples
C.Reinforcement learning from human feedback
D.Full fine-tuning
AnswerB

This provides in-context learning with no training cost.

Why this answer

Prompt engineering with few-shot examples allows the model to learn from context without expensive fine-tuning, ideal for small datasets.

17
MCQmedium

A healthcare company is training a model on sensitive patient data using Amazon SageMaker. They need to ensure that individual patient data cannot be reverse-engineered from the model. Which technique should they implement during training?

A.Data encryption at rest
B.AWS Identity and Access Management (IAM) policies
C.Differential privacy
D.SageMaker Model Monitor
AnswerC

Differential privacy provides mathematical guarantees that the model does not memorize individual data points.

Why this answer

Differential privacy adds noise to the training process to protect individual records. Data encryption and IAM control access but do not prevent inference from model parameters; Model Monitor is for post-deployment monitoring.

18
MCQeasy

A company is using Amazon Bedrock to generate marketing copy. They want to ensure the model's responses are factually accurate and grounded in their proprietary knowledge base. Which feature should they use?

A.Model customization
B.Fine-tuning
C.Retrieval Augmented Generation (RAG)
D.Prompt engineering
AnswerC

RAG retrieves relevant documents from the knowledge base and includes them in the prompt, enabling factually grounded responses.

Why this answer

Option B, Retrieval Augmented Generation (RAG), retrieves relevant information from the company's knowledge base to ground the model's responses, improving factual accuracy. Option A (Model customization) tailors the model's behavior but does not necessarily ground responses in real-time data. Option C (Prompt engineering) relies on crafting prompts, which may not guarantee factual accuracy.

Option D (Fine-tuning) updates model weights but may not incorporate up-to-date knowledge.

19
MCQeasy

A company is using Amazon Bedrock to generate responses for customer support. They want to ensure that the model does not expose personally identifiable information (PII) in its outputs. Which AWS feature can be configured to automatically redact PII from model responses?

A.Amazon Macie
B.Amazon SageMaker Model Monitor
C.Amazon Bedrock Guardrails
D.AWS CloudTrail
AnswerC

Bedrock Guardrails can be configured to identify and redact PII from model responses.

Why this answer

Amazon Bedrock Guardrails is the correct choice because it provides configurable policies that can automatically detect and redact personally identifiable information (PII) from model inputs and outputs. This feature is specifically designed for Amazon Bedrock to enforce content safety and compliance requirements, including PII redaction, without requiring custom code or external services.

Exam trap

The trap here is that candidates may confuse Amazon Macie (a data discovery service for S3) with a real-time content filtering capability, or assume that SageMaker Model Monitor can be applied to Bedrock, when in fact only Bedrock Guardrails provides native PII redaction for model responses.

How to eliminate wrong answers

Option A is wrong because Amazon Macie is a data security service that discovers and protects sensitive data in Amazon S3, not a feature for redacting PII from model responses in Amazon Bedrock. Option B is wrong because Amazon SageMaker Model Monitor detects data drift and model quality issues for SageMaker endpoints, not for Bedrock, and does not perform PII redaction. Option D is wrong because AWS CloudTrail records API activity for auditing and governance, not for modifying or filtering model responses.

20
MCQmedium

A data scientist is using SageMaker to train a model on a dataset with many features. They suspect some features are redundant. Which feature engineering technique would help?

A.Feature scaling
B.One-hot encoding
C.Principal Component Analysis (PCA)
D.Polynomial features
AnswerC

PCA reduces dimensionality by transforming correlated features into uncorrelated components, eliminating redundancy.

Why this answer

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms the original correlated features into a smaller set of uncorrelated principal components, effectively removing redundancy while preserving most of the variance in the data. In SageMaker, PCA can be applied via the built-in PCA algorithm or as a preprocessing step in a scikit-learn container to reduce feature space and eliminate multicollinearity.

Exam trap

Cisco often tests the distinction between feature reduction (PCA) and feature transformation (scaling, encoding, polynomial expansion) to see if candidates confuse techniques that change feature count versus those that only change feature values.

How to eliminate wrong answers

Option A is wrong because feature scaling (e.g., StandardScaler, MinMaxScaler) normalizes the range of features but does not remove redundant or correlated features; it only changes the scale. Option B is wrong because one-hot encoding is used to convert categorical variables into numerical format, not to address feature redundancy among many continuous or numerical features. Option D is wrong because polynomial features create interaction and higher-order terms, which actually increase the number of features and can introduce more redundancy, not reduce it.

21
MCQmedium

Refer to the exhibit. An AWS customer runs SageMaker Clarify to evaluate bias in their training data. The report shows multiple metrics with status 'violated'. What should the customer do next?

A.Use data augmentation to balance the dataset
B.Reduce the number of features
C.Retrain the model with more data
D.Ignore the metrics because thresholds are too strict
AnswerA

Data augmentation can balance representation.

Why this answer

Option B is correct: Data augmentation or resampling can address class imbalance and demographic parity issues. Option A is wrong: Simply retraining with more data may not fix imbalance. Option C is wrong: Ignoring violations is irresponsible.

Option D is wrong: Reducing features may not help.

22
MCQhard

Refer to the exhibit. An IAM policy is attached to a user. Which models can the user invoke?

A.Only Claude v2
B.No models
C.Claude v2 and any model with a name containing 'claude'
D.Any model in the account
AnswerA

The Allow grants access to Claude v2; the Deny using NotResource denies everything else.

Why this answer

The IAM policy explicitly allows the `bedrock:InvokeModel` action only on the resource ARN `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2`. This means the user can invoke only the Claude v2 model. No other models, including other Claude versions or any model with 'claude' in its name, are permitted because the resource ARN is exact and does not use wildcards.

Exam trap

Cisco often tests the distinction between an exact resource ARN and a wildcard pattern; candidates mistakenly assume that 'claude' in the model ID implies all Claude models are allowed, but without a wildcard, only the exact model specified is permitted.

How to eliminate wrong answers

Option B is wrong because the policy does allow invocation of Claude v2, so the user can invoke at least one model. Option C is wrong because the policy uses an exact resource ARN (`anthropic.claude-v2`), not a wildcard pattern like `*claude*`; models with 'claude' in their name but not exactly `claude-v2` are not allowed. Option D is wrong because the policy restricts invocation to a single specific model, not any model in the account.

23
MCQmedium

A data science team needs to choose a machine learning approach for a project that requires predicting customer churn based on historical data. The team has a labeled dataset with 10,000 records and needs to interpret the model's decisions to provide business insights. Which machine learning technique should the team prioritize?

A.Random forest.
B.K-means clustering.
C.Linear regression.
D.Deep neural network with multiple hidden layers.
AnswerA

Random forests provide feature importance and interpretability, suitable for classification with moderate-sized labeled datasets.

Why this answer

Option C is correct because random forests are ensemble methods that offer feature importance and decision paths, making them interpretable for churn prediction with labeled data. Option A (deep neural network) is less interpretable and may overfit with limited data. Option B (linear regression) is for regression tasks, not classification.

Option D (K-means clustering) is unsupervised and not suitable for predicting churn with labeled data.

24
MCQeasy

Refer to the exhibit. An ML team finds that their training data is stored in two subfolders under s3://my-bucket/train/. They need to ensure that the dataset is balanced for training a classification model. What should they do?

A.Use AWS Glue to create a balanced dataset
B.Use Amazon Rekognition custom labels
C.Count the number of files in each subfolder and resample
D.Enable versioning on the bucket
AnswerC

Assessing distribution and resampling ensures balance.

Why this answer

Option B is correct: They need to check the number of files in each subfolder and resample if necessary. Option A is wrong: AWS Glue can be used but not specifically for balancing. Option C is wrong: Rekognition custom labels are for image labeling.

Option D is wrong: Versioning does not affect balance.

25
MCQhard

An organization uses Amazon Bedrock to generate content. They have implemented guardrails to block toxic content. However, some users are able to bypass the guardrails by encoding their prompts. What step should be taken to improve security?

A.Encode the prompts before sending to the model.
B.Enable prompt injection detection in the guardrail configuration.
C.Use a different foundation model that is less susceptible.
D.Restrict access to the model using IAM policies.
AnswerB

Prompt injection detection can identify and block encoded or malicious prompts.

Why this answer

Option B is correct because Amazon Bedrock guardrails include a built-in prompt injection detection capability that can identify and block attempts to bypass content filters through encoded or obfuscated prompts. Enabling this feature specifically addresses the scenario where users encode their inputs to evade toxic content blocking, as it analyzes the decoded intent of the prompt rather than just the surface-level encoding.

Exam trap

Cisco often tests the misconception that encoding or encrypting inputs is a security measure, when in reality it is a common bypass technique that must be countered by content inspection mechanisms like prompt injection detection.

How to eliminate wrong answers

Option A is wrong because encoding the prompts before sending them to the model would not improve security; it would actually compound the problem by further obfuscating the input, making it harder for guardrails to detect toxic content. Option C is wrong because the susceptibility to encoded prompts is not a model-specific vulnerability; it is a function of the input processing layer, and switching foundation models would not prevent encoding-based bypasses. Option D is wrong because restricting access with IAM policies controls who can invoke the model but does not inspect or sanitize the content of prompts, so it cannot prevent users from submitting encoded toxic inputs.

26
MCQhard

A data scientist fine-tuned a large language model on Amazon SageMaker for financial report generation. The model produces responses that are too short and incomplete, often cutting off mid-sentence. What parameter should be adjusted first?

A.Increase the temperature parameter
B.Increase the top_p parameter
C.Increase the maximum token count
D.Switch to a different foundation model
AnswerC

Max tokens sets a hard limit on the number of tokens generated; raising it allows longer responses.

Why this answer

The max tokens parameter limits the length of generated responses. Increasing it allows the model to produce longer completions. Temperature, top_p, and model change affect quality or diversity, but not the length cap.

27
MCQeasy

A data scientist needs to grant an IAM user access to a specific Amazon SageMaker notebook instance. The user should only be able to start and stop the notebook instance, but not delete it. Which IAM policy statement should be used?

A.{"Effect":"Allow","Action":["sagemaker:Start*","sagemaker:Stop*"],"Resource":"*"}
B.{"Effect":"Allow","Action":["sagemaker:StartNotebookInstance","sagemaker:StopNotebookInstance"],"Resource":"arn:aws:sagemaker:us-east-1:123456789012:notebook-instance/MyNotebook"}
C.{"Effect":"Allow","Action":"sagemaker:*","Resource":"*"}
D.{"Effect":"Allow","Action":"sagemaker:*","Resource":"arn:aws:sagemaker:us-east-1:123456789012:notebook-instance/MyNotebook"}
AnswerB

Grants only start and stop on the specific resource.

Why this answer

Option B is correct because it uses the specific actions `sagemaker:StartNotebookInstance` and `sagemaker:StopNotebookInstance` with a resource ARN that targets only the intended notebook instance. This grants the least privilege required to start and stop the instance while explicitly preventing deletion, as no delete action is included. The resource ARN restricts the policy to a single notebook instance, ensuring the user cannot affect other resources.

Exam trap

The trap here is that candidates often choose a wildcard action like `sagemaker:Start*` or `sagemaker:*` thinking it covers the needed actions, but they overlook that these patterns grant unintended permissions (e.g., delete or other start/stop actions on different resources), violating the principle of least privilege.

How to eliminate wrong answers

Option A is wrong because it uses wildcard actions `sagemaker:Start*` and `sagemaker:Stop*`, which could match unintended actions like `sagemaker:StartPipelineExecution` or `sagemaker:StopTrainingJob`, and the resource `*` grants access to all SageMaker resources, violating least privilege. Option C is wrong because `sagemaker:*` allows all SageMaker actions, including `sagemaker:DeleteNotebookInstance`, which the user should not have. Option D is wrong because `sagemaker:*` on a specific resource still grants all actions on that notebook instance, including deletion, which exceeds the required permissions.

28
MCQhard

A research lab is using Amazon SageMaker to fine-tune a large language model (LLM) for scientific text summarization. The training dataset contains 10 million documents, and the lab has a limited budget but needs to minimize training time. They have access to SageMaker Training with managed spot instances, which offer significant cost savings but are interruptible. The team is considering different training strategies to balance cost, time, and model quality. Which strategy should they use?

A.Use SageMaker's distributed training with data parallelism on multiple managed spot instances, and enable checkpointing.
B.Fine-tune only the last few layers of the model on a smaller subset of the data.
C.Use a single on-demand instance to avoid interruptions and maximize throughput.
D.Use a single large GPU instance to train the model from scratch.
AnswerA

Distributed training speeds up processing, spot instances reduce cost, and checkpointing handles interruptions.

Why this answer

Option B is correct. Using SageMaker's distributed training with data parallelism on multiple spot instances, combined with checkpointing, maximizes throughput while managing interruptions. Spot instances reduce cost, and checkpointing allows resuming from failures.

Option A is incorrect because training from scratch on a single GPU is extremely slow and expensive. Option C is incorrect because on-demand instances are costly and do not optimize budget. Option D is incorrect because fine-tuning only the last few layers on a subset reduces model quality and does not effectively use the full dataset.

29
MCQmedium

An ML team is deploying a real-time inference endpoint for a computer vision model using Amazon SageMaker. The model requires GPU acceleration for low latency. Which instance type should the team choose to minimize cost while meeting the GPU requirement?

A.ml.g5.xlarge
B.ml.c5.xlarge
C.ml.p3.2xlarge
D.ml.p4d.24xlarge
AnswerC

P3 provides GPU acceleration and is cost-effective for inference.

Why this answer

Option C (ml.p3.2xlarge) is correct because it provides a GPU (NVIDIA Tesla V100) necessary for low-latency GPU acceleration in computer vision inference, while being the most cost-effective GPU instance among the options. The ml.p3.2xlarge offers a single GPU with sufficient compute for real-time inference without over-provisioning resources, minimizing cost compared to larger GPU instances like ml.p4d.24xlarge.

Exam trap

The trap here is that candidates may assume any GPU instance is acceptable, but Cisco tests the ability to balance GPU requirement with cost minimization, leading them to pick the cheapest GPU option (ml.g5.xlarge) without recognizing that ml.p3.2xlarge is even more cost-effective for this specific workload.

How to eliminate wrong answers

Option A (ml.g5.xlarge) is wrong because it uses a GPU (NVIDIA A10G) that is more powerful and expensive than needed for this use case, leading to higher cost; however, it does meet the GPU requirement, so the primary issue is cost inefficiency, not technical incompatibility. Option B (ml.c5.xlarge) is wrong because it is a CPU-only instance (based on Intel Xeon Scalable processors) and lacks any GPU, failing to meet the explicit GPU acceleration requirement for low-latency computer vision inference. Option D (ml.p4d.24xlarge) is wrong because it provides 8 NVIDIA A100 GPUs, which is massively over-provisioned for a single real-time inference endpoint, resulting in significantly higher cost without any benefit for this workload.

30
MCQmedium

A company deploys an Amazon Bedrock agent that uses a knowledge base with sensitive financial documents. The security team requires that all data retrieval queries be logged for auditing, and that the LLM responses do not contain any personally identifiable information (PII). What combination of services should the company use?

A.Enable AWS CloudTrail for API logging and use Amazon GuardDuty to detect PII in responses.
B.Use AWS Config to monitor Bedrock resource configurations and apply an IAM policy to restrict PII.
C.Enable Amazon CloudWatch Logs for the Bedrock agent and use Amazon Comprehend to redact PII from responses.
D.Use Amazon S3 server access logs for the knowledge base and enable Amazon Macie to redact PII.
AnswerC

CloudWatch Logs can capture detailed logs from Bedrock, and Comprehend (or Bedrock Guardrails) can identify and mask PII.

Why this answer

Option B is correct because Amazon CloudWatch Logs can capture query logs, and Amazon Comprehend (or Bedrock Guardrails) can detect and redact PII. Option A is wrong because GuardDuty is for threat detection, not PII redaction. Option C is wrong because AWS Config is for resource compliance, not logging queries.

Option D is wrong because Macie is for data discovery in S3, not real-time PII redaction in responses.

31
MCQhard

A financial services company is deploying a foundation model on Amazon Bedrock to generate compliance reports from internal audit logs. The model must not output any personally identifiable information (PII). They have configured a Bedrock Guardrail with sensitive information filters set to the 'HIGH' sensitivity level. During testing in a staging environment, testers still observed PII being occasionally generated in the report outputs. The guardrail did not block these instances because the PII was embedded in a context that the guardrail's pattern matching did not catch (e.g., structured JSON data with embedded names). The company requires a solution that minimizes latency and cost, as they process thousands of reports daily. They cannot afford to increase inference time significantly due to strict SLAs. They also want to avoid re-engineering the entire solution. Which additional step should they take to effectively eliminate PII leakage while maintaining performance?

A.Add a prompt instruction to the model to never output PII, with few-shot examples of non-PII outputs.
B.Fine-tune the foundation model on a dataset that excludes PII.
C.Increase the guardrail sensitivity to 'MAXIMUM'.
D.Implement a post-processing Lambda function that uses Amazon Comprehend's PII detection to scan and redact any PII from the model output before returning it.
AnswerD

Correct: Amazon Comprehend provides robust PII detection that can catch context-based PII. The Lambda function can be optimized for low latency and added cost is minimal.

Why this answer

Option C adds a dedicated PII detection layer using Amazon Comprehend, which is accurate and can be optimized for latency. Option A may increase false positives and misses. Option B is expensive.

Option D is unreliable.

32
Multi-Selecthard

Which TWO are best practices for model monitoring in production on AWS?

Select 2 answers
A.Disable logging to reduce latency
B.Use only CPU instances
C.Monitor input data drift
D.Retrain model daily
E.Monitor prediction drift
AnswersC, E

Data drift detection helps identify when the distribution of input data changes, affecting model accuracy.

Why this answer

Option C is correct because monitoring input data drift is a best practice for detecting changes in the distribution of incoming features compared to the training data. This helps identify when the model's assumptions about the data are no longer valid, which can degrade performance. AWS services like Amazon SageMaker Model Monitor can automatically track and alert on data drift.

Exam trap

Cisco often tests the misconception that retraining on a fixed schedule (e.g., daily) is a best practice, when in reality it should be event-driven based on drift or performance metrics.

33
MCQeasy

Which AWS service provides a serverless experience for building and scaling generative AI applications with access to various foundation models?

A.Amazon Bedrock
B.Amazon SageMaker
C.Amazon Lex
D.AWS Lambda
AnswerA

Bedrock provides a serverless experience with pre-trained foundation models from leading AI companies.

Why this answer

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models through a single API, without managing the underlying infrastructure.

34
MCQmedium

A company wants to use Amazon Bedrock to generate images from text descriptions. Which model should they use?

A.Amazon Titan Image Generator
B.Stable Diffusion XL
C.Amazon Titan Text
D.Amazon Polly
AnswerA

Titan Image Generator is an AWS-managed model for generating images from text.

Why this answer

Option B is correct because Amazon Titan Image Generator is designed for image generation, and it is fully managed by AWS. Option A (Stable Diffusion XL) is also available but not AWS-native. Option C (Titan Text) is for text.

Option D (Polly) is for speech.

35
MCQeasy

Refer to the exhibit. A data scientist ran a training job on Amazon SageMaker. The job failed with the error shown. What is the most likely cause?

A.The S3 input path is incorrect
B.The IAM role does not have permission to access S3
C.The training code has a syntax error
D.The batch size is too large for the instance's GPU memory
AnswerD

The error shows CUDA out of memory, typically due to batch size or model size exceeding GPU memory.

Why this answer

The error message indicates a CUDA out-of-memory error, which occurs when the GPU memory is insufficient for the requested batch size. Option D is correct because increasing the batch size beyond the GPU's memory capacity causes the training job to fail with this specific error.

Exam trap

AWS often tests the distinction between infrastructure errors (S3, IAM) and runtime errors (CUDA memory), where candidates mistakenly attribute a GPU memory error to a misconfiguration in data access or code syntax.

How to eliminate wrong answers

Option A is wrong because an incorrect S3 input path would result in a 'NoSuchKey' or '404' error, not a CUDA out-of-memory error. Option B is wrong because an IAM role lacking S3 permissions would produce an 'AccessDenied' error, not a GPU memory error. Option C is wrong because a syntax error in the training code would raise a Python exception (e.g., SyntaxError) before any GPU operations, not a CUDA memory error.

36
MCQhard

A financial services company is deploying a generative AI model on Amazon SageMaker for real-time fraud detection. The model, a fine-tuned Llama 2 7B, must respond to transaction requests within 500 milliseconds. The team has deployed the model using a SageMaker real-time endpoint with a single ml.g5.2xlarge instance. During load testing, the endpoint achieves an average latency of 450 ms at 10 requests per second (RPS), but the latency spikes to over 2 seconds at 20 RPS. The team needs to maintain sub-500 ms latency at up to 50 RPS. The model is too large to fit on a single GPU, so they are using CPU instances. They considered using a larger instance type but want to minimize cost. What should the team do to meet the latency requirement cost-effectively?

A.Upgrade to a single ml.g5.4xlarge instance
B.Attach an Amazon Elastic Inference accelerator to the existing instance
C.Use a SageMaker multi-model endpoint with multiple ml.g5.xlarge instances and auto scaling
D.Use SageMaker Serverless Inference to automatically scale
AnswerC

Distributing load across smaller instances reduces cost and meets latency via scaling.

Why this answer

Option C is correct because a SageMaker multi-model endpoint (MME) allows multiple model replicas to be hosted on a fleet of instances, enabling horizontal scaling to handle increased throughput. By using multiple ml.g5.xlarge instances with auto scaling, the team can distribute the 50 RPS load across several instances, keeping per-instance latency low while minimizing cost compared to a single larger instance. This approach also leverages the fact that the model is too large for a single GPU but can be efficiently served on CPU instances with proper load distribution.

Exam trap

The trap here is that candidates assume a larger single instance (Option A) is the simplest solution, but they overlook the cost-efficiency and scalability benefits of horizontal scaling with a multi-model endpoint, which is specifically designed for high-throughput, low-latency inference with models that don't fit on a single GPU.

How to eliminate wrong answers

Option A is wrong because upgrading to a single ml.g5.4xlarge instance provides more vCPUs and memory but does not address the fundamental bottleneck of a single instance handling 50 RPS; latency would still spike due to sequential processing limits. Option B is wrong because Amazon Elastic Inference (EI) accelerators are designed for low-latency GPU-based inference and are not compatible with CPU-only instances; they also cannot help if the model does not fit on a single GPU. Option D is wrong because SageMaker Serverless Inference has a maximum concurrency limit and cold start latency that can exceed 500 ms, making it unsuitable for real-time fraud detection with strict sub-500 ms latency requirements.

37
MCQhard

A data science team fine-tuned a foundation model on Amazon SageMaker for sentiment analysis of customer reviews. They deployed the model as a real-time endpoint. After a successful launch, the application experienced a surge in traffic, and the endpoint's latency increased from 200ms to over 2 seconds. The team needs to reduce latency and maintain high throughput without increasing costs significantly. They are using a single ml.g5.xlarge instance. What change should the team make first?

A.Enable automatic scaling for the endpoint.
B.Increase the instance type to ml.g5.4xlarge.
C.Compile the model using SageMaker Neo to optimize for inference.
D.Switch to a batch transform job instead of real-time.
AnswerC

Neo optimizes models for specific hardware, improving speed without additional cost.

Why this answer

Option C is correct because compiling the model with SageMaker Neo optimizes the model for the target hardware, significantly reducing inference latency without increasing compute cost. Option A (upgrade instance) increases cost. Option B (switch to batch) is not suitable for real-time.

Option D (auto scaling) adds instances but does not reduce per-request latency; it may increase cost.

38
MCQmedium

An e-commerce company uses an Amazon Lex chatbot to handle customer inquiries. They want to implement human oversight for sensitive interactions, such as when the chatbot cannot provide a confident response. Which AWS service should they integrate?

A.Amazon Rekognition
B.Amazon Comprehend
C.Amazon Augmented AI (A2I)
D.Amazon SageMaker Ground Truth
AnswerC

A2I provides workflows to route predictions to humans for review when confidence is low.

Why this answer

Amazon Augmented AI (A2I) enables human review loops for low-confidence predictions or sensitive cases. Other services are not designed for human-in-the-loop in conversational AI.

39
MCQeasy

A developer invoked an Amazon Bedrock model and received this output. What does the stopReason field indicate?

A.A content filter blocked the output
B.The input prompt was too long
C.The model reached the maximum token limit set in the request
D.The model reached a natural stopping point
AnswerC

The stopReason 'max_tokens' explicitly indicates the output was truncated due to the token limit.

Why this answer

Option A is correct because stopReason 'max_tokens' means the model stopped because it reached the maximum token limit specified in the request. Option B (natural stop) would be 'stop' or 'end_turn'. Option C (input too long) would be a different error.

Option D (content filter) would be 'content_filtered'.

40
MCQhard

A company is using Amazon SageMaker Ground Truth to create labeled datasets for a computer vision model. The dataset contains images of people in public places. The company must comply with data privacy regulations that require explicit consent for using images of individuals. The company has a privacy team that reviews the images and provides consent lists. The ML team suspects that some images in the dataset might include individuals who have not consented. The company wants to ensure that only those images with consent are used for training. What should the company do?

A.Apply a blur filter to all faces in the dataset using Amazon Rekognition before labeling.
B.Use Amazon Rekognition to detect faces in all images and re-label those without consent as invalid.
C.Create an Amazon Simple Workflow Service (SWF) workflow that cross-references image metadata with the consent list, and update the Ground Truth manifest to include only approved images.
D.Use Amazon SageMaker Clarify to detect bias in the training data and exclude images of people.
AnswerC

This creates an automated pipeline to filter approved images based on the consent list, using SWF for workflow orchestration.

Why this answer

Option C is correct because it uses Amazon Simple Workflow Service (SWF) to orchestrate a cross-referencing workflow between image metadata and the consent list, then updates the SageMaker Ground Truth manifest to include only approved images. This ensures that only images with explicit consent are used for training, directly addressing the data privacy compliance requirement without altering or mislabeling the data.

Exam trap

The trap here is that candidates may confuse privacy compliance with data anonymization (blurring faces) or bias detection, rather than recognizing that explicit consent requires a cross-referencing workflow against an external consent list, which is best orchestrated by a workflow service like SWF.

How to eliminate wrong answers

Option A is wrong because applying a blur filter to all faces using Amazon Rekognition does not remove images of individuals without consent; it only obscures faces, which may still violate privacy regulations if the image itself is used without consent. Option B is wrong because using Amazon Rekognition to detect faces and re-label images as invalid does not cross-reference a consent list; it only marks images based on face detection, not on actual consent status, and could incorrectly exclude or include images. Option D is wrong because Amazon SageMaker Clarify is designed to detect bias in training data and model predictions, not to manage consent compliance or exclude images based on privacy consent lists.

41
Multi-Selectmedium

Which TWO AWS services can be used to build a chatbot that responds to customer inquiries using a company's documentation as source? (Select two.)

Select 2 answers
A.Amazon Bedrock with RAG
B.Amazon Polly
C.Amazon Q Business
D.Amazon Transcribe
E.Amazon Lex
AnswersA, C

Bedrock with RAG can retrieve from documentation and generate answers using foundation models.

Why this answer

Options B (Amazon Bedrock with RAG) and C (Amazon Q Business) are correct. Bedrock with RAG retrieves from the documentation to generate answers, and Amazon Q Business is a conversational service that can use enterprise data. Option A (Amazon Lex) can build chatbots but requires integration for documentation retrieval.

Option D (Amazon Polly) is text-to-speech. Option E (Amazon Transcribe) is speech-to-text.

42
MCQmedium

A team is deploying a regression model for loan approval. To ensure transparency for regulators, they need to explain individual predictions. Which interpretability method can provide local explanations by approximating the model with a simpler surrogate?

A.SHAP values
B.Partial dependence plots
C.LIME
D.Permutation feature importance
AnswerC

LIME creates local surrogate models to explain individual predictions.

Why this answer

Option C is correct because LIME (Local Interpretable Model-agnostic Explanations) generates local explanations by fitting a simpler model around each prediction. SHAP (option A) is also local but uses Shapley values; partial dependence (option B) is global; permutation importance (option D) is global feature importance.

43
Multi-Selecthard

Which TWO factors should be considered when choosing between a CPU-based instance and a GPU-based instance for training a machine learning model on Amazon SageMaker? (Choose two.)

Select 2 answers
A.The number of layers in the model
B.The AWS Region
C.The size of the dataset
D.The choice of hyperparameter optimizer
E.The type of model architecture (e.g., CNN vs. linear regression)
AnswersC, E

Large datasets can leverage GPU parallelism.

Why this answer

Option C is correct because the size of the dataset directly impacts whether a GPU's parallel processing capabilities are beneficial. GPU instances excel at performing many matrix operations simultaneously, which is critical for large datasets where mini-batch gradient descent can be parallelized. For smaller datasets, the overhead of transferring data to GPU memory may negate the performance gains, making CPU instances more cost-effective.

Exam trap

Cisco often tests the misconception that model architecture alone (e.g., number of layers) dictates hardware choice, when in fact the dataset size and model type (e.g., CNN vs. linear regression) are the key factors that determine whether GPU parallelism provides a meaningful advantage.

44
MCQeasy

A company is using Amazon Bedrock to generate code snippets. Developers report that the generated code sometimes contains security vulnerabilities. Which action should the team take to mitigate this risk?

A.Deploy the model in a sandbox environment to limit its access to sensitive systems.
B.Implement a manual code review process after generation.
C.Add a system prompt that instructs the model to follow security best practices and avoid known vulnerabilities.
D.Reduce the temperature parameter to 0 to make the output deterministic.
AnswerC

A system prompt sets expectations and can reduce the likelihood of insecure code generation.

Why this answer

Option C is correct because adding a system prompt that instructs the model to follow security best practices and avoid known vulnerabilities directly influences the model's output at inference time. Amazon Bedrock supports system prompts that act as high-level instructions to guide the foundation model's behavior, making this a proactive, scalable mitigation that does not require manual intervention or architectural changes.

Exam trap

AWS often tests the misconception that reducing temperature or isolating the environment can fix output quality issues, when in fact only prompt-level guidance directly addresses the model's generation behavior.

How to eliminate wrong answers

Option A is wrong because deploying the model in a sandbox environment limits access to sensitive systems but does not prevent the model from generating code with security vulnerabilities; the model's output itself remains unchanged. Option B is wrong because implementing a manual code review process after generation is a reactive measure that does not reduce the risk at the source; it adds human overhead and delays, and is not a mitigation that addresses the model's tendency to produce insecure code. Option D is wrong because reducing the temperature parameter to 0 makes the output deterministic but does not teach or enforce security best practices; it only reduces randomness, not the likelihood of generating vulnerable patterns.

45
MCQmedium

A company is training a large language model using Amazon SageMaker. The training job fails with the error 'OutOfMemory'. They are using a single ml.p3.2xlarge instance. The dataset is 50GB and the model is 2GB. The training script uses standard data loading. Which action should they take to resolve the issue?

A.Increase the instance type to ml.p3.16xlarge
B.Train the model using Spot instances
C.Reduce the batch size
D.Use SageMaker's Pipe mode for data loading
AnswerA

The error indicates the instance memory is insufficient. Upgrading to a larger instance directly addresses the out-of-memory issue.

Why this answer

The error 'OutOfMemory' indicates that the ml.p3.2xlarge instance (with 16 GB GPU memory) cannot hold both the 2 GB model and the 50 GB dataset during training. Increasing the instance type to ml.p3.16xlarge provides 64 GB GPU memory, which is sufficient to accommodate the model and dataset without memory pressure. This directly resolves the resource constraint.

Exam trap

Cisco often tests the misconception that reducing batch size or using Pipe mode can solve out-of-memory errors caused by insufficient GPU memory, when the real fix is to use a larger instance with more GPU memory.

How to eliminate wrong answers

Option B is wrong because Spot instances provide cost savings but do not increase memory capacity; they use the same instance types and would still run out of memory. Option C is wrong because reducing the batch size reduces memory usage per step but does not address the fundamental issue that the total dataset (50 GB) cannot fit into the 16 GB GPU memory of the current instance; the model alone is 2 GB, leaving insufficient room for data. Option D is wrong because SageMaker's Pipe mode streams data directly from Amazon S3 to the training algorithm without storing it on disk, but the GPU memory is still required to hold the model and the data batches during processing; Pipe mode does not reduce GPU memory consumption.

46
MCQmedium

A user has this IAM policy and attempts to invoke the model in the us-west-2 region. They receive an AccessDenied error. What is the reason?

A.The resource ARN is malformed
B.The action 'bedrock:InvokeModel' is incorrect
C.The model ID is case-sensitive
D.The policy does not allow the us-west-2 region
AnswerD

The ARN's region is us-east-1, so the policy does not grant access in us-west-2.

Why this answer

The resource ARN specifies us-east-1, so the policy grants access only in that region. Option B (action) is correct but not the issue. Option C (case sensitivity) is not relevant.

Option D (malformed) is false.

47
MCQeasy

A company wants to use a foundation model to automatically moderate user-generated content. The model must filter out inappropriate content with high accuracy. Which Amazon service is best suited for this task?

A.Amazon Translate
B.Amazon Rekognition
C.Amazon Polly
D.Amazon Comprehend
AnswerD

Comprehend offers content moderation features.

Why this answer

Amazon Comprehend is the correct choice because it is a natural language processing (NLP) service that can analyze text for sentiment, key phrases, and — critically — toxicity and inappropriate content using built-in or custom classifiers. This directly matches the requirement to moderate user-generated text with high accuracy, as it can detect hate speech, profanity, and other harmful language.

Exam trap

The trap here is that candidates may confuse Amazon Rekognition's ability to detect unsafe content in images with the need to moderate text, leading them to select Rekognition instead of recognizing that Comprehend is the NLP service for text analysis.

How to eliminate wrong answers

Option A is wrong because Amazon Translate is a machine translation service that converts text between languages; it has no capability to analyze or moderate content for appropriateness. Option B is wrong because Amazon Rekognition is designed for image and video analysis (e.g., object detection, facial recognition, unsafe content detection in visual media), not for moderating text-based user-generated content. Option C is wrong because Amazon Polly is a text-to-speech service that converts written text into lifelike speech; it performs no content moderation or analysis.

48
MCQeasy

A startup needs to build a real-time text translation feature for a customer chat application. Latency must be under 200 ms per request. Which AWS approach is BEST suited?

A.Use Amazon Comprehend for language detection and a custom translation model
B.Use Amazon Bedrock with a multilingual foundation model
C.Use Amazon Translate with real-time translation
D.Use Amazon Transcribe and then Amazon Bedrock
AnswerC

Amazon Translate is a purpose-built service for translation with low latency.

Why this answer

Amazon Translate's real-time translation API is purpose-built for low-latency text translation, typically achieving sub-200 ms response times for small payloads. It directly translates text without the overhead of running a large foundation model or performing intermediate steps like transcription, making it the best fit for this latency-sensitive chat application.

Exam trap

The trap here is that candidates may assume a large foundation model (e.g., via Bedrock) is always the best choice for multilingual tasks, overlooking that purpose-built services like Amazon Translate are specifically optimized for low-latency, high-throughput translation at a fraction of the cost and complexity.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is designed for natural language processing (e.g., sentiment analysis, entity extraction), not real-time translation, and building a custom translation model would introduce significant latency and complexity. Option B is wrong because Amazon Bedrock with a multilingual foundation model introduces inference latency that often exceeds 200 ms for real-time requests, and it is not optimized for the single-purpose, high-throughput translation task required here. Option D is wrong because Amazon Transcribe is for speech-to-text, not text translation, and chaining it with Bedrock adds unnecessary latency and complexity for a text-only translation feature.

49
Multi-Selecthard

A financial services company is deploying a machine learning model that must comply with SOC 2 and PCI DSS. They need to ensure that the model artifacts and training data are encrypted, access is audited, and the environment is protected from network threats. Which THREE AWS services should they implement?

Select 3 answers
A.Amazon DynamoDB Accelerator (DAX)
B.Amazon GuardDuty
C.AWS CloudTrail
D.AWS KMS
E.AWS WAF
AnswersB, C, D

GuardDuty continuously monitors for malicious activity and network threats.

Why this answer

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior, which helps protect the environment from network threats as required by SOC 2 and PCI DSS. By analyzing VPC Flow Logs, DNS logs, and CloudTrail events, it can detect anomalies such as port scanning or data exfiltration, directly addressing the need for network threat protection in a compliant ML deployment.

Exam trap

AWS often tests the distinction between network threat detection (GuardDuty) and web application layer protection (WAF), leading candidates to incorrectly choose WAF for general network threat protection when it only addresses HTTP/S-based attacks.

50
Multi-Selectmedium

A company is building a chatbot using Amazon Bedrock. They want to ensure the model's responses are grounded in company-specific data and that harmful content is filtered out. Which two services or features should they use? (Choose TWO.)

Select 2 answers
A.Amazon Kendra
B.Bedrock Agents
C.Amazon Comprehend
D.Bedrock Guardrails
E.Amazon SageMaker JumpStart
AnswersA, D

Correct: Amazon Kendra can be used as a knowledge base for RAG to ground responses in company data.

Why this answer

Amazon Kendra is correct because it provides a managed search service that indexes company-specific data sources, enabling the Bedrock chatbot to retrieve relevant documents and ground its responses in authoritative information. Bedrock Guardrails is correct because it allows you to define content filters and topic policies to block harmful or undesirable outputs, ensuring the chatbot adheres to safety and compliance requirements.

Exam trap

AWS often tests the distinction between services that provide grounding (Kendra) versus those that orchestrate actions (Agents), and between content filtering (Guardrails) versus general NLP (Comprehend), leading candidates to confuse the roles of Bedrock Agents and Amazon Comprehend.

51
MCQeasy

A developer runs this AWS CLI command to invoke a model in us-west-2 but receives an error: 'An error occurred (ModelNotFoundException) when calling the InvokeModel operation: Model not found'. What is the most likely cause?

A.The request body is not properly formatted
B.The --region parameter is missing from the command
C.The model is not available in the us-west-2 region
D.The user's IAM role lacks permissions to invoke the model
AnswerC

Claude v2 is available only in certain regions like us-east-1.

Why this answer

The model anthropic.claude-v2 is not available in us-west-2; it is available in us-east-1. Option A (permissions) would give AccessDenied. Option B (body format) would give validation error.

Option D (region missing) is not the issue as region is specified.

52
MCQeasy

Refer to the exhibit. An AWS administrator sets up a SageMaker Model Monitor schedule for bias detection. What is the primary purpose of this configuration?

A.To retrain the model weekly
B.To replace the model with a new version
C.To monitor data drift in training data
D.To generate weekly bias reports for the deployed endpoint
AnswerD

The schedule runs monitoring jobs weekly, comparing current predictions to baseline and outputting results to S3.

Why this answer

The configuration creates a weekly monitoring job (cron expression for Monday) that compares endpoint predictions against a baseline and outputs bias reports to S3. It does not retrain, replace the model, or monitor training data drift.

53
Multi-Selecteasy

A data scientist wants to deploy a custom model built with TensorFlow to Amazon SageMaker for real-time inference. Which TWO steps are required? (Choose two.)

Select 2 answers
A.Create an Amazon ECR repository for the inference container
B.Upload the model artifacts to an S3 bucket
C.Submit a training job to SageMaker
D.Create a SageMaker endpoint configuration
E.Convert the model to ONNX format
AnswersB, D

Model artifacts must be stored in S3 for SageMaker to access.

Why this answer

Option B is correct because SageMaker requires model artifacts (the trained model files) to be stored in an S3 bucket before they can be used for inference. When deploying a custom TensorFlow model, you must upload the saved model (e.g., in SavedModel format) to S3, and then SageMaker will download it to the inference container during endpoint creation.

Exam trap

The trap here is that candidates often think they must build a custom container (Option A) or convert the model (Option E), but SageMaker's pre-built TensorFlow containers eliminate those steps, and the key requirements are simply uploading artifacts to S3 and creating the endpoint configuration.

54
MCQmedium

A developer is using the Amazon Bedrock InvokeModel API with the above request to summarize meeting notes. The response is a single word repeated many times. Which parameter is MOST likely causing this issue?

A.topP set to 0.9
B.stopSequences is empty
C.maxTokenCount set to 100
D.temperature set to 0
AnswerD

Temperature 0 makes output deterministic and prone to repetition.

Why this answer

A temperature of 0 forces the model to always select the highest-probability token at each step, which can lead to repetitive loops if the most likely token repeatedly points back to itself (e.g., the same word). This deterministic behavior eliminates randomness, causing the model to get stuck in a single-word cycle rather than generating diverse or coherent text.

Exam trap

AWS often tests the misconception that temperature only affects 'creativity' or 'randomness,' when in fact a temperature of 0 causes deterministic argmax selection, which can paradoxically produce repetitive or stuck outputs rather than simply 'less creative' text.

How to eliminate wrong answers

Option A is wrong because topP set to 0.9 (nucleus sampling) actually increases diversity by considering tokens whose cumulative probability reaches 0.9, which would reduce repetition, not cause it. Option B is wrong because an empty stopSequences list means no custom stopping conditions are applied, but this does not force repetition; the model would still generate until a natural stop (e.g., EOS token) or maxTokenCount is reached. Option C is wrong because maxTokenCount set to 100 only limits the total number of tokens generated; it does not influence token selection probability or cause a single word to repeat—it would simply stop after 100 tokens regardless of content.

55
MCQhard

A data scientist runs the SageMaker Clarify job shown in the exhibit for a credit risk model. After reviewing the results, they find a high bias metric for the gender facet. Which action is most consistent with responsible AI?

A.Proceed with deployment because the model is already in production
B.Remove the gender attribute from the training data and retrain
C.Investigate the root cause and retrain with balanced data
D.Increase the acceptance threshold for the model
AnswerC

Root cause analysis and retraining address bias.

Why this answer

Option C is correct because responsible AI requires understanding and mitigating bias at its source, not just masking it. Investigating the root cause (e.g., data collection bias, labeling bias, or proxy features) and retraining with balanced data directly addresses the high bias metric detected by SageMaker Clarify, aligning with AWS's principle of fairness. Simply removing the gender attribute may not eliminate bias if other features act as proxies, and increasing the threshold does not fix the underlying model bias.

Exam trap

Cisco often tests the misconception that simply removing a sensitive attribute (like gender) is sufficient to eliminate bias, but the trap here is that proxy features can still encode the same bias, making root-cause investigation and balanced retraining the only responsible action.

How to eliminate wrong answers

Option A is wrong because deploying a model with a known high bias metric violates responsible AI principles and could lead to unfair outcomes, even if the model is already in production; SageMaker Clarify is designed to detect such issues before or during deployment. Option B is wrong because removing the gender attribute alone does not guarantee bias removal—other features like zip code or income can act as proxies for gender, and the model may still learn biased correlations. Option D is wrong because increasing the acceptance threshold (e.g., for a binary classifier) only changes the decision boundary, not the underlying biased patterns learned by the model; it does not reduce the bias metric reported by Clarify.

56
MCQeasy

A social media company uses Amazon Comprehend to moderate user comments. They want to avoid censoring legitimate speech while catching hate speech. Which approach aligns with responsible AI governance?

A.Implement a human-in-the-loop review for borderline cases
B.Use multiple models and average their scores
C.Use a single model with high confidence threshold
D.Rely solely on automated filtering
AnswerA

Human review adds nuance.

Why this answer

Option B is correct: Human-in-the-loop review for borderline cases reduces false positives. Option A is wrong: High threshold may miss hate speech. Option C is wrong: Full automation can censor legitimate speech.

Option D is wrong: Averaging models may not help.

57
MCQmedium

A company wants to personalize its generative AI model for its specific domain without sharing data with third-party model providers. Which method should they use?

A.Fine-tuning the foundation model on their proprietary data
B.Prompt engineering with domain-specific examples
C.Retrieval-augmented generation (RAG) with a domain-specific knowledge base
D.Model distillation using a larger foundation model
AnswerA

Fine-tuning adapts the model to the domain using private data, and Bedrock supports this.

Why this answer

Amazon Bedrock allows fine-tuning of foundation models with customer data in a secure environment. Option A (Prompt engineering) doesn't personalize deeply. Option B (RAG) adds context but doesn't modify model.

Option C (Distillation) requires an existing model.

58
MCQeasy

A startup wants to build a product recommendation engine for their e-commerce platform. They have user purchase history and item metadata. They want a fully managed solution that can automatically train and deploy a recommendation model without needing to manage the underlying ML lifecycle. The solution should provide personalized recommendations based on collaborative filtering. Which AWS service should they use?

A.Use Amazon Kendra
B.Use Amazon Lex
C.Use Amazon Personalize
D.Use Amazon SageMaker built-in Factorization Machines algorithm
AnswerC

Personalize is fully managed and specifically designed for recommendation systems.

Why this answer

Amazon Personalize is a fully managed service that enables you to build and deploy recommendation models without managing the underlying ML lifecycle. It supports collaborative filtering out of the box, using user purchase history and item metadata to generate personalized recommendations, which directly matches the startup's requirements.

Exam trap

The trap here is that candidates may confuse Amazon SageMaker's built-in algorithms (like Factorization Machines) with a fully managed recommendation service, overlooking the requirement for automatic lifecycle management and instead focusing only on the algorithm capability.

How to eliminate wrong answers

Option A is wrong because Amazon Kendra is an intelligent search service that uses natural language processing to answer questions and retrieve documents, not a recommendation engine for collaborative filtering. Option B is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using speech and text, not for generating product recommendations. Option D is wrong because while Amazon SageMaker's built-in Factorization Machines algorithm can perform collaborative filtering, it requires you to manage the ML lifecycle (data preparation, training, deployment, scaling), which contradicts the requirement for a fully managed solution that automatically handles these tasks.

59
Multi-Selectmedium

Which THREE practices are recommended for promoting robustness and security in AI systems?

Select 3 answers
A.Deploy the model immediately after training without validation
B.Implement strong access controls and encryption for model artifacts
C.Regularly test the model against adversarial examples
D.Monitor model performance for data drift and concept drift
E.Remove logging and monitoring to improve performance
AnswersB, C, D

Security controls protect models from unauthorized access and tampering.

Why this answer

Robustness and security involve testing for adversarial inputs, monitoring data drift, and implementing secure access controls. Using unvalidated models is risky. Removing logging reduces traceability.

60
MCQeasy

A data scientist wants to quickly experiment with a pre-trained LLM for text generation without writing any code. Which AWS service is MOST suitable?

A.Amazon Bedrock
B.Amazon EC2
C.Amazon SageMaker JumpStart
D.AWS Lambda
AnswerC

SageMaker JumpStart offers pre-trained models with a simple deployment interface.

Why this answer

Amazon SageMaker JumpStart provides a curated set of pre-trained foundation models (including LLMs) that can be deployed with just a few clicks, requiring no code. This makes it the most suitable service for a data scientist who wants to quickly experiment with a pre-trained LLM for text generation without writing any code.

Exam trap

The trap here is that candidates may confuse Amazon Bedrock's managed API access to foundation models with a no-code solution, but Bedrock still requires code to call the API, whereas SageMaker JumpStart offers a true no-code deployment and testing interface.

How to eliminate wrong answers

Option A is wrong because Amazon Bedrock is a serverless service for building generative AI applications using foundation models via API calls, but it still requires at least minimal code (e.g., SDK or CLI) to invoke the model, not a no-code experiment. Option B is wrong because Amazon EC2 requires manual setup of the OS, environment, and model deployment, which involves significant code and configuration, not a quick no-code experiment. Option D is wrong because AWS Lambda is a serverless compute service for running code in response to events, and it requires writing and deploying code to invoke an LLM, not a no-code solution.

61
MCQmedium

A data scientist is training a binary classification model to predict customer churn. The dataset has 10,000 records with 9,500 non-churners and 500 churners. After training a logistic regression model, the model achieves 95% accuracy on the test set. However, the business team reports that the model is not useful because it predicts almost all customers as non-churners. Which metric should the data scientist use to evaluate the model's performance in this scenario?

A.Accuracy
B.R-squared
C.Precision
D.Recall
AnswerD

Recall measures the proportion of actual churners correctly identified, which is the key metric for this imbalanced problem.

Why this answer

Option D (Recall) is correct because in this highly imbalanced dataset (95% non-churners vs 5% churners), the model's 95% accuracy is misleading—it can achieve this by simply predicting the majority class (non-churner) for all samples. Recall measures the proportion of actual churners correctly identified (True Positives / (True Positives + False Negatives)), directly addressing the business need to detect churn. A high recall ensures the model captures most churners, even at the cost of some false positives.

Exam trap

Cisco often tests the misconception that high accuracy always indicates a good model, especially in imbalanced datasets, leading candidates to overlook metrics like recall or precision that better reflect model utility for the specific business problem.

How to eliminate wrong answers

Option A is wrong because accuracy is a poor metric for imbalanced datasets; a model that predicts all samples as the majority class can achieve high accuracy (95% here) while failing to identify any churners, making it useless for the business goal. Option B is wrong because R-squared is a metric for regression models, measuring the proportion of variance explained by the independent variables, and is not applicable to binary classification tasks like churn prediction. Option C is wrong because precision (True Positives / (True Positives + False Positives)) focuses on the correctness of positive predictions; while important, it does not capture the model's ability to find all churners—a model with high precision but low recall might still miss most churners, which is the core issue reported by the business team.

62
MCQhard

A generative AI application occasionally produces factually incorrect responses. The team has already tried prompt engineering and increasing the temperature parameter. Which next step is MOST effective to improve factual accuracy?

A.Use a larger foundation model
B.Fine-tune the model on company data
C.Reduce the temperature to 0
D.Implement a Retrieval Augmented Generation (RAG) pipeline
AnswerD

RAG retrieves relevant documents to ground the model's responses, improving factual accuracy.

Why this answer

Option D is correct because Retrieval Augmented Generation (RAG) provides external knowledge to ground responses. Option A (larger model) may not fix factual errors. Option B (lower temperature) can reduce randomness but not correct false facts.

Option C (fine-tuning with company data) could help but requires curated dataset; RAG is more direct for factual accuracy.

63
MCQhard

A healthcare company needs to use Amazon SageMaker Ground Truth for data labeling. The data includes protected health information (PHI) that must remain in the US. Which configuration meets the compliance requirements?

A.Use a vendor-managed workforce and set up data encryption
B.Use a private workforce consisting of the company's employees and launch the labeling job in the us-east-1 region
C.Use a public workforce (Mechanical Turk) and select the US East region
D.Use a private workforce but launch the labeling job in the eu-west-1 region
AnswerB

Private workforce ensures data is handled by employees under the company's control, and region choice ensures data residency.

Why this answer

Option B is correct because a private workforce consisting of the company's own employees ensures that PHI is never exposed to external workers, and launching the labeling job in us-east-1 keeps all data within the US, satisfying the data residency requirement. Amazon SageMaker Ground Truth allows you to restrict data access to a private workforce that you manage, and by selecting a US region, you ensure that data processing and storage remain within US borders.

Exam trap

The trap here is that candidates often assume that selecting a US region with a public workforce is sufficient for compliance, overlooking that PHI must not be exposed to external workers regardless of geographic location.

How to eliminate wrong answers

Option A is wrong because a vendor-managed workforce involves third-party vendors who may not have the same compliance controls for PHI, and data encryption alone does not guarantee that data remains within the US. Option C is wrong because a public workforce (Mechanical Turk) exposes PHI to anonymous external workers, which violates HIPAA and data privacy requirements, even if the region is set to US East. Option D is wrong because launching the labeling job in eu-west-1 (Ireland) violates the requirement that PHI must remain in the US, as data would be processed and stored in the European Union.

64
MCQmedium

A media company is using Amazon Bedrock to generate captions for images. They have a batch processing pipeline that sends thousands of images daily to the Bedrock API using the Titan Image Generator G1 model. Recently, they started receiving ThrottlingException errors during peak hours. The team needs to process all images within 24 hours without changing the model or the application code. The current account has a default quota of 10 requests per second (RPS) for the Titan model in us-east-1. The team estimates they need 50 RPS during peak hours. They have already implemented exponential backoff in the client, but the errors persist. What is the MOST effective solution to resolve the throttling issue?

A.Request a service quota increase for the InvokeModel API for the Titan model in us-east-1
B.Use Amazon SageMaker batch transform to process images offline
C.Distribute the requests across multiple AWS Regions
D.Switch to a different foundation model that has a higher default quota
AnswerA

Increasing quota directly resolves throttling.

Why this answer

The team has already implemented exponential backoff, but the errors persist because their current quota of 10 RPS is insufficient for the required 50 RPS. Requesting a service quota increase for the InvokeModel API for the Titan Image Generator G1 model in us-east-1 directly addresses the root cause by raising the throughput limit, allowing the existing application code and model to handle the peak load without any architectural changes.

Exam trap

The trap here is that candidates may think exponential backoff or distributing across Regions solves all throttling, but the core issue is a hard service quota that must be increased, not a transient rate limit.

How to eliminate wrong answers

Option B is wrong because Amazon SageMaker batch transform is designed for offline inference on SageMaker endpoints, not for invoking Bedrock APIs; it would require changing the application code and infrastructure, which the question explicitly prohibits. Option C is wrong because distributing requests across multiple AWS Regions would require modifying the application code to route traffic to different endpoints, and it does not address the underlying quota issue in the primary region; it also introduces latency and complexity. Option D is wrong because switching to a different foundation model would require changing the application code and potentially the image generation logic, which is not allowed; moreover, other models may have different default quotas or capabilities, and the goal is to process images with the Titan model.

65
MCQmedium

A startup is using Amazon Bedrock to power a virtual assistant. They need to ensure that personally identifiable information (PII) is not included in the model's responses. Which feature should they enable?

A.Enable PII redaction in the Bedrock guardrails.
B.Enable model invocation logging.
C.Configure a VPC endpoint.
D.Enable data encryption at rest.
AnswerA

Guardrails can redact PII from prompts and completions.

Why this answer

Amazon Bedrock Guardrails provide a configurable content filtering and PII redaction feature that can automatically detect and mask personally identifiable information (PII) in model inputs and outputs. By enabling PII redaction within guardrails, the startup can ensure that sensitive data like names, addresses, or credit card numbers are removed or obfuscated before the virtual assistant's responses reach the user. This is the direct and intended mechanism for preventing PII leakage in model responses.

Exam trap

The trap here is that candidates often confuse data protection features like encryption or logging with content filtering, not realizing that PII redaction is a specific guardrail policy that actively modifies model outputs in real time.

How to eliminate wrong answers

Option B is wrong because model invocation logging captures metadata and request/response payloads for auditing and debugging, but it does not actively redact or filter PII from responses — it only records what was sent and received. Option C is wrong because configuring a VPC endpoint provides private network connectivity to Bedrock without traversing the public internet, but it has no capability to inspect or modify the content of model responses for PII. Option D is wrong because enabling data encryption at rest protects stored data (e.g., logs, model artifacts) from unauthorized access, but it does not perform real-time redaction of PII in model outputs during inference.

66
MCQeasy

A developer is building an application that generates product descriptions from images using a multimodal model. Which AWS service provides access to multimodal foundation models?

A.Amazon Rekognition
B.Amazon Textract
C.Amazon Comprehend
D.Amazon Bedrock
AnswerD

Bedrock provides access to foundation models, including multimodal models that can generate text from images.

Why this answer

Option B, Amazon Bedrock, offers access to multimodal models like Claude 3 that can process images and text. Option A (Rekognition) is for image and video analysis, not generation. Option C (Textract) extracts text from documents.

Option D (Comprehend) is for NLP.

67
MCQmedium

A machine learning engineer wants to ensure that a SageMaker notebook instance only has access to a specific S3 bucket containing training data. The notebook instance is in a VPC. What is the most secure way to restrict access?

A.Use a VPC endpoint for S3 and a bucket policy that restricts access to the VPC endpoint.
B.Place the notebook instance in a private subnet with a NAT gateway.
C.Use AWS KMS to encrypt the bucket and grant the notebook role decrypt permissions.
D.Assign an IAM role to the notebook with an S3 bucket policy that only allows access to that bucket.
AnswerA

Combines network-level and resource-based policy to enforce access only from the VPC.

Why this answer

Option A is correct because using a VPC endpoint for S3 combined with a bucket policy that restricts access to that specific endpoint ensures that only traffic originating from within the VPC (and thus from the SageMaker notebook instance) can reach the S3 bucket. This approach enforces network-level isolation and prevents access from any other source, including the public internet, even if the IAM role is compromised. It is the most secure method because it layers network policy (VPC endpoint) with resource-based policy (bucket policy) to create a tightly scoped access control.

Exam trap

The trap here is that candidates often think IAM roles and bucket policies alone are sufficient for security, but they overlook the need for network-level restrictions (like VPC endpoints) to prevent data exfiltration or unauthorized access from outside the VPC.

How to eliminate wrong answers

Option B is wrong because placing the notebook instance in a private subnet with a NAT gateway still allows outbound internet traffic through the NAT gateway, which does not restrict access to a specific S3 bucket; the notebook could potentially reach any S3 bucket or internet resource, and the NAT gateway does not enforce bucket-level restrictions. Option C is wrong because using AWS KMS to encrypt the bucket and granting the notebook role decrypt permissions only protects data at rest but does not restrict which S3 bucket the notebook can access; the notebook could still access any bucket if the IAM policy allows it. Option D is wrong because assigning an IAM role with a bucket policy that only allows access to that bucket is a necessary but insufficient security measure; it does not prevent the notebook from accessing the bucket from outside the VPC or via the public internet, and it lacks the network-level restriction provided by a VPC endpoint.

68
MCQhard

A financial services company needs to use a foundation model for sensitive data analysis. They require that all data remains within a VPC and no data leaves the AWS network. Which solution should they choose?

A.Use Amazon Comprehend with VPC.
B.Use Amazon Bedrock with a public endpoint.
C.Use Amazon Bedrock with a custom model and VPC endpoints.
D.Use Amazon SageMaker with a VPC-only real-time endpoint hosting a foundation model.
AnswerC

Bedrock custom models support VPC endpoints to keep data within the network.

Why this answer

Amazon Bedrock with a custom model and VPC endpoints ensures that all data remains within the VPC and never traverses the public internet, meeting the requirement for sensitive data analysis. VPC endpoints (AWS PrivateLink) allow private connectivity to Bedrock, and a custom model can be deployed within the VPC for inference, keeping data entirely within the AWS network.

Exam trap

The trap here is that candidates may confuse SageMaker's VPC hosting capabilities with Bedrock's managed service model, or assume that any AWS service with VPC support (like Comprehend) can serve as a foundation model solution, when in fact Bedrock is the only managed service designed for foundation model access with private VPC endpoints.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is a natural language processing service that does not provide foundation model capabilities for generative AI tasks, and its VPC support only applies to data processing, not to hosting or invoking foundation models. Option B is wrong because using Amazon Bedrock with a public endpoint means data and requests travel over the public internet, violating the requirement that no data leaves the AWS network. Option D is wrong because Amazon SageMaker with a VPC-only real-time endpoint hosting a foundation model would require you to self-manage the model and infrastructure, which is not the recommended approach for using a managed foundation model service like Bedrock, and it does not leverage Bedrock's VPC endpoint integration for private access.

69
Multi-Selectmedium

Which TWO actions should a data scientist take to evaluate fairness of a binary classification model using Amazon SageMaker Clarify? (Choose two.)

Select 2 answers
A.Use post-training bias metrics like Difference in Positive Proportions
B.Ensure the training dataset is balanced by resampling
C.Generate SHAP values for feature importance
D.Use pre-training bias metrics such as Class Imbalance
E.Run a data quality monitoring job on unlabeled data
AnswersA, D

Post-training metrics compare predictions across groups.

Why this answer

Options A and D are correct. SageMaker Clarify computes pre-training bias metrics (like Class Imbalance) and post-training metrics (like Difference in Positive Proportions) to assess fairness. Option B is for explainability, not bias.

Option C is for unlabeled data, not model evaluation. Option E is a general practice but not specific to Clarify fairness evaluation.

70
MCQmedium

An application uses this configuration to enable RAG. What is required for the knowledge base to function?

A.The agent must have internet access to retrieve documents
B.The embedding model ARN must include the account ID
C.The embedding model must be fine-tuned on the domain data
D.The knowledge base must have a vector index configured in Amazon OpenSearch Serverless
AnswerD

A vector store is required to index and query the embedded documents.

Why this answer

Option A is correct because a vector index (e.g., in Amazon OpenSearch Serverless) is necessary to store and retrieve embeddings. Option B (fine-tune embedding model) is optional. Option C (internet access) is not needed.

Option D (account ID in ARN) is incorrect format.

71
MCQmedium

A company deployed a chatbot using Amazon Lex integrated with a Lambda function that invokes Claude on Amazon Bedrock. The Lambda function retrieves relevant documents from an Amazon Kendra index to use as context. Users report that the chatbot's responses are often irrelevant or incorrect despite the Kendra index containing accurate information. The logs show that the Lambda function is correctly passing retrieved documents to the model. What is the most likely cause and solution?

A.Switch to a larger foundation model like Claude 3 Opus
B.The model's temperature is set too high; reduce it to 0.1
C.The maximum tokens limit is too low; increase it to 4096
D.The chunking strategy for documents is too coarse or inappropriate; refine chunking and use semantic search in Kendra
AnswerD

Proper chunking ensures each chunk contains coherent information relevant to potential queries; Kendra's semantic search improves relevance.

Why this answer

The issue likely stems from the chunking and retrieval strategy. If the retrieved document chunks do not contain the exact answer or are poorly segmented, the model may not have the necessary context. Improving chunking to be more semantic and ensuring retrieval uses a relevant similarity metric (e.g., using Kendra's relevance tuning) would help.

Increasing temperature or reducing tokens would degrade quality. Switching model may not address the root cause.

72
MCQhard

A security team is concerned about adversarial attacks on their image classification model deployed on Amazon SageMaker. They want to test robustness against carefully crafted inputs that cause misclassification. What approach should they use?

A.Data augmentation on the training set
B.A/B testing between two similar models
C.SageMaker Model Monitor with adversarial drift
D.Generating adversarial examples using SageMaker Clarify
AnswerD

Clarify includes adversarial validation capabilities to test robustness.

Why this answer

Option D is correct because adversarial validation involves generating adversarial examples using attacks like FGSM to test model robustness. Model Monitor (A) does not test adversarial robustness. Data augmentation (B) improves generalization but is not a test.

A/B testing (C) compares models, not robustness.

73
Multi-Selectmedium

A company is building a generative AI application using Amazon Bedrock and needs to ensure that the model does not generate outputs containing personally identifiable information (PII). Which TWO actions should the company take? (Choose 2)

Select 2 answers
A.Implement a custom AWS Lambda function to scan and redact PII from inputs and outputs.
B.Use AWS Identity and Access Management (IAM) policies to restrict model access.
C.Enable Amazon CloudWatch Logs to capture and audit model outputs.
D.Configure Amazon Bedrock Guardrails to block or mask PII.
E.Place the Bedrock model endpoint within a private VPC.
AnswersA, D

Lambda can use PII detection libraries to filter sensitive data.

Why this answer

Option A is correct because a custom AWS Lambda function can be integrated into the application workflow to programmatically scan and redact PII from both inputs and outputs before they reach or leave the Bedrock model. This provides a flexible, code-driven approach to data sanitization, allowing the use of libraries like Amazon Comprehend or regex patterns to detect and mask PII entities such as names, addresses, and social security numbers.

Exam trap

Cisco often tests the distinction between network-level security controls (like VPCs) and content-level data protection mechanisms, leading candidates to mistakenly choose VPC isolation as a solution for PII redaction.

74
MCQhard

A company wants to use a large language model to generate code based on natural language descriptions. They need to minimize latency and control costs by running inference on their own infrastructure. Which approach is most suitable?

A.Use Amazon Bedrock API
B.Use Amazon SageMaker to deploy a custom LLM
C.Use Amazon Comprehend
D.Use Amazon Lex
AnswerB

SageMaker can deploy models on customer-specified instances, giving control over latency and cost.

Why this answer

Option B, using Amazon SageMaker to deploy a custom LLM, allows the company to run inference on their own infrastructure with controlled costs and latency. Option A (Amazon Bedrock) is a managed service and does not run on customer infrastructure. Option C (Amazon Lex) is for conversational bots, not code generation.

Option D (Amazon Comprehend) is for NLP tasks like sentiment analysis, not code generation.

75
MCQmedium

Refer to the exhibit. A data scientist runs an Amazon SageMaker Clarify bias analysis on a binary classifier. The pre-training ClassImbalance is 1.5 and the post-training DPPL is 0.15. What should the data scientist conclude?

A.The data is highly imbalanced and the model is unbiased.
B.The data has a mild class imbalance, but the model shows a noticeable bias in predictions.
C.The pre-training metric indicates a fairness issue, but the post-training metric is acceptable.
D.The data is perfectly balanced and the model is fair.
AnswerB

ClassImbalance of 1.5 is moderate; DPPL of 0.15 indicates a 15% difference, which is concerning.

Why this answer

Option B is correct. A ClassImbalance of 1.5 indicates the majority class is 1.5x the minority, mild imbalance. A DPPL of 0.15 indicates a 15% difference in positive prediction rates between groups, which is a significant fairness concern.

Option A misinterprets both; C is wrong because bias is present; D confuses the metrics.

Page 1 of 7

Page 2

All pages