CCNA Applications of Foundation Models Questions — Page 1 of 2

Multi-Selecthard

Which THREE practices are recommended for responsible AI when deploying foundation models? (Choose three.)

Select 3 answers

A.Avoid collecting user feedback to reduce bias

B.Include human review for high-stakes decisions

C.Implement guardrails to filter harmful content

D.Continuously monitor model outputs for drift

E.Use a black box approach to keep model internals secret

AnswersB, C, D

Human oversight is crucial for critical applications.

Why this answer

Options A, B, and D are correct. Implementing guardrails helps prevent harmful outputs. Monitoring for drift ensures the model remains safe over time.

Human review for critical decisions adds oversight. Option C (avoiding feedback) is irresponsible. Option E (black box approach) is not recommended.

Practice this question →

MCQhard

A company uses Amazon Bedrock with a custom model deployed via Amazon SageMaker. They want to monitor for data drift in input prompts over time. Which AWS service is best suited for this?

A.Amazon CloudWatch

B.Amazon SageMaker Model Monitor

C.AWS CloudTrail

D.Amazon Athena

AnswerB

Model Monitor can be configured to capture input data and detect drift using statistical methods.

Why this answer

Amazon SageMaker Model Monitor is the correct choice because it is specifically designed to detect data drift in machine learning models, including input prompts for custom models deployed via SageMaker. It continuously monitors the distribution of input data against a baseline and alerts when drift occurs, which aligns with the requirement to monitor input prompts over time.

Exam trap

The trap here is that candidates often confuse general monitoring services like CloudWatch with specialized ML monitoring tools, assuming CloudWatch can handle data drift detection when it actually lacks the statistical analysis required for such tasks.

How to eliminate wrong answers

Option A is wrong because Amazon CloudWatch is a monitoring service for AWS resources and applications (e.g., metrics, logs, alarms), but it does not have built-in capabilities to detect data drift in ML model inputs. Option C is wrong because AWS CloudTrail records API activity for auditing and governance, not for monitoring data drift in model inputs. Option D is wrong because Amazon Athena is an interactive query service for analyzing data in S3 using SQL, not a monitoring tool for data drift.

Practice this question →

MCQhard

A company is building a multi-modal application that processes images and text to answer questions about product defects. Which foundation model approach is BEST?

A.Use an image captioning model and then analyze the caption text

B.Use a text-to-image generation model and analyze the generated image

C.Use a multi-modal foundation model that processes both images and text

D.Use a separate image analysis model and a text model, then combine outputs

AnswerC

Multi-modal models are designed for joint understanding of images and text.

Why this answer

Option C is correct because multi-modal foundation models (e.g., CLIP, Flamingo, GPT-4V) are specifically designed to jointly process and reason over images and text in a unified architecture. This allows the model to directly correlate visual defects with textual descriptions without intermediate lossy transformations, making it the most effective approach for a multi-modal QA task.

Exam trap

AWS often tests the misconception that combining two separate single-modal models (Option D) is equivalent to a true multi-modal model, but the trap is that late fusion lacks the joint embedding and cross-attention mechanisms needed for coherent multi-modal reasoning.

How to eliminate wrong answers

Option A is wrong because image captioning models convert the entire image into a single text caption, losing fine-grained spatial and defect-specific details that are critical for accurate defect analysis. Option B is wrong because text-to-image generation models create new images from text, which is the inverse of the required task and cannot analyze existing product images for defects. Option D is wrong because using separate models and combining outputs introduces a late-fusion bottleneck, where alignment between visual features and text is not learned end-to-end, leading to poorer performance on tasks requiring joint reasoning.

Practice this question →

MCQhard

Refer to the exhibit. You are trying to invoke a foundation model via Amazon Bedrock but receive this error. What should you do to resolve it?

A.Increase the service quota for Bedrock

B.Request model access in the Bedrock console

C.Attach the AmazonBedrockFullAccess IAM policy

D.Use a different AWS Region

AnswerB

Model access must be requested and approved before use.

Why this answer

The error indicates that the user has not been granted access to the specific foundation model in Amazon Bedrock. Even with valid IAM permissions, each model requires explicit access approval via the Bedrock console's 'Model access' section. Option B is correct because requesting model access there provisions the necessary service-level authorization.

Exam trap

AWS often tests the distinction between IAM permissions and service-level model access, trapping candidates who assume that a full-access IAM policy automatically grants access to all foundation models.

How to eliminate wrong answers

Option A is wrong because increasing the service quota addresses limits on concurrent invocations or throughput, not the 'access denied' error for a foundation model. Option C is wrong because the AmazonBedrockFullAccess IAM policy grants permissions to use Bedrock APIs, but it does not grant access to specific foundation models; model access is a separate approval process. Option D is wrong because the error is not region-specific; model access must be requested in each region where you intend to use the model, and changing regions without requesting access would produce the same error.

Practice this question →

MCQeasy

A company wants to build a chatbot that responds to customer queries using a foundation model. They need low latency and want to avoid managing infrastructure. Which AWS service should they use?

A.Amazon EC2

B.AWS Lambda

C.Amazon Bedrock

D.Amazon SageMaker

AnswerC

Correct: Bedrock is serverless and provides API access to foundation models.

Why this answer

Amazon Bedrock is a fully managed service that provides access to foundation models (FMs) from leading AI providers via a simple API, eliminating the need to manage underlying infrastructure. It is designed for building generative AI applications like chatbots with low latency, as it handles model hosting, scaling, and inference optimization automatically. This makes it the ideal choice for the company's requirement of low-latency responses without infrastructure management.

Exam trap

AWS often tests the misconception that AWS Lambda can handle any serverless workload, but candidates must recognize that Lambda is unsuitable for large model inference due to its execution time, memory, and GPU limitations, whereas Bedrock is purpose-built for foundation model access.

How to eliminate wrong answers

Option A is wrong because Amazon EC2 requires you to provision, configure, and manage virtual servers, including installing and maintaining the foundation model and its dependencies, which contradicts the requirement to avoid managing infrastructure. Option B is wrong because AWS Lambda is a serverless compute service for running short-duration code (up to 15 minutes) and is not designed to host large foundation models; it lacks the GPU support and memory capacity needed for model inference. Option D is wrong because Amazon SageMaker is a machine learning platform that requires you to manage endpoints, instances, and scaling for model deployment, which still involves infrastructure management and does not provide the fully managed, API-based access to foundation models that Bedrock offers.

Practice this question →

MCQhard

A company wants to adapt a foundation model for a custom domain with very limited labeled data and minimal cost. Which approach is most suitable?

A.Pre-training from scratch

B.Prompt engineering with few-shot examples

C.Reinforcement learning from human feedback

D.Full fine-tuning

AnswerB

This provides in-context learning with no training cost.

Why this answer

Prompt engineering with few-shot examples allows the model to learn from context without expensive fine-tuning, ideal for small datasets.

Practice this question →

MCQhard

Refer to the exhibit. An IAM policy is attached to a user. Which models can the user invoke?

A.Only Claude v2

B.No models

C.Claude v2 and any model with a name containing 'claude'

D.Any model in the account

AnswerA

The Allow grants access to Claude v2; the Deny using NotResource denies everything else.

Why this answer

The IAM policy explicitly allows the `bedrock:InvokeModel` action only on the resource ARN `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2`. This means the user can invoke only the Claude v2 model. No other models, including other Claude versions or any model with 'claude' in its name, are permitted because the resource ARN is exact and does not use wildcards.

Exam trap

Cisco often tests the distinction between an exact resource ARN and a wildcard pattern; candidates mistakenly assume that 'claude' in the model ID implies all Claude models are allowed, but without a wildcard, only the exact model specified is permitted.

How to eliminate wrong answers

Option B is wrong because the policy does allow invocation of Claude v2, so the user can invoke at least one model. Option C is wrong because the policy uses an exact resource ARN (`anthropic.claude-v2`), not a wildcard pattern like `*claude*`; models with 'claude' in their name but not exactly `claude-v2` are not allowed. Option D is wrong because the policy restricts invocation to a single specific model, not any model in the account.

Practice this question →

MCQhard

A financial services company is deploying a foundation model on Amazon Bedrock to generate compliance reports from internal audit logs. The model must not output any personally identifiable information (PII). They have configured a Bedrock Guardrail with sensitive information filters set to the 'HIGH' sensitivity level. During testing in a staging environment, testers still observed PII being occasionally generated in the report outputs. The guardrail did not block these instances because the PII was embedded in a context that the guardrail's pattern matching did not catch (e.g., structured JSON data with embedded names). The company requires a solution that minimizes latency and cost, as they process thousands of reports daily. They cannot afford to increase inference time significantly due to strict SLAs. They also want to avoid re-engineering the entire solution. Which additional step should they take to effectively eliminate PII leakage while maintaining performance?

A.Add a prompt instruction to the model to never output PII, with few-shot examples of non-PII outputs.

B.Fine-tune the foundation model on a dataset that excludes PII.

C.Increase the guardrail sensitivity to 'MAXIMUM'.

D.Implement a post-processing Lambda function that uses Amazon Comprehend's PII detection to scan and redact any PII from the model output before returning it.

AnswerD

Correct: Amazon Comprehend provides robust PII detection that can catch context-based PII. The Lambda function can be optimized for low latency and added cost is minimal.

Why this answer

Option C adds a dedicated PII detection layer using Amazon Comprehend, which is accurate and can be optimized for latency. Option A may increase false positives and misses. Option B is expensive.

Option D is unreliable.

Practice this question →

MCQhard

A data science team fine-tuned a foundation model on Amazon SageMaker for sentiment analysis of customer reviews. They deployed the model as a real-time endpoint. After a successful launch, the application experienced a surge in traffic, and the endpoint's latency increased from 200ms to over 2 seconds. The team needs to reduce latency and maintain high throughput without increasing costs significantly. They are using a single ml.g5.xlarge instance. What change should the team make first?

A.Enable automatic scaling for the endpoint.

B.Increase the instance type to ml.g5.4xlarge.

C.Compile the model using SageMaker Neo to optimize for inference.

D.Switch to a batch transform job instead of real-time.

AnswerC

Neo optimizes models for specific hardware, improving speed without additional cost.

Why this answer

Option C is correct because compiling the model with SageMaker Neo optimizes the model for the target hardware, significantly reducing inference latency without increasing compute cost. Option A (upgrade instance) increases cost. Option B (switch to batch) is not suitable for real-time.

Option D (auto scaling) adds instances but does not reduce per-request latency; it may increase cost.

Practice this question →

MCQeasy

A company is using Amazon Bedrock to generate code snippets. Developers report that the generated code sometimes contains security vulnerabilities. Which action should the team take to mitigate this risk?

A.Deploy the model in a sandbox environment to limit its access to sensitive systems.

B.Implement a manual code review process after generation.

C.Add a system prompt that instructs the model to follow security best practices and avoid known vulnerabilities.

D.Reduce the temperature parameter to 0 to make the output deterministic.

AnswerC

A system prompt sets expectations and can reduce the likelihood of insecure code generation.

Why this answer

Option C is correct because adding a system prompt that instructs the model to follow security best practices and avoid known vulnerabilities directly influences the model's output at inference time. Amazon Bedrock supports system prompts that act as high-level instructions to guide the foundation model's behavior, making this a proactive, scalable mitigation that does not require manual intervention or architectural changes.

Exam trap

AWS often tests the misconception that reducing temperature or isolating the environment can fix output quality issues, when in fact only prompt-level guidance directly addresses the model's generation behavior.

How to eliminate wrong answers

Option A is wrong because deploying the model in a sandbox environment limits access to sensitive systems but does not prevent the model from generating code with security vulnerabilities; the model's output itself remains unchanged. Option B is wrong because implementing a manual code review process after generation is a reactive measure that does not reduce the risk at the source; it adds human overhead and delays, and is not a mitigation that addresses the model's tendency to produce insecure code. Option D is wrong because reducing the temperature parameter to 0 makes the output deterministic but does not teach or enforce security best practices; it only reduces randomness, not the likelihood of generating vulnerable patterns.

Practice this question →

MCQeasy

A company wants to use a foundation model to automatically moderate user-generated content. The model must filter out inappropriate content with high accuracy. Which Amazon service is best suited for this task?

A.Amazon Translate

B.Amazon Rekognition

C.Amazon Polly

D.Amazon Comprehend

AnswerD

Comprehend offers content moderation features.

Why this answer

Amazon Comprehend is the correct choice because it is a natural language processing (NLP) service that can analyze text for sentiment, key phrases, and — critically — toxicity and inappropriate content using built-in or custom classifiers. This directly matches the requirement to moderate user-generated text with high accuracy, as it can detect hate speech, profanity, and other harmful language.

Exam trap

The trap here is that candidates may confuse Amazon Rekognition's ability to detect unsafe content in images with the need to moderate text, leading them to select Rekognition instead of recognizing that Comprehend is the NLP service for text analysis.

How to eliminate wrong answers

Option A is wrong because Amazon Translate is a machine translation service that converts text between languages; it has no capability to analyze or moderate content for appropriateness. Option B is wrong because Amazon Rekognition is designed for image and video analysis (e.g., object detection, facial recognition, unsafe content detection in visual media), not for moderating text-based user-generated content. Option C is wrong because Amazon Polly is a text-to-speech service that converts written text into lifelike speech; it performs no content moderation or analysis.

Practice this question →

MCQeasy

A startup needs to build a real-time text translation feature for a customer chat application. Latency must be under 200 ms per request. Which AWS approach is BEST suited?

A.Use Amazon Comprehend for language detection and a custom translation model

B.Use Amazon Bedrock with a multilingual foundation model

C.Use Amazon Translate with real-time translation

D.Use Amazon Transcribe and then Amazon Bedrock

AnswerC

Amazon Translate is a purpose-built service for translation with low latency.

Why this answer

Amazon Translate's real-time translation API is purpose-built for low-latency text translation, typically achieving sub-200 ms response times for small payloads. It directly translates text without the overhead of running a large foundation model or performing intermediate steps like transcription, making it the best fit for this latency-sensitive chat application.

Exam trap

The trap here is that candidates may assume a large foundation model (e.g., via Bedrock) is always the best choice for multilingual tasks, overlooking that purpose-built services like Amazon Translate are specifically optimized for low-latency, high-throughput translation at a fraction of the cost and complexity.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is designed for natural language processing (e.g., sentiment analysis, entity extraction), not real-time translation, and building a custom translation model would introduce significant latency and complexity. Option B is wrong because Amazon Bedrock with a multilingual foundation model introduces inference latency that often exceeds 200 ms for real-time requests, and it is not optimized for the single-purpose, high-throughput translation task required here. Option D is wrong because Amazon Transcribe is for speech-to-text, not text translation, and chaining it with Bedrock adds unnecessary latency and complexity for a text-only translation feature.

Practice this question →

Multi-Selectmedium

A company is building a chatbot using Amazon Bedrock. They want to ensure the model's responses are grounded in company-specific data and that harmful content is filtered out. Which two services or features should they use? (Choose TWO.)

Select 2 answers

A.Amazon Kendra

B.Bedrock Agents

C.Amazon Comprehend

D.Bedrock Guardrails

E.Amazon SageMaker JumpStart

AnswersA, D

Correct: Amazon Kendra can be used as a knowledge base for RAG to ground responses in company data.

Why this answer

Amazon Kendra is correct because it provides a managed search service that indexes company-specific data sources, enabling the Bedrock chatbot to retrieve relevant documents and ground its responses in authoritative information. Bedrock Guardrails is correct because it allows you to define content filters and topic policies to block harmful or undesirable outputs, ensuring the chatbot adheres to safety and compliance requirements.

Exam trap

AWS often tests the distinction between services that provide grounding (Kendra) versus those that orchestrate actions (Agents), and between content filtering (Guardrails) versus general NLP (Comprehend), leading candidates to confuse the roles of Bedrock Agents and Amazon Comprehend.

Practice this question →

MCQeasy

A data scientist wants to quickly experiment with a pre-trained LLM for text generation without writing any code. Which AWS service is MOST suitable?

A.Amazon Bedrock

B.Amazon EC2

C.Amazon SageMaker JumpStart

D.AWS Lambda

AnswerC

SageMaker JumpStart offers pre-trained models with a simple deployment interface.

Why this answer

Amazon SageMaker JumpStart provides a curated set of pre-trained foundation models (including LLMs) that can be deployed with just a few clicks, requiring no code. This makes it the most suitable service for a data scientist who wants to quickly experiment with a pre-trained LLM for text generation without writing any code.

Exam trap

The trap here is that candidates may confuse Amazon Bedrock's managed API access to foundation models with a no-code solution, but Bedrock still requires code to call the API, whereas SageMaker JumpStart offers a true no-code deployment and testing interface.

How to eliminate wrong answers

Option A is wrong because Amazon Bedrock is a serverless service for building generative AI applications using foundation models via API calls, but it still requires at least minimal code (e.g., SDK or CLI) to invoke the model, not a no-code experiment. Option B is wrong because Amazon EC2 requires manual setup of the OS, environment, and model deployment, which involves significant code and configuration, not a quick no-code experiment. Option D is wrong because AWS Lambda is a serverless compute service for running code in response to events, and it requires writing and deploying code to invoke an LLM, not a no-code solution.

Practice this question →

MCQhard

A generative AI application occasionally produces factually incorrect responses. The team has already tried prompt engineering and increasing the temperature parameter. Which next step is MOST effective to improve factual accuracy?

A.Use a larger foundation model

B.Fine-tune the model on company data

C.Reduce the temperature to 0

D.Implement a Retrieval Augmented Generation (RAG) pipeline

AnswerD

RAG retrieves relevant documents to ground the model's responses, improving factual accuracy.

Why this answer

Option D is correct because Retrieval Augmented Generation (RAG) provides external knowledge to ground responses. Option A (larger model) may not fix factual errors. Option B (lower temperature) can reduce randomness but not correct false facts.

Option C (fine-tuning with company data) could help but requires curated dataset; RAG is more direct for factual accuracy.

Practice this question →

MCQmedium

A startup is using Amazon Bedrock to power a virtual assistant. They need to ensure that personally identifiable information (PII) is not included in the model's responses. Which feature should they enable?

A.Enable PII redaction in the Bedrock guardrails.

B.Enable model invocation logging.

C.Configure a VPC endpoint.

D.Enable data encryption at rest.

AnswerA

Guardrails can redact PII from prompts and completions.

Why this answer

Amazon Bedrock Guardrails provide a configurable content filtering and PII redaction feature that can automatically detect and mask personally identifiable information (PII) in model inputs and outputs. By enabling PII redaction within guardrails, the startup can ensure that sensitive data like names, addresses, or credit card numbers are removed or obfuscated before the virtual assistant's responses reach the user. This is the direct and intended mechanism for preventing PII leakage in model responses.

Exam trap

The trap here is that candidates often confuse data protection features like encryption or logging with content filtering, not realizing that PII redaction is a specific guardrail policy that actively modifies model outputs in real time.

How to eliminate wrong answers

Option B is wrong because model invocation logging captures metadata and request/response payloads for auditing and debugging, but it does not actively redact or filter PII from responses — it only records what was sent and received. Option C is wrong because configuring a VPC endpoint provides private network connectivity to Bedrock without traversing the public internet, but it has no capability to inspect or modify the content of model responses for PII. Option D is wrong because enabling data encryption at rest protects stored data (e.g., logs, model artifacts) from unauthorized access, but it does not perform real-time redaction of PII in model outputs during inference.

Practice this question →

MCQhard

A financial services company needs to use a foundation model for sensitive data analysis. They require that all data remains within a VPC and no data leaves the AWS network. Which solution should they choose?

A.Use Amazon Comprehend with VPC.

B.Use Amazon Bedrock with a public endpoint.

C.Use Amazon Bedrock with a custom model and VPC endpoints.

D.Use Amazon SageMaker with a VPC-only real-time endpoint hosting a foundation model.

AnswerC

Bedrock custom models support VPC endpoints to keep data within the network.

Why this answer

Amazon Bedrock with a custom model and VPC endpoints ensures that all data remains within the VPC and never traverses the public internet, meeting the requirement for sensitive data analysis. VPC endpoints (AWS PrivateLink) allow private connectivity to Bedrock, and a custom model can be deployed within the VPC for inference, keeping data entirely within the AWS network.

Exam trap

The trap here is that candidates may confuse SageMaker's VPC hosting capabilities with Bedrock's managed service model, or assume that any AWS service with VPC support (like Comprehend) can serve as a foundation model solution, when in fact Bedrock is the only managed service designed for foundation model access with private VPC endpoints.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is a natural language processing service that does not provide foundation model capabilities for generative AI tasks, and its VPC support only applies to data processing, not to hosting or invoking foundation models. Option B is wrong because using Amazon Bedrock with a public endpoint means data and requests travel over the public internet, violating the requirement that no data leaves the AWS network. Option D is wrong because Amazon SageMaker with a VPC-only real-time endpoint hosting a foundation model would require you to self-manage the model and infrastructure, which is not the recommended approach for using a managed foundation model service like Bedrock, and it does not leverage Bedrock's VPC endpoint integration for private access.

Practice this question →

MCQmedium

Refer to the exhibit. You receive this response from Amazon Bedrock. What is the most likely cause of the incomplete information?

A.The max_tokens limit was reached

B.The prompt was too short

C.The temperature was too high

D.The model lacks knowledge about capitals

AnswerA

stop_reason: max_tokens indicates the output was capped by the token limit.

Why this answer

The response from Amazon Bedrock shows an incomplete sentence that cuts off mid-thought, which is a classic symptom of hitting the max_tokens limit. When the generated output reaches the specified maximum number of tokens, the model stops generating immediately, resulting in truncated text. This is the most likely cause because the output is syntactically incomplete but otherwise coherent up to the cutoff point.

Exam trap

AWS often tests the distinction between output truncation (max_tokens) and output quality issues (temperature, prompt engineering), so the trap here is that candidates may incorrectly attribute a truncated response to model ignorance or randomness rather than the explicit token limit.

How to eliminate wrong answers

Option B is wrong because the prompt length does not directly cause incomplete output; a short prompt can still produce a complete response if the max_tokens limit is high enough. Option C is wrong because temperature controls randomness and creativity, not the length or truncation of the output; high temperature might produce less coherent text but would not cut off mid-sentence. Option D is wrong because the model's lack of knowledge about capitals would result in incorrect or hallucinated information, not a truncated or incomplete sentence.

Practice this question →

Multi-Selecteasy

Which TWO of the following are benefits of using Amazon Bedrock for building applications with foundation models?

Select 2 answers

A.No infrastructure management

B.Automatic model fine-tuning

C.Access to multiple foundation models

D.Free tier for all models

E.Built-in image generation capability

AnswersA, C

Bedrock is serverless; AWS handles the underlying infrastructure.

Why this answer

Amazon Bedrock is a fully managed service that abstracts away the underlying infrastructure required to host and run foundation models (FMs). By using Bedrock, you do not need to provision, configure, or manage servers, GPUs, or scaling policies, which is a key benefit for developers who want to focus on building applications rather than managing infrastructure. Additionally, Bedrock provides a single API to access multiple FMs from providers like AI21 Labs, Anthropic, Cohere, Meta, and Stability AI, enabling you to choose the best model for your use case without managing separate endpoints or integrations.

Exam trap

AWS often tests the misconception that Amazon Bedrock includes built-in capabilities like automatic fine-tuning or image generation, when in reality these are model-specific features that you must explicitly select and configure, not inherent service features.

Practice this question →

MCQhard

A company is using Amazon Bedrock to generate product descriptions. They notice that the model sometimes produces descriptions that contain factual errors about the products. Which TWO actions should they take to improve factual accuracy?

A.Implement Retrieval Augmented Generation (RAG) with a product knowledge base

B.Reduce the temperature parameter to 0.1

C.Use a curated prompt with few-shot examples of accurate descriptions

D.Increase the max_tokens to allow longer descriptions

E.Use human reviewers to correct errors after generation

AnswerA, C

RAG provides current, accurate information to the model.

Why this answer

Option A is correct because Retrieval Augmented Generation (RAG) grounds the model's output in a curated product knowledge base, allowing it to retrieve and cite authoritative facts during generation. This directly reduces hallucinations by ensuring the model references verified data rather than relying solely on its parametric memory.

Exam trap

Cisco often tests the misconception that tuning generation parameters (like temperature or max_tokens) can fix factual accuracy, when in reality only grounding techniques like RAG or curated few-shot examples address the underlying hallucination problem.

How to eliminate wrong answers

Option B is wrong because reducing the temperature parameter to 0.1 makes the model more deterministic and repetitive, but it does not introduce factual grounding—it only reduces randomness, which can still produce plausible-sounding but incorrect facts. Option D is wrong because increasing max_tokens allows longer descriptions but does not improve factual accuracy; it may even increase the chance of generating more hallucinated content. Option E is wrong because human reviewers after generation is a validation step, not a method to improve the model's factual accuracy at inference time; it adds latency and cost without addressing the root cause of factual errors.

Practice this question →

MCQmedium

A development team uses a foundation model via Amazon Bedrock to generate code snippets. They notice that the model sometimes produces code with security vulnerabilities, such as SQL injection. The team wants to reduce these occurrences without delaying project timelines. What should they do?

A.Manually review all generated code before deployment.

B.Switch to a smaller, specialized code generation model.

C.Fine-tune the model on a dataset of secure code examples.

D.Use Amazon Bedrock's guardrails to filter insecure code.

AnswerC

Fine-tuning directly teaches the model to follow secure coding patterns.

Why this answer

Option B is correct because fine-tuning the model on a curated dataset of secure code examples teaches the model to generate safer code. Option A (switch to smaller model) may not address security specifically. Option C (use Bedrock guardrails) is for content filtering, not code analysis.

Option D (manual review) is time-consuming and does not reduce occurrence rate.

Practice this question →

MCQmedium

A developer is using Amazon Bedrock with the Claude model for text summarization. The output sometimes includes inaccurate information. What is the best practice to reduce hallucinations?

A.Use a larger model

B.Increase temperature

C.Use retrieval augmented generation

D.Decrease max tokens

AnswerC

RAG provides the model with relevant context from a knowledge base, improving factual accuracy.

Why this answer

Retrieval Augmented Generation (RAG) grounds the model with factual data from a knowledge base, reducing hallucinations. Increasing temperature (A) may increase randomness. Using a larger model (C) does not guarantee accuracy.

Decreasing max tokens (D) might truncate output but not address factual accuracy.

Practice this question →

MCQmedium

A developer is building a RAG-based Q&A bot with Amazon Bedrock Knowledge Bases. They need a managed vector store for document embeddings. Which service should they use?

A.Amazon OpenSearch Serverless

B.Amazon DynamoDB

C.Amazon RDS

D.Amazon S3

AnswerA

OpenSearch Serverless with k-NN plugin provides managed vector storage.

Why this answer

Amazon Bedrock Knowledge Bases requires a vector store to store and query document embeddings for Retrieval-Augmented Generation (RAG). Amazon OpenSearch Serverless provides a managed, scalable vector engine that supports k-NN (k-nearest neighbor) search, making it the correct choice for this use case. It integrates natively with Bedrock Knowledge Bases to handle embedding storage and similarity search without manual infrastructure management.

Exam trap

The trap here is that candidates may confuse Amazon DynamoDB or Amazon RDS as viable options because they can store data, but they lack native vector search capabilities required for RAG, leading to an incorrect choice.

How to eliminate wrong answers

Option B (Amazon DynamoDB) is wrong because it is a key-value and document database that does not natively support vector similarity search or k-NN indexing, making it unsuitable as a vector store for RAG. Option C (Amazon RDS) is wrong because it is a relational database service that lacks built-in vector search capabilities; while extensions like pgvector for PostgreSQL exist, Amazon RDS is not a managed vector store and would require custom implementation. Option D (Amazon S3) is wrong because it is an object storage service that cannot perform vector similarity queries; it can store raw documents but not embeddings in a searchable vector index.

Practice this question →

MCQhard

A healthcare company needs to use a foundation model for analyzing medical records while complying with HIPAA. They plan to use Amazon Bedrock. What should they do to meet HIPAA requirements?

A.Use a model that is HIPAA eligible in a region that supports BAA

B.Implement access logging for all API calls

C.Encrypt data at rest and in transit

D.All of the above

AnswerD

All three are required for HIPAA compliance with Bedrock.

Why this answer

Option D is correct because HIPAA compliance in Amazon Bedrock requires a combination of controls: using a HIPAA-eligible model in a region where AWS offers a Business Associate Addendum (BAA), enabling access logging for auditability, and encrypting data at rest and in transit. None of the individual options alone satisfy all HIPAA requirements; only the full set of controls ensures compliance.

Exam trap

The trap here is that candidates often pick a single security control (like encryption or logging) thinking it alone ensures HIPAA compliance, but the exam tests that HIPAA requires a combination of administrative, physical, and technical safeguards, all of which must be addressed.

How to eliminate wrong answers

Option A is wrong because while using a HIPAA-eligible model in a BAA-supported region is necessary, it does not address audit logging or encryption requirements. Option B is wrong because access logging alone provides audit trails but does not ensure the model is HIPAA-eligible or that data encryption is enforced. Option C is wrong because encrypting data at rest and in transit is critical but does not cover the need for a BAA or access logging.

All three are required together.

Practice this question →

Multi-Selecteasy

Which TWO of the following are benefits of using Amazon Bedrock for foundation models compared to managing your own infrastructure? (Select TWO.)

Select 2 answers

A.Higher throughput for custom models

B.Built-in content moderation

C.Access to multiple foundation models

D.Serverless experience

E.Full control over model weights

AnswersC, D

Bedrock offers various models from different providers.

Why this answer

Amazon Bedrock provides a serverless experience (A) so you don't manage infrastructure, and it offers access to multiple foundation models (C) from a single API. Full control over model weights (B) is not possible as Bedrock is managed. Higher throughput (D) is not guaranteed.

Built-in content moderation (E) is a feature but not a primary benefit over managed infrastructure.

Practice this question →

MCQhard

A legal firm wants to use a foundation model to extract key clauses from thousands of contracts. Accuracy is critical, and the model must not hallucinate or fabricate information. The firm has a large internal database of labeled contracts. Which approach should they take?

A.Use a smaller model specifically designed for legal text.

B.Fine-tune the model on the labeled contracts using Amazon Bedrock's fine-tuning capability.

C.Use a pre-trained model with detailed prompts and few-shot examples.

D.Implement Retrieval-Augmented Generation (RAG) using Amazon Bedrock and a vector store of contract clauses.

AnswerD

RAG retrieves relevant clauses to provide context, minimizing fabrication.

Why this answer

Option A is correct because Retrieval-Augmented Generation (RAG) grounds model outputs in retrieved relevant documents, reducing hallucinations. Option B (fine-tuning) may still hallucinate on unseen clauses. Option C (pre-trained with few-shot) lacks grounding.

Option D (specialized model) may not have sufficient accuracy without retrieval.

Practice this question →

MCQhard

A company uses Amazon Bedrock to generate product descriptions. They need to ensure outputs do not contain offensive language. Which service should they integrate to filter content?

A.Amazon Comprehend

B.Amazon Rekognition

C.Bedrock Guardrails

D.AWS WAF

AnswerC

Guardrails offers configurable content filters for safety and compliance.

Why this answer

Amazon Bedrock Guardrails is the correct choice because it is specifically designed to enforce content policies for foundation model outputs, including filtering for offensive language, hate speech, and other harmful content. It integrates directly with Bedrock to apply customizable safety filters and deny topics without requiring additional services or custom code.

Exam trap

The trap here is that candidates often confuse Amazon Comprehend's text analysis capabilities (like sentiment detection) with real-time content filtering, but Comprehend lacks the policy enforcement and integration with Bedrock that Guardrails provides.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is a natural language processing (NLP) service for extracting insights like sentiment, entities, and key phrases from text, but it does not provide real-time content filtering or policy enforcement for Bedrock outputs. Option B is wrong because Amazon Rekognition is an image and video analysis service that detects objects, faces, and text in visual media, not a text-based content filter for offensive language. Option D is wrong because AWS WAF is a web application firewall that protects HTTP/HTTPS endpoints from common web exploits like SQL injection and cross-site scripting, not a content moderation filter for LLM-generated text.

Practice this question →

MCQeasy

A company is building a customer support chatbot using Amazon Bedrock. They need to store conversation history for context across sessions. Which AWS service is best suited for this purpose?

A.Amazon S3

B.Amazon DynamoDB

C.Amazon RDS

D.Amazon ElastiCache

AnswerB

DynamoDB provides fast, scalable storage for session state and conversation history.

Why this answer

Amazon DynamoDB is a NoSQL database ideal for storing session data due to its low latency and scalability, making it the best choice for conversation history.

Practice this question →

MCQhard

A law firm uses a foundation model to draft legal briefs. To ensure accuracy, they want to ground the model's outputs in authoritative legal sources. They have a large database of prior case law and statutes stored in Amazon S3. The firm's IT team must implement a solution that reduces hallucinations while being cost-effective. The solution should allow the model to retrieve relevant documents and generate responses based on them. Which approach should they take?

A.Fine-tune the model on the legal database.

B.Manually attach relevant documents to each prompt.

C.Use a larger foundation model with more parameters.

D.Use Amazon Bedrock Agents to create a RAG application.

AnswerD

RAG retrieves relevant documents in real-time, providing factual grounding.

Why this answer

Option B is correct because Amazon Bedrock Agents with a knowledge base can implement Retrieval-Augmented Generation (RAG): the agent retrieves relevant documents from S3 and uses them as context for the model, grounding responses and reducing hallucinations. Option A (fine-tuning) is expensive and does not guarantee grounding for all queries. Option C (manual attachment) is not scalable.

Option D (larger model) increases cost without solving hallucination.

Practice this question →

MCQhard

A healthcare company is deploying a conversational AI using a foundation model on Amazon Bedrock for patient triage. The application must minimize hallucinations and ensure factual accuracy. Which combination of techniques should the team implement?

A.Implement Retrieval-Augmented Generation (RAG) using a knowledge base on Amazon Bedrock and a system prompt demanding factual responses.

B.Fine-tune the model on a large dataset of medical transcripts and deploy with default parameters.

C.Use reinforcement learning from human feedback (RLHF) on the deployed model.

D.Set the maxTokens to a low value to force shorter, more focused answers.

AnswerA

RAG retrieves relevant documents to ground the answer, and system prompts can enforce constraints, reducing hallucinations.

Why this answer

Option C is correct because RAG grounds responses in retrieved documents, and system prompts can enforce safety and accuracy constraints. Option A is wrong because fine-tuning alone may still lead to hallucinations if the training data is incomplete. Option B is wrong because RLHF is complex to implement on Bedrock and doesn't directly ground responses.

Option D is wrong because reducing max tokens does not improve accuracy.

Practice this question →

MCQmedium

A developer uses Amazon Bedrock to generate code. Some outputs contain syntax errors. What is the most likely cause?

A.The prompt lacks constraints or examples

B.The max_tokens is too low

C.The temperature is too high

D.The model lacks knowledge of the language

AnswerA

Insufficient guidance leads to incorrect or incomplete code.

Why this answer

Providing clear constraints, examples, and instructions in the prompt is critical for code quality. Lack thereof often leads to errors.

Practice this question →

MCQeasy

Which pricing model does Amazon Bedrock use for foundation model inference?

A.Per-request

B.Per-hour instance

C.Per-GB storage

D.Per-token

AnswerD

Bedrock is pay-per-token for both input and output.

Why this answer

Amazon Bedrock charges per token processed (input and output), making it cost-effective for variable usage.

Practice this question →

MCQhard

A security engineer creates the above IAM policy to allow a user to invoke an Amazon Bedrock model. However, invocation fails. What is the issue?

A.The action should be "bedrock:InvokeModelWithResponseStream".

B.The resource ARN is missing the account ID.

C.The ARN should use "foundation-model" instead of "model".

D.The statement is missing a condition for the model ID.

AnswerC

The resource type for foundation models is 'foundation-model', not 'model'.

Why this answer

Option C is correct because the IAM policy's resource ARN incorrectly uses 'model' in the path, but Amazon Bedrock requires 'foundation-model' to reference foundation models. The correct ARN format for invoking a Bedrock foundation model is 'arn:aws:bedrock:region::foundation-model/model-id'. Using 'model' instead of 'foundation-model' causes the policy to not match any valid Bedrock resource, resulting in an invocation failure.

Exam trap

AWS often tests the distinction between 'model' and 'foundation-model' in Bedrock ARNs, as candidates may assume all Bedrock models use the same resource type, overlooking that foundation models require a specific path.

How to eliminate wrong answers

Option A is wrong because 'bedrock:InvokeModelWithResponseStream' is a separate action for streaming responses, but the standard 'bedrock:InvokeModel' action is sufficient for non-streaming invocation; the failure is not due to the action name. Option B is wrong because the resource ARN for Bedrock foundation models does not require an account ID; the ARN format uses a double colon (::) in the account ID position, which is correct for service-owned resources. Option D is wrong because a condition for the model ID is optional and not required for invocation; the primary issue is the incorrect resource type in the ARN.

Practice this question →

Multi-Selecteasy

Which TWO techniques can reduce the cost of running a fine-tuned foundation model on Amazon SageMaker? (Choose TWO.)

Select 2 answers

A.Implement structured pruning to remove less important model parameters.

B.Use larger instance types with more GPUs to speed up inference.

C.Apply model quantization to reduce precision from FP32 to FP16 or INT8.

D.Store the model parameters in FP32 to maintain accuracy during inference.

E.Increase the number of training epochs to achieve higher accuracy.

AnswersA, C

Pruning creates a smaller model that is cheaper to run.

Why this answer

Structured pruning reduces the number of parameters in the model by removing entire neurons, channels, or layers that contribute little to the output. This directly shrinks the model size and computational requirements, leading to lower memory usage and faster inference on SageMaker, which reduces cost.

Exam trap

AWS often tests the distinction between techniques that reduce inference cost (pruning, quantization) versus those that improve training speed or accuracy, leading candidates to mistakenly select options that increase resource usage or are irrelevant to inference cost.

Practice this question →

MCQeasy

A startup company is developing an e-commerce platform and wants to use Amazon Bedrock to generate product descriptions automatically. They have a small team of developers who are not machine learning experts. The product catalog is stored in a DynamoDB table, and each product has attributes like name, category, price, and a brief description. The company wants the generated descriptions to reflect the unique brand voice, which is documented in a few internal style guides stored as PDF files in Amazon S3. They need a solution that allows them to quickly test the approach without significant infrastructure changes or model training. The development team is familiar with AWS SDKs and want to minimize ongoing maintenance. The team has already set up a Bedrock foundation model (Claude) and can make API calls. They tested simple prompts but the output lacked the brand's informal yet professional tone. They want to incorporate examples from the style guides directly into the prompt without retraining. The team fears that including the entire style guide in each prompt would exceed token limits and increase costs. Which approach should they take to effectively incorporate the brand voice with minimal changes?

A.Fine-tune the foundation model using the style guides with Amazon Bedrock Custom Models.

B.Use Amazon Bedrock with a custom prompt template that includes a few representative examples from the style guides as few-shot examples in the system prompt.

C.Concatenate all style guide PDFs into a single text and include it in every prompt.

D.Use Amazon Comprehend to analyze the style guides and extract a list of keywords to include in the prompt.

AnswerB

Correct: Few-shot examples in the prompt can teach the model the brand voice without retraining. The team can select a few representative examples to keep token count low.

Why this answer

Option A uses in-context learning with carefully selected examples from the style guides, which is simple and avoids retraining. Option B requires fine-tuning, which is complex and not quick. Option C increases token usage and cost.

Option D does not address brand voice.

Practice this question →

MCQmedium

A company uses Amazon Bedrock to automatically generate product descriptions for their e-commerce website. They use a prompt that includes product attributes and a short description as a starting point. Recently, the generated descriptions have become overly verbose, including irrelevant details and sometimes even incorrect product specifications. The team has tried simplifying the prompt and reducing the max tokens, but the issue persists. The descriptions must be concise and accurate. What is the most effective next step to address this problem?

A.Switch to a larger foundation model that handles details better.

B.Increase the temperature parameter to 0.9 to make the model more deterministic.

C.Decrease the top_p parameter to 0.1 and keep max tokens low.

D.Use a negative prompt specifying 'do not include unnecessary details'.

AnswerC

Lowering top_p focuses on the most likely tokens, reducing irrelevant details; low max tokens enforces conciseness.

Why this answer

Option B is correct because decreasing the top_p parameter to 0.1 forces the model to choose from a smaller, more probable set of tokens, making the output more focused and less likely to include irrelevant information. Keeping max tokens low enforces conciseness. Option A (increase temperature) would increase randomness and potentially worsen the issue.

Option C (switch to larger model) may increase verbosity and cost without guarantee of improvement. Option D (negative prompt) might help but is less reliable than parameter tuning.

Practice this question →

Multi-Selecthard

A marketing team is using a foundation model to generate marketing copy. Which THREE of the following should they consider to ensure responsible and cost-effective use?

Select 3 answers

A.Bias mitigation to avoid unfair stereotypes

B.Cost per token for the model

C.Model size (number of parameters)

D.Toxicity detection in generated content

E.Latency of model inference

AnswersA, B, D

Reduces risk of biased messaging that can harm brand reputation.

Why this answer

Option A is correct because bias mitigation is essential for responsible AI use; foundation models can perpetuate harmful stereotypes if not carefully monitored, and the marketing team must ensure their generated copy does not unfairly target or misrepresent any group. This aligns with AWS's responsible AI principles, including fairness and avoiding bias in model outputs.

Exam trap

AWS often tests the misconception that model size (parameters) is a key cost driver, but in practice, cost is tied to token consumption and inference infrastructure, not just parameter count, and latency is a performance metric, not a cost or responsibility factor.

Practice this question →

MCQmedium

A company is using a foundation model on Amazon Bedrock to generate customer support responses. They notice that the model sometimes produces harmful or offensive content. Which approach is MOST effective to mitigate this issue?

A.Use prompt engineering to instruct the model to avoid harmful content

B.Enable model invocation logging to review and block responses

C.Fine-tune the model on a curated dataset of safe responses

D.Configure Amazon Bedrock Guardrails with content filters

AnswerD

Guardrails provide configurable filters that block harmful content at inference time.

Why this answer

Amazon Bedrock Guardrails provides configurable content filters that can block harmful, offensive, or inappropriate content in both user inputs and model outputs. This is the most effective approach because it operates at the inference layer, applying safety policies consistently across all requests without requiring model retraining or manual review. Prompt engineering alone is unreliable, and fine-tuning may not generalize to all harmful content patterns.

Exam trap

Cisco often tests the misconception that prompt engineering or fine-tuning alone is sufficient for safety, when in fact a dedicated guardrail mechanism is required for reliable, policy-based content filtering at inference time.

How to eliminate wrong answers

Option A is wrong because prompt engineering can be easily bypassed by adversarial inputs or model drift, and it does not provide deterministic enforcement of safety policies. Option B is wrong because model invocation logging only records responses for auditing; it does not block harmful content in real time. Option C is wrong because fine-tuning on a curated dataset of safe responses reduces but does not eliminate the risk of generating harmful content, especially for edge cases or novel inputs not seen during training.

Practice this question →

Multi-Selecteasy

A company uses Amazon Bedrock to build a question-answering system. Which THREE features of Amazon Bedrock can improve answer accuracy? (Choose three.)

Select 3 answers

A.Retrieval Augmented Generation (RAG)

B.Auto-scaling of provisioned throughput

C.Model fine-tuning

D.Encryption at rest

E.Prompt engineering

AnswersA, C, E

RAG retrieves factual information from a knowledge base to improve answer accuracy.

Why this answer

Retrieval Augmented Generation (RAG) improves answer accuracy by retrieving relevant, up-to-date information from external knowledge bases (e.g., Amazon OpenSearch Serverless or Aurora) and providing it as context to the foundation model. This grounds the model's response in factual data, reducing hallucinations and enabling accurate answers without retraining.

Exam trap

AWS often tests the distinction between features that improve accuracy (RAG, fine-tuning, prompt engineering) versus features that improve operational aspects like scalability (auto-scaling) or security (encryption), leading candidates to mistakenly select non-accuracy-related options.

Practice this question →

Multi-Selecteasy

Which THREE of the following are benefits of using Amazon Bedrock for foundation models?

Select 3 answers

A.Ability to fine-tune models

B.Access to multiple models via single API

C.Guaranteed output accuracy

D.Built-in monitoring and governance

E.Serverless infrastructure

AnswersB, D, E

Bedrock provides a single API to invoke various models.

Why this answer

Amazon Bedrock provides a single API endpoint that allows you to access multiple foundation models from different providers (e.g., Anthropic, AI21 Labs, Meta, Stability AI) without managing separate integrations. This simplifies application development and reduces operational overhead by abstracting the underlying model infrastructure.

Exam trap

Cisco often tests the distinction between a service's inherent benefits (like serverless infrastructure and unified API) and optional features (like fine-tuning), leading candidates to mistakenly select fine-tuning as a universal benefit.

Practice this question →

Multi-Selectmedium

A company is building a chatbot using Amazon Bedrock. They want to provide up-to-date information from a continuously changing database. Which TWO services can be used as a data source for a Bedrock knowledge base? (Select TWO.)

Select 2 answers

A.Amazon Kendra

B.Amazon S3

C.Amazon RDS for MySQL

D.Amazon DynamoDB

E.AWS Glue

AnswersA, B

A Kendra index can be used as a knowledge base source in Bedrock.

Why this answer

Amazon Bedrock knowledge bases can directly ingest data from Amazon S3, which is a supported data source for indexing documents. Amazon Kendra is also a supported data source, allowing Bedrock to leverage existing Kendra indexes for retrieval-augmented generation (RAG). Both services integrate natively with Bedrock knowledge bases to provide up-to-date information from continuously changing data.

Exam trap

AWS often tests the misconception that any AWS database or data processing service can be a direct data source for Bedrock knowledge bases, but only S3, Kendra, and Salesforce are supported.

Practice this question →

MCQhard

A marketing firm uses Amazon Bedrock to generate ad copy. They notice that the generated text often includes factual inaccuracies about their products. Which technique would most effectively reduce these inaccuracies?

A.Implement Retrieval-Augmented Generation (RAG) with a product knowledge base.

B.Use longer, more detailed prompts.

C.Increase the temperature parameter to 0.9.

D.Fine-tune the model on a dataset of previous ad copies.

AnswerA

RAG enables the model to retrieve and cite authoritative information, reducing hallucinations.

Why this answer

Retrieval-Augmented Generation (RAG) grounds the model's output in a trusted, external knowledge base by retrieving relevant product documents before generating text. This directly addresses factual inaccuracies because the model references authoritative data rather than relying solely on its parametric memory, which may contain outdated or incorrect information.

Exam trap

Cisco often tests the misconception that fine-tuning or prompt engineering alone can fix factual accuracy issues, when in reality RAG is the standard solution for grounding model outputs in external, verifiable data.

How to eliminate wrong answers

Option B is wrong because longer prompts do not fix the underlying knowledge gap; they only provide more context but cannot inject new, accurate facts that the model lacks. Option C is wrong because increasing temperature to 0.9 increases randomness and creativity, which would likely worsen factual inaccuracies by encouraging more hallucinated or divergent outputs. Option D is wrong because fine-tuning on previous ad copies would reinforce existing patterns and biases, including any inaccuracies present in the training data, rather than introducing a reliable source of truth.

Practice this question →

MCQhard

A media company is using Amazon Bedrock to generate marketing copy with a foundation model. They want to ensure the output adheres to brand voice guidelines (e.g., friendly, professional). Which prompt engineering strategy is most effective for this requirement?

A.Provide five example outputs in the prompt that match the desired tone.

B.Include instructions like 'Do not use technical jargon' in every user prompt.

C.Set the temperature parameter to a low value (e.g., 0.1) to reduce randomness.

D.Use a system prompt that explicitly describes the brand voice and expectations.

AnswerD

System prompts set the role and tone, effectively guiding the model's style for all subsequent interactions.

Why this answer

Option D is correct because Amazon Bedrock supports system prompts that set overarching context and behavioral guidelines for the model. By explicitly describing the brand voice (e.g., 'friendly, professional') in the system prompt, the model consistently applies these constraints across all user interactions, which is more effective than per-instruction tuning.

Exam trap

AWS often tests the misconception that parameter tuning (like temperature) or few-shot examples are sufficient for style control, when in fact system prompts provide the most direct and scalable mechanism for enforcing behavioral constraints in foundation models.

How to eliminate wrong answers

Option A is wrong because providing example outputs (few-shot prompting) can guide tone but is less reliable than a system prompt for consistent adherence across diverse inputs, and it consumes prompt token budget without guaranteeing the model internalizes the rule. Option B is wrong because including instructions like 'Do not use technical jargon' in every user prompt is redundant, inefficient, and can be overridden by the model's tendency to follow the most recent instruction, whereas a system prompt sets a persistent baseline. Option C is wrong because lowering the temperature parameter reduces randomness but does not enforce specific brand voice constraints; it only makes outputs more deterministic, which may still produce off-tone content if the model's training data lacks the desired style.

Practice this question →

MCQmedium

A healthcare company uses Amazon Bedrock to generate patient summaries. They need to ensure no protected health information (PHI) is leaked in the output. Which AWS service can they use to detect and mask PHI in text?

A.Amazon Comprehend Medical

B.Amazon Macie

C.AWS Glue

D.Amazon Rekognition

AnswerA

Comprehend Medical can identify and mask PHI such as patient names and dates.

Why this answer

Amazon Comprehend Medical is specifically designed to extract and identify protected health information (PHI) from unstructured medical text using natural language processing (NLP). It can detect entities such as patient names, dates, medical conditions, and medications, and provides APIs to mask or redact that PHI before output. This makes it the correct choice for the healthcare company's requirement to prevent PHI leakage in patient summaries generated by Amazon Bedrock.

Exam trap

AWS often tests the distinction between general-purpose data protection services (like Macie) and domain-specific medical NLP services (like Comprehend Medical), leading candidates to choose Macie because it is associated with sensitive data discovery, even though it cannot perform inline text masking.

How to eliminate wrong answers

Option B (Amazon Macie) is wrong because Macie is a data security service that discovers and protects sensitive data stored in Amazon S3 using machine learning and pattern matching, but it does not provide real-time PHI detection or masking in text streams or API outputs. Option C (AWS Glue) is wrong because Glue is a serverless data integration service for ETL (extract, transform, load) jobs, not a text analysis or PHI detection service. Option D (Amazon Rekognition) is wrong because Rekognition is an image and video analysis service that can detect objects, faces, and text in media, but it is not designed to identify or mask PHI in textual data.

Practice this question →

MCQeasy

A startup needs to generate product descriptions from bullet points using a foundation model. They want a fully managed serverless experience. Which AWS service should they use?

A.Amazon Comprehend

B.Amazon Bedrock

C.Amazon Polly

D.Amazon Lex

AnswerB

Bedrock offers serverless foundation models for generation tasks.

Why this answer

Amazon Bedrock is a fully managed serverless service that provides access to foundation models (FMs) from leading AI providers via an API, making it ideal for generating product descriptions from bullet points. It eliminates infrastructure management while allowing you to invoke models like Anthropic Claude or Amazon Titan for text generation tasks.

Exam trap

The trap here is that candidates confuse Amazon Comprehend (a text analysis service) with a generative AI service, or assume Polly or Lex can generate text descriptions when they are specialized for speech and conversation, respectively.

How to eliminate wrong answers

Option A is wrong because Amazon Comprehend is a natural language processing (NLP) service for extracting insights (e.g., sentiment, entities) from text, not for generating new content from bullet points. Option C is wrong because Amazon Polly is a text-to-speech service that converts text into lifelike speech, not a foundation model for text generation. Option D is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using automatic speech recognition and natural language understanding, not for generating product descriptions from bullet points.

Practice this question →

MCQmedium

A healthcare company processes patient records using a foundation model on Amazon Bedrock. They must ensure no patient data is used to improve the base model. What is the MOST effective configuration?

A.Create a VPC endpoint for Bedrock

B.Set a data retention policy to store logs for only 30 days

C.Disable model improvement logging in the Bedrock service settings

D.Enable data encryption at rest and in transit

AnswerC

Disabling model improvement prevents AWS from using the data to improve the base model.

Why this answer

Option C is correct because disabling model improvement logging ensures data is not used for training. Option A (data encryption) does not prevent usage for training. Option B (private endpoints) secures traffic but not data usage.

Option D (retain logs) could still allow AWS to use data for model improvement.

Practice this question →

Multi-Selectmedium

A company is using Amazon Bedrock to generate images. They want to ensure the outputs comply with content policies. Which TWO AWS services can help? (Choose two.)

Select 2 answers

A.Amazon Augmented AI (A2I)

B.Amazon Rekognition

C.Amazon GuardDuty

D.AWS WAF

E.Amazon Comprehend

AnswersA, B

A2I enables human review of flagged images to enforce content policies.

Why this answer

Amazon Augmented AI (A2I) is correct because it enables human review of model predictions to ensure compliance with content policies. For image generation on Bedrock, A2I can route outputs that fall below a confidence threshold to human reviewers, providing a safety net for policy adherence.

Exam trap

AWS often tests the distinction between services that analyze images (Rekognition) versus text (Comprehend), and candidates may mistakenly choose Comprehend for image content moderation without recognizing it is NLP-only.

Practice this question →

MCQmedium

A multinational corporation uses a foundation model via Amazon Bedrock to translate internal communication documents from English to multiple languages. They notice that the translations often miss company-specific jargon and acronyms, leading to confusion. The company has a glossary of approved translations for terms like 'Project Atlas' and 'Operation Synergy.' They want to improve translation accuracy quickly and with minimal effort. What approach should they take?

A.Use prompt engineering to include the glossary in each translation request.

B.Use a larger foundation model that has better language understanding.

C.Fine-tune the foundation model on a corpus of bilingual company documents.

D.Switch to Amazon Translate with custom terminology.

AnswerA

Including the glossary in the prompt directly informs the model of the correct translations.

Why this answer

Option B is correct because including the glossary in the prompt is a simple and effective method: the model can use the provided translations for specific terms. Option A (fine-tuning) requires data preparation and training. Option C (Amazon Translate with custom terminology) is a different service not using the FM.

Option D (larger model) may not address specific jargon.

Practice this question →

MCQmedium

A developer is using Amazon Bedrock to generate text summaries. The output sometimes includes irrelevant information. What is the most effective prompt engineering technique to improve relevance?

A.Add a negative prompt specifying what to avoid

B.Use few-shot examples with summaries

C.Increase max tokens

D.Decrease temperature

AnswerB

Few-shot examples show the model desired output patterns, improving relevance.

Why this answer

Few-shot examples provide the model with explicit patterns of desired output, directly guiding it to produce summaries that match the format and content of the examples. This technique is the most effective for improving relevance because it gives the model concrete reference points, reducing the likelihood of including irrelevant information.

Exam trap

AWS often tests the misconception that adjusting generation parameters (like temperature or token limits) can substitute for explicit prompt structure, when in fact few-shot examples directly teach the model the expected output format and content relevance.

How to eliminate wrong answers

Option A is wrong because negative prompts (e.g., 'avoid irrelevant details') are less reliable in foundation models; they can be ignored or misinterpreted, and they do not provide the structured guidance that few-shot examples offer. Option C is wrong because increasing max tokens only expands the output length, which can actually increase the chance of including irrelevant information rather than improving relevance. Option D is wrong because decreasing temperature reduces randomness but does not teach the model what relevant content looks like; it may still produce irrelevant information if the prompt lacks clear examples.

Practice this question →

Multi-Selecthard

A data scientist is fine-tuning a foundation model on SageMaker. They want to prevent overfitting. Which THREE actions can help? (Select THREE.)

Select 3 answers

A.Apply dropout

B.Increase training data size

C.Increase the number of epochs

D.Use early stopping

E.Use a smaller learning rate

AnswersA, B, D

Dropout prevents co-adaptation of neurons.

Why this answer

Option A is correct because dropout is a regularization technique that randomly drops a fraction of neurons during training, which prevents the model from relying too heavily on specific features and reduces overfitting. In SageMaker, dropout can be applied via framework-specific APIs (e.g., `tf.keras.layers.Dropout` in TensorFlow) or by configuring the model architecture in the training script.

Exam trap

AWS often tests the misconception that increasing epochs or using a smaller learning rate directly prevents overfitting, when in fact these are hyperparameter tuning strategies that can exacerbate or fail to address overfitting without explicit regularization.

Practice this question →

MCQhard

A financial services company uses a foundation model for document analysis. They need to ensure the model does not output sensitive customer information from its training data. What is the most effective mitigation?

A.Implement output filtering using an external service

B.Choose a model that has been fine-tuned on financial data

C.Apply data masking before sending input

D.Use a private endpoint

AnswerA

Output filtering can scan and block responses containing sensitive data.

Why this answer

Output filtering using an external service is the most effective mitigation because it acts as a post-processing layer that can detect and redact sensitive customer information (e.g., PII, account numbers) before the model's response is returned to the user. This approach does not rely on the model's internal training or input modifications, which can be bypassed or incomplete. It provides a robust, policy-driven control that can be updated independently of the model.

Exam trap

The trap here is that candidates often confuse input-side controls (like data masking or fine-tuning) with output-side controls, assuming that protecting the input or training the model on domain data is sufficient to prevent leakage of memorized sensitive information.

How to eliminate wrong answers

Option B is wrong because fine-tuning on financial data does not guarantee the model will not memorize or regurgitate sensitive customer information from its original training data; fine-tuning adjusts the model's behavior but does not erase existing memorized data. Option C is wrong because data masking before sending input only protects the input data, not the model's outputs; the model could still output sensitive information from its training data that was never masked. Option D is wrong because using a private endpoint secures the network connection and access control but does not prevent the model from generating outputs containing sensitive training data; it addresses data-in-transit security, not output content safety.

Practice this question →

Multi-Selecthard

Which THREE are best practices for ensuring generated content complies with corporate brand guidelines when using Amazon Bedrock?

Select 3 answers

A.Implement guardrails to restrict tone, topics, and language

B.Use prompt engineering to specify brand voice and style

C.Increase the temperature for more creative outputs

D.Use random prompts to test variability

E.Fine-tune the model on a dataset of brand-compliant content

AnswersA, B, E

Guardrails enforce content policies at inference time.

Why this answer

Option A is correct because Amazon Bedrock Guardrails allow you to define policies that restrict the model's output to specific tones, topics, and language, ensuring alignment with corporate brand guidelines. By configuring denied topics and content filters, you can prevent the model from generating off-brand or inappropriate content, directly enforcing compliance at the inference layer.

Exam trap

AWS often tests the misconception that increasing temperature or using random prompts can help enforce brand guidelines, when in fact these actions increase variability and reduce control, directly opposing the goal of compliance.

Practice this question →

MCQmedium

A company fine-tunes a foundation model on SageMaker using a custom dataset. They notice the training job takes too long. Which optimization technique is specifically designed to reduce training time for foundation models?

A.Distributed training using SageMaker Data Parallelism

B.Using a smaller instance type

C.Using Spot Instances

D.Reducing batch size

AnswerA

Data parallelism partitions the data and trains across multiple devices, reducing wall-clock time.

Why this answer

SageMaker Data Parallelism distributes the training workload across multiple GPUs or instances, splitting the data and synchronizing gradients using optimized all-reduce algorithms. This specifically reduces training time for large foundation models by enabling parallel computation, which is the most direct technique for accelerating training at scale.

Exam trap

AWS often tests the misconception that cost-saving techniques like Spot Instances or smaller instances also improve performance, but the question specifically asks for optimization to reduce training time, not cost.

How to eliminate wrong answers

Option B is wrong because using a smaller instance type reduces computational capacity, which would increase training time rather than reduce it. Option C is wrong because Spot Instances reduce cost by using spare AWS capacity, but they do not inherently speed up training; they may even cause interruptions that prolong total time. Option D is wrong because reducing batch size can actually slow convergence and increase the number of training steps, potentially increasing overall training time.

Practice this question →

Multi-Selectmedium

A company is using Amazon Bedrock to generate marketing content. They want to evaluate the quality of the generated text. Which TWO metrics are most appropriate for evaluating text quality?

Select 2 answers

A.Precision

B.Perplexity

C.Accuracy

D.F1 score

E.BLEU (Bilingual Evaluation Understudy)

AnswersB, E

Perplexity measures how well the model predicts the text.

Why this answer

Perplexity measures how well a language model predicts a sample, with lower values indicating higher confidence and coherence in generated text. BLEU evaluates the overlap between generated text and reference text, making it suitable for assessing fluency and relevance in content generation tasks like marketing copy.

Exam trap

AWS often tests the distinction between classification metrics (precision, accuracy, F1) and generation evaluation metrics (perplexity, BLEU), leading candidates to mistakenly apply classification concepts to text quality assessment.

Practice this question →

MCQhard

A developer sends the above request to Amazon Bedrock with Anthropic Claude. The model returns a response that stops before reaching 500 tokens. What is the most likely reason?

A.The temperature is set too high

B.The model is not trained on this topic

C.The model reached a stop sequence

D.The token limit is exceeded

AnswerC

The model can stop early when it identifies a natural endpoint.

Why this answer

The model stopped before reaching 500 tokens because the request likely included a stop sequence (e.g., `\n\nHuman:` or a custom stop token) that matched the generated output. When a stop sequence is encountered, Bedrock immediately halts generation, even if the token limit has not been reached. This is the most direct explanation for a premature stop.

Exam trap

AWS often tests the distinction between a stop sequence and a token limit; the trap here is that candidates confuse a premature stop with exceeding the token limit, but a stop sequence causes an early halt while a token limit would cause truncation at the limit.

How to eliminate wrong answers

Option A is wrong because a high temperature increases randomness and can cause the model to generate more tokens or diverge, not stop early. Option B is wrong because Bedrock's Claude models are trained on a broad corpus and can generate responses on any topic; lack of training would produce low-quality or repetitive text, not a stop before the token limit. Option D is wrong because if the token limit were exceeded, the model would truncate the response at the limit, not stop before reaching it.

Practice this question →

MCQmedium

A company wants to build a customer service chatbot using a foundation model. The chatbot must respond in under 2 seconds and handle high throughput. Which model deployment option should they choose?

A.Amazon Bedrock on-demand

B.Amazon SageMaker real-time endpoint hosting a foundation model

C.Amazon Lex with a pre-built foundation model

D.Amazon Bedrock provisioned throughput

AnswerD

Provisioned throughput reserves capacity and ensures consistent low latency.

Why this answer

Option B is correct because Amazon Bedrock provisioned throughput guarantees a set number of tokens per minute with low latency, meeting the response time requirement. Option A (on-demand) may have cold starts. Option C (SageMaker real-time) may not be optimized for FMs and requires more management.

Option D (Lex with pre-built FM) may not have the required flexibility.

Practice this question →

Multi-Selecthard

A data science team is fine-tuning a foundation model on Amazon SageMaker. Which THREE steps are part of the best practice? (Choose three.)

Select 3 answers

A.Increase model size to improve performance.

B.Monitor for catastrophic forgetting during fine-tuning.

C.Use early stopping to prevent overfitting.

D.Deploy the model to production immediately after fine-tuning.

E.Use a diverse dataset representing various scenarios.

AnswersB, C, E

Catastrophic forgetting can cause loss of original capabilities; monitoring helps adjust training.

Why this answer

Option B is correct because catastrophic forgetting is a known risk when fine-tuning foundation models, where the model loses previously learned knowledge while adapting to new data. Monitoring for this during fine-tuning on SageMaker allows the team to detect performance degradation on the original task and adjust training accordingly, ensuring the model retains its general capabilities.

Exam trap

AWS often tests the misconception that fine-tuning always requires a larger model or immediate deployment, while the real best practices focus on validation, monitoring, and data diversity to maintain model robustness.

Practice this question →

MCQmedium

An e-commerce company uses Amazon Bedrock to generate product descriptions. They notice the descriptions are too long and contain repetitive phrases. Which parameter adjustment can help?

A.Increase frequency penalty

B.Increase temperature

C.Increase top_p

D.Decrease presence penalty

AnswerA

Frequency penalty reduces the likelihood of repeating tokens.

Why this answer

Increasing the frequency penalty reduces the likelihood of the model repeating the same phrases or tokens, directly addressing the issue of repetitive language in generated product descriptions. This parameter penalizes tokens that have already appeared in the text, encouraging more diverse output and naturally shortening overly long descriptions by avoiding redundant loops.

Exam trap

AWS often tests the distinction between frequency penalty and presence penalty, where candidates confuse 'penalizing repetition' with 'reducing randomness' and incorrectly choose temperature or top_p adjustments.

How to eliminate wrong answers

Option B is wrong because increasing temperature makes the model more random and creative, which could actually worsen verbosity and introduce more irrelevant phrases rather than reducing repetition. Option C is wrong because increasing top_p (nucleus sampling) expands the set of possible next tokens, which may increase diversity but does not specifically penalize repeated tokens and can still produce long, repetitive text. Option D is wrong because decreasing presence penalty would reduce the penalty for tokens that have already appeared, making the model more likely to repeat itself, which is the opposite of what is needed.

Practice this question →

MCQmedium

A company runs a chatbot using a large language model on Amazon Bedrock. They notice high latency during peak hours. Which action would be MOST effective to reduce latency without degrading response quality?

A.Increase the number of concurrent invocations

B.Switch to a smaller model

C.Decrease the maxTokens parameter

D.Use Provisioned Throughput for model inference

AnswerD

Provisioned Throughput ensures reserved capacity, reducing latency variability.

Why this answer

Provisioned Throughput on Amazon Bedrock reserves dedicated capacity for model inference, ensuring consistent low latency even during peak hours. This eliminates the variability caused by resource contention in the on-demand tier, directly addressing high latency without altering model size or output quality.

Exam trap

AWS often tests the misconception that reducing model size or output length is the primary way to reduce latency, but the real bottleneck in peak-hour scenarios is often infrastructure contention, which Provisioned Throughput resolves without sacrificing quality.

How to eliminate wrong answers

Option A is wrong because increasing concurrent invocations without dedicated capacity can exacerbate resource contention, leading to throttling and higher latency. Option B is wrong because switching to a smaller model reduces response quality (e.g., lower accuracy or coherence), which degrades the chatbot's performance. Option C is wrong because decreasing maxTokens truncates responses, degrading output quality by cutting off reasoning or context, and does not address the root cause of latency from infrastructure contention.

Practice this question →

MCQeasy

A developer is using Amazon Bedrock to build a chatbot that answers customer queries. The chatbot must only respond based on the provided company documentation. Which approach best meets this requirement?

A.Use prompt engineering to instruct the model to only use documentation.

B.Use a RAG architecture with the company documentation as the knowledge base.

C.Fine-tune a foundation model on the company documentation.

D.Use a text classification model to filter responses.

AnswerB

RAG ensures responses are based on retrieved documents.

Why this answer

Option B is correct because Retrieval-Augmented Generation (RAG) architecture retrieves relevant chunks from the company documentation at query time and injects them into the prompt, ensuring the model's response is grounded solely in the provided documents. This approach prevents the model from relying on its internal training data or generating information outside the documentation, which is critical for a closed-domain chatbot.

Exam trap

Cisco often tests the distinction between prompt engineering and RAG, where candidates mistakenly believe a well-crafted prompt can fully control model behavior without a retrieval mechanism, overlooking the fact that foundation models inherently generate responses from their training data unless explicitly grounded via external knowledge retrieval.

How to eliminate wrong answers

Option A is wrong because prompt engineering alone cannot guarantee the model will ignore its pre-trained knowledge; the model may still hallucinate or use information not present in the documentation, as it has no mechanism to enforce retrieval of specific content. Option C is wrong because fine-tuning a foundation model on the company documentation embeds the data into the model's weights, which can lead to outdated or incomplete responses and does not allow dynamic retrieval of the latest documentation; it also risks overfitting and does not scale well with changing content. Option D is wrong because a text classification model filters responses after generation, but it cannot ensure the response is based on the documentation; it only labels or rejects outputs, which is insufficient for generating accurate, document-grounded answers.

Practice this question →

MCQhard

A company operates a customer service platform that uses Amazon Bedrock with a foundation model to generate automated responses. The system has been in production for three months. Recently, customers have reported that responses are becoming repetitive and less relevant over time. The development team notices that the model's performance has degraded, especially for queries about newer products that were added after the initial deployment. The team currently uses a static prompt with a fixed knowledge base that was set up at launch. The model is invoked via the Bedrock API with standard settings. The team wants to improve response quality without incurring high costs or extensive re-engineering. What should the team do?

A.Increase the temperature parameter to 0.9 to introduce more randomness and reduce repetition.

B.Fine-tune the model every week on the latest customer interactions using Amazon SageMaker.

C.Switch to a larger foundation model to handle the increased complexity of new products.

D.Implement a feedback loop to periodically update the knowledge base with new product information and use a dynamic prompt that includes recent interactions.

AnswerD

Continuously updating the knowledge base and prompt keeps responses accurate and fresh.

Why this answer

Implementing a feedback loop to update the knowledge base with new product information and using a dynamic prompt that includes recent interactions will keep the model relevant and reduce repetition. This RAG approach is cost-effective.

Practice this question →

MCQhard

A company uses Amazon Bedrock to generate product descriptions. They want to ensure the outputs consistently follow a specific brand tone (professional yet friendly). They have a small set of example descriptions (few-shot examples) but do not want to fine-tune the model. Which strategy best achieves consistent tone without modifying the base model?

A.Fine-tune the model on a dataset of product descriptions that exemplify the desired tone.

B.Use a system prompt that defines the brand tone and include few-shot examples in the prompt.

C.Implement prompt chaining by breaking the task into multiple steps, each with its own prompt.

D.Use retrieval-augmented generation (RAG) to pull example descriptions from a database and prepend them to the prompt.

AnswerB

A system prompt sets the context and few-shot examples demonstrate the desired output style, guiding the model at inference time.

Why this answer

Using a system prompt and few-shot examples in the prompt template (Option D) provides explicit guidance to the model at inference time, shaping the tone without any model updates. Option A (retrieval-augmented generation) is for incorporating external knowledge, not tone. Option B (prompt chaining) adds complexity and may not directly enforce tone.

Option C (fine-tuning) requires modifying model weights, which is not desired.

Practice this question →

MCQeasy

A company needs to summarize thousands of customer reviews daily using a foundation model. The solution must minimize latency and cost while handling variable traffic. Which AWS service should they use?

A.Amazon SageMaker real-time endpoint

B.Amazon Comprehend

C.Amazon Lex

D.Amazon Bedrock with on-demand mode

AnswerD

Bedrock on-demand is serverless and cost-effective for variable workloads.

Why this answer

Option C is correct because Amazon Bedrock with on-demand mode provides serverless access to foundation models, paying only per token used, which minimizes cost and latency for variable traffic. Option A (Amazon Comprehend) is not built for custom summarization with FMs. Option B (Amazon SageMaker real-time endpoint) requires managing infrastructure and is not cost-effective for variable loads.

Option D (Amazon Lex) is for chatbots, not summarization.

Practice this question →

MCQmedium

A company uses Amazon Bedrock to generate marketing copy. The summaries are too verbose. Which parameter should be decreased to directly limit the length of the output?

A.max_tokens

B.temperature

C.top_p

D.frequency_penalty

AnswerA

max_tokens sets the maximum number of tokens in the response.

Why this answer

The `max_tokens` parameter directly controls the maximum number of tokens (words or subwords) in the generated output. By decreasing this value, you explicitly cap the length of the marketing copy, making it less verbose. This is the most direct way to limit output length in Amazon Bedrock and other LLM APIs.

Exam trap

The trap here is that candidates confuse parameters that affect output style (temperature, top_p, frequency_penalty) with the one that directly controls output length (max_tokens), leading them to pick a parameter that changes how the model writes rather than how much it writes.

How to eliminate wrong answers

Option B (temperature) is wrong because it controls the randomness of token selection, not the length of the output; lowering temperature makes output more deterministic but does not shorten it. Option C (top_p) is wrong because it sets a cumulative probability threshold for nucleus sampling, affecting diversity of token choices, not the total number of tokens generated. Option D (frequency_penalty) is wrong because it penalizes tokens based on their frequency in the generated text, reducing repetition but not directly limiting the overall length of the response.

Practice this question →

MCQmedium

A healthcare company is using Amazon Bedrock to summarize patient notes. The compliance team requires that no patient data is used to improve the underlying foundation model. Which configuration should the team choose?

A.Enable data encryption in transit and at rest.

B.Use a different foundation model from a different provider.

C.Disable model training data logging in the AWS console.

D.Configure a VPC endpoint for Amazon Bedrock.

AnswerC

This setting prevents prompts and completions from being used for model improvement.

Why this answer

Option C is correct because disabling model training data logging in the AWS console prevents Amazon Bedrock from using customer inference data to improve the underlying foundation model. This setting ensures compliance with the requirement that no patient data is used for model training, as Bedrock offers a specific toggle to opt out of data sharing for model improvement.

Exam trap

AWS often tests the misconception that encryption or network controls (like VPC endpoints) are sufficient for data privacy compliance, when the actual requirement is about preventing data usage for model improvement, which is a separate policy control.

How to eliminate wrong answers

Option A is wrong because enabling data encryption in transit and at rest protects data confidentiality but does not prevent the foundation model provider from using the data for training or improvement. Option B is wrong because using a different foundation model from a different provider does not inherently guarantee that patient data will not be used for model improvement; the compliance requirement is about data usage policy, not model origin. Option D is wrong because configuring a VPC endpoint for Amazon Bedrock controls network access and data exfiltration but does not affect whether inference data is logged or used for model training.

Practice this question →

MCQhard

A healthcare organization is using Amazon Bedrock to analyze medical images and generate radiology reports. They need to comply with HIPAA regulations and ensure patient data is not used for model training. Which configuration should they use?

A.Fine-tune the model using a custom dataset and deploy as a custom model

B.Use a provisioned throughput model with data isolation

C.Use the on-demand model through Amazon Bedrock

D.Use a third-party model hosted outside of AWS

AnswerB

Provisioned throughput ensures data is not used for training and meets compliance requirements.

Why this answer

Option B is correct because Provisioned Throughput with data isolation in Amazon Bedrock ensures that the customer's inference data (including patient medical images and reports) is not used for any model training or service improvement, and it provides a dedicated, isolated environment that meets HIPAA compliance requirements. This configuration guarantees that patient data remains within the customer's AWS account and is not shared with other customers or used to improve the base model.

Exam trap

The trap here is that candidates often assume fine-tuning (Option A) is the only way to customize models for healthcare, but they overlook that HIPAA prohibits using PHI for training, making Provisioned Throughput with data isolation the correct choice for compliant inference.

How to eliminate wrong answers

Option A is wrong because fine-tuning a model with a custom dataset would use patient data to train the model, which violates HIPAA requirements that patient data must not be used for model training. Option C is wrong because the on-demand model through Amazon Bedrock does not provide data isolation; inference data may be used for service improvement and model training, which is not HIPAA-compliant for protected health information. Option D is wrong because using a third-party model hosted outside of AWS would require the healthcare organization to manage HIPAA compliance independently, and it does not leverage AWS's HIPAA-eligible services or the data isolation guarantees provided by Bedrock's Provisioned Throughput.

Practice this question →

MCQeasy

A company wants to use a foundation model to classify customer feedback into positive, neutral, negative. They have a small labeled dataset. What approach yields best results?

A.Use a pre-built sentiment analysis API

B.Fine-tune a foundation model on their dataset

C.Fine-tune a foundation model on their dataset

D.Use zero-shot classification

AnswerC

Fine-tuning with a small labeled dataset adapts the model effectively.

Why this answer

Option C is correct because fine-tuning a foundation model on a small labeled dataset allows the model to adapt its pre-trained knowledge specifically to the company's sentiment classification task, achieving higher accuracy than zero-shot or generic API approaches. Fine-tuning adjusts the model's weights using the labeled examples, making it sensitive to domain-specific language and nuance in customer feedback, which is critical for a three-class sentiment task.

Exam trap

AWS often tests the misconception that zero-shot classification is always sufficient for small datasets, but the trap here is that fine-tuning with even a small labeled dataset yields better results because it adapts the model to the specific task, whereas zero-shot lacks task-specific learning.

How to eliminate wrong answers

Option A is wrong because a pre-built sentiment analysis API is typically trained on general data and may not capture domain-specific language or the exact three-class (positive, neutral, negative) granularity required, leading to lower accuracy on the company's specific feedback. Option B is wrong because it is identical to option C and is marked as incorrect in the question; the correct answer is explicitly labeled as C. Option D is wrong because zero-shot classification relies on the model's pre-existing knowledge without any task-specific adaptation, which often results in poor performance on nuanced sentiment classification, especially with a small labeled dataset that could be used for fine-tuning.

Practice this question →

MCQeasy

Refer to the exhibit. This is an Amazon Bedrock invocation request for Claude. What is the purpose of the "stop_sequences" parameter?

A.It tells the model to stop generating when it encounters that sequence

B.It specifies a character sequence for the model to include in its response

C.It limits the number of tokens in the response

D.It controls the randomness of the response

AnswerA

Stop sequences cause the model to halt generation at that point.

Why this answer

Option C is correct. Stop sequences tell the model to stop generating when a specified sequence is encountered, preventing the model from generating additional turns. Option A (alternative response) is incorrect.

Option B (maximum tokens is already set). Option D (control randomness is temperature).

Practice this question →

Multi-Selecteasy

A company is using Amazon Bedrock to generate code snippets. They want to ensure the generated code is secure. Which TWO practices should they implement?

Select 2 answers

A.Increase the max token limit to generate longer code.

B.Use guardrails to block insecure code patterns.

C.Set the temperature to 0 for deterministic output.

D.Review and test all generated code before deployment.

E.Use a larger model for better accuracy.

AnswersB, D

Guardrails can filter out harmful content.

Why this answer

Option B is correct because Amazon Bedrock Guardrails allow you to define policies that filter or block generated content containing insecure code patterns, such as SQL injection or hardcoded credentials, before the output is returned. This provides a proactive security layer that prevents insecure code from reaching the user, directly addressing the requirement to ensure generated code is secure.

Exam trap

Cisco often tests the misconception that model parameters like temperature or token limits can substitute for explicit security controls, when in fact only guardrails and human review directly address code security.

Practice this question →

MCQeasy

A developer invokes an Amazon Bedrock model and receives the above response. What does the 'stopReason' field indicate?

A.The model encountered an error.

B.The model reached a defined stop sequence.

C.The model hit the maximum token limit.

D.The model stopped due to a safety filter.

AnswerB

'stop_sequence' indicates the model encountered a user-defined stop sequence.

Why this answer

The 'stopReason' field in an Amazon Bedrock response indicates why the model stopped generating tokens. When set to 'stop', it means the model encountered a defined stop sequence (such as a special token like <|endoftext|> or a user-specified string) and halted generation normally. This is the expected behavior for a successful, complete response.

Exam trap

The trap here is that candidates confuse 'stop' (normal completion via stop sequence) with 'length' (token limit reached), as both end generation but have different implications for response completeness and cost.

How to eliminate wrong answers

Option A is wrong because a model error would typically result in an HTTP error code or a different field like 'error' or 'failure', not a 'stopReason' of 'stop'. Option C is wrong because hitting the maximum token limit would produce a 'stopReason' of 'length', not 'stop'. Option D is wrong because a safety filter intervention would produce a 'stopReason' of 'content_filtered' or similar, not 'stop'.

Practice this question →

MCQeasy

A developer receives the above response from invoking a Bedrock model. Which field indicates that the model completed its response normally?

A.output

B.stop_reason

C.text

D.role

AnswerB

stop_reason 'end_turn' signals normal conversation end.

Why this answer

The `stop_reason` field in the Bedrock response indicates why the model stopped generating text. A value of `"stop"` or `"end_turn"` (depending on the model) signals that the model completed its response normally, as opposed to hitting a token limit, content filter, or other interruption.

Exam trap

The trap here is that candidates confuse the `output` container or the `text` field with the completion indicator, overlooking the dedicated `stop_reason` field that explicitly signals normal termination.

How to eliminate wrong answers

Option A is wrong because `output` is a container object that holds the generated content, not a field that indicates the completion status. Option C is wrong because `text` is a field within the output that contains the actual generated string, but it does not convey why generation stopped. Option D is wrong because `role` indicates the conversational role (e.g., user or assistant) in a multi-turn context, not the model's completion state.

Practice this question →

MCQeasy

A company uses Amazon Bedrock to build a conversational AI. They want to enforce role-based access to the model. Which AWS service should they use?

A.AWS Config

B.AWS Identity and Access Management (IAM)

C.AWS CloudTrail

D.AWS Organizations

AnswerB

IAM policies can control which users or roles can invoke specific Bedrock models.

Why this answer

AWS Identity and Access Management (IAM) is the correct service because it enables fine-grained, role-based access control (RBAC) to Amazon Bedrock models. You can define IAM policies that specify which principals (users, groups, or roles) are allowed to invoke specific foundation models, ensuring that only authorized roles can interact with the conversational AI.

Exam trap

The trap here is that candidates often confuse AWS Config (which audits configurations) or CloudTrail (which logs actions) with IAM, mistakenly thinking that logging or compliance tools can enforce access control, when in fact only IAM provides the authorization layer for Bedrock model invocation.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for evaluating and auditing resource configurations against compliance rules, not for enforcing role-based access to Bedrock models. Option C is wrong because AWS CloudTrail records API activity for auditing and governance, but it does not control or enforce access permissions. Option D is wrong because AWS Organizations manages multi-account governance and policy inheritance across accounts, but it does not provide the granular, per-model role-based access control needed for Bedrock.

Practice this question →

MCQhard

A company fine-tunes a foundation model on SageMaker JumpStart for sentiment analysis. After deployment, the model shows bias toward positive sentiment. Which action should be taken to mitigate bias?

A.Use a different foundation model

B.Add more positive examples to training data

C.Increase training epochs

D.Perform RLHF (Reinforcement Learning from Human Feedback) to align outputs

AnswerD

RLHF uses human feedback to reduce undesirable biases.

Why this answer

RLHF (Reinforcement Learning from Human Feedback) is the correct approach because it directly addresses the misalignment between the model's outputs and desired human values. By collecting human feedback on model outputs and using it to train a reward model, RLHF fine-tunes the foundation model to reduce biased behavior, such as the over-prediction of positive sentiment, without simply reweighting the training data.

Exam trap

AWS often tests the misconception that bias is solely a data quantity issue, leading candidates to incorrectly choose adding more examples (Option B) instead of recognizing that alignment techniques like RLHF are required to correct model behavior after training.

How to eliminate wrong answers

Option A is wrong because simply switching to a different foundation model does not guarantee the removal of bias; the new model may have its own biases or the same underlying training data issues. Option B is wrong because adding more positive examples would exacerbate the existing bias toward positive sentiment, not mitigate it. Option C is wrong because increasing training epochs does not correct bias; it risks overfitting the model to the existing biased distribution, making the bias worse.

Practice this question →

MCQhard

Which parameter controls the randomness of generated text in a foundation model?

A.top_p

B.stop sequences

C.max_tokens

D.temperature

AnswerD

Temperature directly affects randomness.

Why this answer

Temperature is the correct parameter because it directly controls the randomness of token sampling in a foundation model. A lower temperature (e.g., 0.1) makes the model more deterministic by concentrating probability mass on the most likely tokens, while a higher temperature (e.g., 1.5) flattens the probability distribution, increasing the likelihood of less probable tokens and thus generating more diverse or creative outputs.

Exam trap

AWS often tests the distinction between temperature (which reshapes the probability distribution) and top_p (which truncates the token set), leading candidates to confuse 'randomness control' with 'diversity via cumulative probability threshold'.

How to eliminate wrong answers

Option A is wrong because top_p (nucleus sampling) controls the cumulative probability threshold for token selection, not the randomness of the distribution itself; it dynamically chooses a set of tokens whose cumulative probability exceeds p, which is a different mechanism for diversity. Option B is wrong because stop sequences define specific strings that halt text generation (e.g., '\n\n' or a period), and they have no effect on the randomness or sampling behavior of the model. Option C is wrong because max_tokens sets a hard limit on the number of tokens generated in the output, controlling length rather than the stochasticity of token selection.

Practice this question →

MCQeasy

A startup uses Amazon Bedrock with a provisioned throughput to generate product images. They now have unpredictable traffic and want to reduce costs. What should they do?

A.Switch to batch inference using Amazon Bedrock.

B.Keep the provisioned throughput but reduce the number of units.

C.Use a different model or service like Amazon SageMaker with spot instances.

D.Switch to on-demand mode in Amazon Bedrock.

AnswerD

On-demand mode is serverless and cost-effective for variable traffic.

Why this answer

On-demand mode in Amazon Bedrock allows you to pay per inference request without committing to a provisioned throughput, making it ideal for unpredictable traffic patterns. This eliminates the cost of idle capacity while still providing access to the same foundation models. Option D directly addresses the need to reduce costs when traffic is variable.

Exam trap

The trap here is that candidates may assume provisioned throughput is always more cost-effective for any workload, overlooking that on-demand mode is specifically designed to eliminate idle costs for unpredictable traffic patterns.

How to eliminate wrong answers

Option A is wrong because batch inference is designed for processing large volumes of data asynchronously, not for handling unpredictable real-time traffic, and it still requires provisioning resources that may incur costs even when idle. Option B is wrong because reducing the number of provisioned throughput units still leaves you with committed capacity that must be paid for regardless of usage, which does not solve the cost issue for unpredictable traffic. Option C is wrong because switching to a different model or service like Amazon SageMaker with spot instances introduces additional complexity and does not leverage the native on-demand pricing model of Bedrock, which is specifically designed for variable workloads.

Practice this question →