CCNA Fundamentals of AI and ML Questions

75 of 97 questions · Page 1/2 · Fundamentals of AI and ML · Answers revealed

1
MCQ · easy

A data scientist is working with a dataset that contains both numerical and categorical features. Which algorithm is commonly used for regression tasks in AWS SageMaker?

A. K-Means
B. Linear Learner
C. BlazingText
D. Linear Learner
Answer: B

Linear Learner supports regression and classification on numerical and categorical features.

Why this answer

Linear Learner is the correct choice because it is a supervised learning algorithm in AWS SageMaker specifically designed for both regression and classification tasks. It can handle datasets with mixed numerical and categorical features (after appropriate encoding) and provides built-in mechanisms for training linear models, including automatic model tuning and distributed training.

Exam trap

The trap here is that candidates may confuse unsupervised clustering algorithms (like K-Means) with supervised regression algorithms, or mistakenly think that NLP-focused algorithms (like BlazingText) are appropriate for general regression tasks with mixed data types.

How to eliminate wrong answers

Option A is wrong because K-Means is an unsupervised clustering algorithm, not a regression algorithm, and it cannot predict continuous target values. Option C is wrong because BlazingText is optimized for natural language processing tasks such as word embeddings and text classification, not for general regression on mixed numerical/categorical datasets. Option D is wrong because it is a duplicate of the correct answer (B) and does not represent a distinct algorithm; the question lists two identical options, but only one is correct.

2
MCQ · medium

A data scientist is using SageMaker to train a model on a dataset with many features. They suspect some features are redundant. Which feature engineering technique would help?

A. Feature scaling
B. One-hot encoding
C. Principal Component Analysis (PCA)
D. Polynomial features
Answer: C

PCA reduces dimensionality by transforming correlated features into uncorrelated components, eliminating redundancy.

Why this answer

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms the original correlated features into a smaller set of uncorrelated principal components, effectively removing redundancy while preserving most of the variance in the data. In SageMaker, PCA can be applied via the built-in PCA algorithm or as a preprocessing step in a scikit-learn container to reduce feature space and eliminate multicollinearity.

Exam trap

The exam often tests the distinction between feature reduction (PCA) and feature transformation (scaling, encoding, polynomial expansion) to see if candidates confuse techniques that change feature count versus those that only change feature values.

How to eliminate wrong answers

Option A is wrong because feature scaling (e.g., StandardScaler, MinMaxScaler) normalizes the range of features but does not remove redundant or correlated features; it only changes the scale. Option B is wrong because one-hot encoding is used to convert categorical variables into numerical format, not to address feature redundancy among many continuous or numerical features. Option D is wrong because polynomial features create interaction and higher-order terms, which actually increase the number of features and can introduce more redundancy, not reduce it.
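The redundancy argument can be illustrated with a minimal sketch, assuming two hypothetical features where one is just the other doubled. It is plain Python, not the SageMaker PCA algorithm: it computes the eigenvalues of the 2x2 covariance matrix in closed form and reports how much variance each principal component explains.

```python
import math

def covariance_2d(xs, ys):
    """Return sample variance of xs, sample variance of ys, and their covariance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, var_y, cov_xy

def explained_variance_ratio(xs, ys):
    """Eigenvalues of the 2x2 covariance matrix, as shares of the total variance."""
    var_x, var_y, cov_xy = covariance_2d(xs, ys)
    centre = (var_x + var_y) / 2
    spread = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    first = centre + spread
    second = max(centre - spread, 0.0)  # guard against tiny negative rounding
    total = first + second
    return first / total, second / total

# Hypothetical data: feature_b carries no information beyond feature_a.
feature_a = [10.0, 20.0, 30.0, 40.0, 50.0]
feature_b = [2 * value for value in feature_a]

pc1_share, pc2_share = explained_variance_ratio(feature_a, feature_b)
print(f"PC1 explains {pc1_share:.3f}, PC2 explains {pc2_share:.3f}")
```

Because the second component explains essentially no variance here, keeping only the first component drops one column without losing information, which is the sense in which PCA removes redundancy.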

3
MCQ · medium

A data science team needs to choose a machine learning approach for a project that requires predicting customer churn based on historical data. The team has a labeled dataset with 10,000 records and needs to interpret the model's decisions to provide business insights. Which machine learning technique should the team prioritize?

A. Random forest.
B. K-means clustering.
C. Linear regression.
D. Deep neural network with multiple hidden layers.
Answer: A

Random forests provide feature importance and interpretability, suitable for classification with moderate-sized labeled datasets.

Why this answer

Option A is correct because random forests are ensemble methods that expose feature importances and decision paths, which makes them reasonably interpretable for churn prediction on labeled data. Option D (deep neural network) is less interpretable and may overfit with only 10,000 records. Option C (linear regression) is designed for regression tasks, not classification.

Option B (K-means clustering) is unsupervised and not suitable for predicting churn from labeled data.

4
MCQ · medium

An ML team is deploying a real-time inference endpoint for a computer vision model using Amazon SageMaker. The model requires GPU acceleration for low latency. Which instance type should the team choose to minimize cost while meeting the GPU requirement?

A. ml.g5.xlarge
B. ml.c5.xlarge
C. ml.p3.2xlarge
D. ml.p4d.24xlarge
Answer: A

G5 provides GPU acceleration at the lowest cost among the listed GPU instances.

Why this answer

Option A (ml.g5.xlarge) is correct because it provides a single GPU (NVIDIA A10G), which satisfies the requirement for low-latency GPU acceleration in computer vision inference, and it is the smallest and least expensive GPU instance among the options. On-demand pricing for ml.g5.xlarge is typically well below that of ml.p3.2xlarge, so it meets the GPU requirement without over-provisioning resources.

Exam trap

The trap here is that candidates may assume any GPU instance is acceptable, or that an older P3 instance must be cheaper than a newer G5 instance. The question tests the ability to balance the GPU requirement against cost: P-family instances target heavier training and compute workloads and generally cost more per hour than a single-GPU G-family instance.

How to eliminate wrong answers

Option B (ml.c5.xlarge) is wrong because it is a CPU-only instance (based on Intel Xeon Scalable processors) and lacks any GPU, failing to meet the explicit GPU acceleration requirement for low-latency computer vision inference. Option C (ml.p3.2xlarge) is wrong because, although its NVIDIA Tesla V100 GPU meets the GPU requirement, it generally costs more per hour than ml.g5.xlarge; the issue is cost inefficiency, not technical incompatibility. Option D (ml.p4d.24xlarge) is wrong because it provides 8 NVIDIA A100 GPUs, which is massively over-provisioned for a single real-time inference endpoint, resulting in significantly higher cost without any benefit for this workload.

5
Multi-Select · hard

Which TWO are best practices for model monitoring in production on AWS?

Select 2 answers
A. Disable logging to reduce latency
B. Use only CPU instances
C. Monitor input data drift
D. Retrain model daily
E. Monitor prediction drift
Answers: C, E

Data drift detection helps identify when the distribution of input data changes, affecting model accuracy.

Why this answer

Option C is correct because monitoring input data drift is a best practice for detecting changes in the distribution of incoming features compared to the training data. This helps identify when the model's assumptions about the data are no longer valid, which can degrade performance. AWS services like Amazon SageMaker Model Monitor can automatically track and alert on data drift.

Exam trap

The exam often tests the misconception that retraining on a fixed schedule (e.g., daily) is a best practice, when in reality it should be event-driven based on drift or performance metrics.
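As a rough illustration of what drift monitoring measures, the sketch below compares a baseline sample with live samples using the Population Stability Index (PSI). This is a generic technique chosen for the example, not a description of how SageMaker Model Monitor computes its statistics. The feature values, bin edges, and the 0.25 alert threshold are all assumptions.

```python
import math

def bin_fractions(values, edges):
    """Share of values falling into each bin; edges are ascending cut points."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        index = sum(1 for edge in edges if value >= edge)
        counts[index] += 1
    # Floor each share at a tiny value so the logarithm below is always defined.
    return [max(count / len(values), 1e-6) for count in counts]

def population_stability_index(baseline, live, edges):
    """Sum of (live - base) * ln(live / base) over bins; 0 means identical shares."""
    base = bin_fractions(baseline, edges)
    current = bin_fractions(live, edges)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, current))

# Hypothetical "customer age" feature at training time and in production.
training_ages = [20, 25, 30, 35, 40, 45, 50, 55]
live_ages_stable = [21, 26, 31, 36, 41, 46, 51, 56]
live_ages_shifted = [45, 50, 55, 60, 65, 70, 75, 80]
AGE_EDGES = [30, 40, 50]
DRIFT_THRESHOLD = 0.25  # assumed alert level, tune per use case

stable_psi = population_stability_index(training_ages, live_ages_stable, AGE_EDGES)
shifted_psi = population_stability_index(training_ages, live_ages_shifted, AGE_EDGES)
print("stable drifted:", stable_psi > DRIFT_THRESHOLD)
print("shifted drifted:", shifted_psi > DRIFT_THRESHOLD)
```

The same comparison can be applied to the model's output scores to watch for prediction drift, and an alert from either check is the kind of event that could trigger retraining instead of a fixed daily schedule.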

6
MCQ · easy

Refer to the exhibit. A data scientist ran a training job on Amazon SageMaker. The job failed with the error shown. What is the most likely cause?

A. The S3 input path is incorrect
B. The IAM role does not have permission to access S3
C. The training code has a syntax error
D. The batch size is too large for the instance's GPU memory
Answer: D

The error shows CUDA out of memory, typically due to batch size or model size exceeding GPU memory.

Why this answer

The error message indicates a CUDA out-of-memory error, which occurs when the GPU memory is insufficient for the requested batch size. Option D is correct because increasing the batch size beyond the GPU's memory capacity causes the training job to fail with this specific error.

Exam trap

AWS often tests the distinction between infrastructure errors (S3, IAM) and runtime errors (CUDA memory), where candidates mistakenly attribute a GPU memory error to a misconfiguration in data access or code syntax.

How to eliminate wrong answers

Option A is wrong because an incorrect S3 input path would result in a 'NoSuchKey' or '404' error, not a CUDA out-of-memory error. Option B is wrong because an IAM role lacking S3 permissions would produce an 'AccessDenied' error, not a GPU memory error. Option C is wrong because a syntax error in the training code would raise a Python exception (e.g., SyntaxError) before any GPU operations, not a CUDA memory error.

7
Multi-Select · hard

Which TWO factors should be considered when choosing between a CPU-based instance and a GPU-based instance for training a machine learning model on Amazon SageMaker? (Choose two.)

Select 2 answers
A. The number of layers in the model
B. The AWS Region
C. The size of the dataset
D. The choice of hyperparameter optimizer
E. The type of model architecture (e.g., CNN vs. linear regression)
Answers: C, E

Large datasets can leverage GPU parallelism.

Why this answer

Option C is correct because the size of the dataset directly impacts whether a GPU's parallel processing capabilities are beneficial. GPU instances excel at performing many matrix operations simultaneously, which is critical for large datasets where mini-batch gradient descent can be parallelized. For smaller datasets, the overhead of transferring data to GPU memory may negate the performance gains, making CPU instances more cost-effective.

Exam trap

The exam often tests the misconception that model architecture alone (e.g., number of layers) dictates hardware choice, when in fact the dataset size and model type (e.g., CNN vs. linear regression) are the key factors that determine whether GPU parallelism provides a meaningful advantage.

8
MCQ · medium

A company is training a large language model using Amazon SageMaker. The training job fails with the error 'OutOfMemory'. They are using a single ml.p3.2xlarge instance. The dataset is 50GB and the model is 2GB. The training script uses standard data loading. Which action should they take to resolve the issue?

A. Increase the instance type to ml.p3.16xlarge
B. Train the model using Spot instances
C. Reduce the batch size
D. Use SageMaker's Pipe mode for data loading
Answer: A

The error indicates the instance memory is insufficient. Upgrading to a larger instance directly addresses the out-of-memory issue.

Why this answer

The error 'OutOfMemory' indicates that the ml.p3.2xlarge instance (one NVIDIA V100 GPU with 16 GB of GPU memory) does not have enough memory for the 2 GB model and the 50 GB dataset as the job is currently configured. Increasing the instance type to ml.p3.16xlarge provides eight V100 GPUs (128 GB of GPU memory in total) along with substantially more system memory, which gives the job more headroom for the model and the data it loads. This directly addresses the resource constraint.

Exam trap

The exam often tests the misconception that reducing batch size or using Pipe mode can solve out-of-memory errors caused by insufficient GPU memory, when the real fix is to use a larger instance with more GPU memory.

How to eliminate wrong answers

Option B is wrong because Spot instances provide cost savings but do not increase memory capacity; they use the same instance types and would still run out of memory. Option C is wrong because reducing the batch size reduces memory usage per step but does not address the fundamental issue that the total dataset (50 GB) cannot fit into the 16 GB GPU memory of the current instance; the model alone is 2 GB, leaving insufficient room for data. Option D is wrong because SageMaker's Pipe mode streams data directly from Amazon S3 to the training algorithm without storing it on disk, but the GPU memory is still required to hold the model and the data batches during processing; Pipe mode does not reduce GPU memory consumption.

9
Multi-Select · easy

A data scientist wants to deploy a custom model built with TensorFlow to Amazon SageMaker for real-time inference. Which TWO steps are required? (Choose two.)

Select 2 answers
A. Create an Amazon ECR repository for the inference container
B. Upload the model artifacts to an S3 bucket
C. Submit a training job to SageMaker
D. Create a SageMaker endpoint configuration
E. Convert the model to ONNX format
Answers: B, D

Model artifacts must be stored in S3 for SageMaker to access.

Why this answer

Option B is correct because SageMaker requires model artifacts (the trained model files) to be stored in an S3 bucket before they can be used for inference. When deploying a custom TensorFlow model, you must upload the saved model (e.g., in SavedModel format) to S3, and then SageMaker will download it to the inference container during endpoint creation.

Exam trap

The trap here is that candidates often think they must build a custom container (Option A) or convert the model (Option E), but SageMaker's pre-built TensorFlow containers eliminate those steps, and the key requirements are simply uploading artifacts to S3 and creating the endpoint configuration.

10
MCQ · easy

A startup wants to build a product recommendation engine for their e-commerce platform. They have user purchase history and item metadata. They want a fully managed solution that can automatically train and deploy a recommendation model without needing to manage the underlying ML lifecycle. The solution should provide personalized recommendations based on collaborative filtering. Which AWS service should they use?

A. Use Amazon Kendra
B. Use Amazon Lex
C. Use Amazon Personalize
D. Use Amazon SageMaker built-in Factorization Machines algorithm
Answer: C

Personalize is fully managed and specifically designed for recommendation systems.

Why this answer

Amazon Personalize is a fully managed service that enables you to build and deploy recommendation models without managing the underlying ML lifecycle. It supports collaborative filtering out of the box, using user purchase history and item metadata to generate personalized recommendations, which directly matches the startup's requirements.

Exam trap

The trap here is that candidates may confuse Amazon SageMaker's built-in algorithms (like Factorization Machines) with a fully managed recommendation service, overlooking the requirement for automatic lifecycle management and instead focusing only on the algorithm capability.

How to eliminate wrong answers

Option A is wrong because Amazon Kendra is an intelligent search service that uses natural language processing to answer questions and retrieve documents, not a recommendation engine for collaborative filtering. Option B is wrong because Amazon Lex is a service for building conversational interfaces (chatbots) using speech and text, not for generating product recommendations. Option D is wrong because while Amazon SageMaker's built-in Factorization Machines algorithm can perform collaborative filtering, it requires you to manage the ML lifecycle (data preparation, training, deployment, scaling), which contradicts the requirement for a fully managed solution that automatically handles these tasks.

11
MCQ · medium

A data scientist is training a binary classification model to predict customer churn. The dataset has 10,000 records with 9,500 non-churners and 500 churners. After training a logistic regression model, the model achieves 95% accuracy on the test set. However, the business team reports that the model is not useful because it predicts almost all customers as non-churners. Which metric should the data scientist use to evaluate the model's performance in this scenario?

A. Accuracy
B. R-squared
C. Precision
D. Recall
Answer: D

Recall measures the proportion of actual churners correctly identified, which is the key metric for this imbalanced problem.

Why this answer

Option D (Recall) is correct because in this highly imbalanced dataset (95% non-churners vs 5% churners), the model's 95% accuracy is misleading—it can achieve this by simply predicting the majority class (non-churner) for all samples. Recall measures the proportion of actual churners correctly identified (True Positives / (True Positives + False Negatives)), directly addressing the business need to detect churn. A high recall ensures the model captures most churners, even at the cost of some false positives.

Exam trap

The exam often tests the misconception that high accuracy always indicates a good model, especially in imbalanced datasets, leading candidates to overlook metrics like recall or precision that better reflect model utility for the specific business problem.

How to eliminate wrong answers

Option A is wrong because accuracy is a poor metric for imbalanced datasets; a model that predicts all samples as the majority class can achieve high accuracy (95% here) while failing to identify any churners, making it useless for the business goal. Option B is wrong because R-squared is a metric for regression models, measuring the proportion of variance explained by the independent variables, and is not applicable to binary classification tasks like churn prediction. Option C is wrong because precision (True Positives / (True Positives + False Positives)) focuses on the correctness of positive predictions; while important, it does not capture the model's ability to find all churners—a model with high precision but low recall might still miss most churners, which is the core issue reported by the business team.
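The gap between accuracy and recall can be checked with a few lines of arithmetic. The sketch below is an illustration only: it reuses the 9,500 / 500 class split from the question and assumes a degenerate model that predicts "non-churner" for every customer.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision and recall are reported as 0.0 when their denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# 1 = churner, 0 = non-churner, matching the class balance in the question.
y_true = [0] * 9500 + [1] * 500
# Assumed degenerate model: always predicts the majority class.
y_pred = [0] * 10000

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The always-negative model scores 95% accuracy while its recall is zero, which is exactly the situation the business team is describing.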

12
MCQ · easy

Refer to the exhibit. A developer wants to ensure the notebook instance can access the internet to download packages. Which property configuration ensures this?

A. DirectInternetAccess: Enabled
B. VolumeSizeInGB: 5
C. InstanceType: ml.t2.medium
D. The resource type AWS::SageMaker::NotebookInstance
Answer: A

Setting DirectInternetAccess to Enabled allows the notebook instance to access the internet.

Why this answer

Option A is correct because setting `DirectInternetAccess: Enabled` on an Amazon SageMaker notebook instance gives it internet access through a network interface managed by SageMaker. If the property is set to `Disabled`, the instance can reach the internet only through its own VPC, for example via a Network Address Translation (NAT) gateway. Internet access is required to download packages from external repositories like PyPI or conda.

Exam trap

AWS often tests the distinction between resource type identifiers and configurable properties, so candidates may mistakenly think that specifying `AWS::SageMaker::NotebookInstance` as the resource type itself enables internet access, rather than recognizing it as a CloudFormation resource declaration.

How to eliminate wrong answers

Option B is wrong because `VolumeSizeInGB: 5` only specifies the size of the Amazon EBS storage volume attached to the notebook instance, which does not affect internet connectivity. Option C is wrong because `InstanceType: ml.t2.medium` defines the compute capacity (CPU and memory) of the instance, not its network access capabilities. Option D is wrong because `AWS::SageMaker::NotebookInstance` is the resource type identifier in AWS CloudFormation, not a property that controls internet access.

13
MCQ · medium

A data scientist is training a model using Amazon SageMaker and notices the training loss is decreasing but validation loss starts increasing after a few epochs. Which technique should they apply to address this?

A. Increase batch size
B. Increase the learning rate
C. Add more training data
D. Add regularization (e.g., L1 or L2)
Answer: D

Regularization penalizes large weights and reduces overfitting, which is indicated by increasing validation loss.

Why this answer

The scenario describes overfitting, where the model memorizes training data but fails to generalize to validation data. Adding regularization (L1 or L2) penalizes large weights, reducing model complexity and improving generalization. This is a standard technique in SageMaker training jobs, typically configured through the framework's weight-decay or regularizer options in frameworks like TensorFlow or MXNet.

Exam trap

The trap here is that candidates confuse overfitting with underfitting or optimization issues, and incorrectly choose to increase learning rate or batch size, not recognizing that rising validation loss with falling training loss is the classic signature of overfitting.

How to eliminate wrong answers

Option A is wrong because increasing batch size typically stabilizes gradient estimates but does not directly address overfitting; it may even reduce generalization by sharpening minima. Option B is wrong because increasing the learning rate can cause divergence or overshooting of the loss minimum, worsening both training and validation loss. Option C is wrong because adding more training data can help generalization but is not a direct fix for overfitting when validation loss increases; it may not be feasible or sufficient, and regularization is the immediate corrective action.
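How an L2 penalty "penalizes large weights" can be seen in a minimal sketch with a single weight and no intercept. The data and penalty strengths are made up for illustration, and the closed-form solution below applies only to this toy one-weight case.

```python
def penalised_loss(xs, ys, weight, l2_strength):
    """Squared error plus an L2 penalty on the weight."""
    data_term = sum((y - weight * x) ** 2 for x, y in zip(xs, ys))
    return data_term + l2_strength * weight ** 2

def ridge_weight(xs, ys, l2_strength):
    """Weight that minimises penalised_loss for the one-weight model y = w * x.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + l2_strength).
    """
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + l2_strength
    return numerator / denominator

# Hypothetical training data that is fit exactly by w = 2 when unpenalised.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

weights = {strength: ridge_weight(xs, ys, strength) for strength in (0.0, 1.0, 10.0)}
for strength, weight in weights.items():
    print(f"l2_strength={strength}: weight={weight:.3f}")
```

A larger penalty pulls the fitted weight further toward zero, trading a slightly worse fit on the training data for a simpler model, which is the mechanism that counters overfitting.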

14
MCQ · hard

An e-commerce company stores user interaction logs in Amazon S3. They want to use machine learning to segment users based on purchasing behavior. Which unsupervised learning algorithm is most appropriate?

A. Linear regression
B. Random forest
C. K-means clustering
D. Neural network
Answer: C

Unsupervised algorithm that groups data into clusters based on similarity.

Why this answer

K-means clustering is the most appropriate unsupervised learning algorithm for segmenting users based on purchasing behavior because it groups data points into clusters based on feature similarity without requiring labeled training data. The e-commerce scenario involves discovering natural groupings (segments) in user interaction logs, which is a classic clustering task, and K-means efficiently partitions users into K distinct segments by minimizing within-cluster variance.

Exam trap

The exam often tests the distinction between supervised and unsupervised learning by presenting a clustering problem and including supervised algorithms as distractors, leading candidates to mistakenly pick a familiar algorithm like random forest or linear regression without recognizing the lack of labeled data.

How to eliminate wrong answers

Option A is wrong because linear regression is a supervised learning algorithm used for predicting continuous numeric values (e.g., sales amount) from labeled data, not for discovering unlabeled user segments. Option B is wrong because random forest is a supervised ensemble learning method used for classification or regression on labeled datasets, and it cannot perform unsupervised segmentation without target labels. Option D is wrong because neural networks are typically used in supervised or reinforcement learning contexts; while they can be adapted for unsupervised tasks (e.g., autoencoders), they are not the most straightforward or appropriate choice for simple user segmentation compared to K-means clustering.
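The assign-then-update loop behind K-means can be sketched in plain Python. This is a teaching example, not the SageMaker built-in K-Means algorithm: the user features (orders per month, average basket value) and the starting centroids are hypothetical, and real data would normally be scaled first.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))

def k_means(points, initial_centroids, max_iterations=10):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        updated = []
        for index, members in enumerate(clusters):
            if members:
                # New centroid is the per-dimension mean of the cluster members.
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[index])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest_index(point, centroids) for point in points]
    return centroids, labels

# Hypothetical users: (orders per month, average basket value in dollars).
users = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 210)]
centroids, labels = k_means(users, initial_centroids=[users[0], users[3]])
print(labels)
print(centroids)
```

No target labels are supplied anywhere in the loop; the segments emerge purely from distances between users, which is what makes the method unsupervised.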

15
MCQ · hard

A team trains a model using Amazon SageMaker built-in XGBoost. After training, they want to evaluate feature importance. Which SageMaker feature allows them to view this?

A. SageMaker Debugger
B. SageMaker Experiments
C. SageMaker Autopilot
D. SageMaker Model Monitor
Answer: A

Debugger can capture internal model states like feature importance.

Why this answer

SageMaker Debugger provides built-in monitoring and visualization capabilities, including the ability to capture feature importance metrics (e.g., gain, cover, weight) from XGBoost training jobs. It automatically saves these metrics to Amazon S3 and allows you to view them through the SageMaker Studio Debugger dashboard or by querying the saved tensors, enabling direct evaluation of feature importance without additional custom code.

Exam trap

The trap here is that candidates often confuse SageMaker Experiments' tracking of training metrics (like accuracy or loss) with the ability to view model-specific internals like feature importance, which is a Debugger capability.

How to eliminate wrong answers

Option B (SageMaker Experiments) is wrong because it is designed for tracking and comparing training runs (e.g., hyperparameters, metrics, artifacts) but does not natively capture or expose feature importance values from the model. Option C (SageMaker Autopilot) is wrong because it automates the end-to-end ML pipeline (data preprocessing, model selection, hyperparameter tuning) and provides feature importance only as part of its generated candidate definition notebooks, not as a direct, real-time feature during a custom XGBoost training job. Option D (SageMaker Model Monitor) is wrong because it focuses on detecting data drift and model quality degradation in production deployments, not on extracting feature importance from a trained model.

16
MCQ · medium

A financial services company needs to ensure that the machine learning models used for loan approval are explainable and meet regulatory compliance. Which AWS feature can help explain model predictions?

A. SageMaker Ground Truth
B. SageMaker Clarify
C. SageMaker Automatic Model Tuning
D. SageMaker Model Monitor
Answer: B

Clarify provides feature importance, SHAP values, and bias metrics for model explainability.

Why this answer

SageMaker Clarify is the correct AWS service for explaining model predictions because it provides feature attribution and bias detection capabilities. It uses SHAP (SHapley Additive exPlanations) to generate explainability reports, which are essential for meeting regulatory compliance in financial services like loan approval.

Exam trap

The trap here is confusing monitoring (Model Monitor) with explainability (Clarify), as both relate to model governance but serve fundamentally different purposes—monitoring tracks performance over time, while Clarify explains individual predictions.

How to eliminate wrong answers

Option A is wrong because SageMaker Ground Truth is a data labeling service for creating training datasets, not for explaining model predictions. Option C is wrong because SageMaker Automatic Model Tuning (hyperparameter optimization) adjusts model parameters to improve performance, but does not provide explainability or feature attribution. Option D is wrong because SageMaker Model Monitor detects data drift and model quality degradation over time, but does not generate explanations for individual predictions.
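To make the idea of SHAP-style feature attribution concrete, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every feature ordering. The scoring function, feature names, and baseline are hypothetical, and this brute-force enumeration is only practical for a handful of features; it is not how SageMaker Clarify is implemented.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions relative to a baseline input."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_score = predict(current)
        for name in order:
            # Switch one feature from its baseline value to the instance value.
            current[name] = instance[name]
            score = predict(current)
            totals[name] += score - previous_score
            previous_score = score
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_loan_score(features):
    """Hypothetical loan score with one interaction term; tenure is ignored."""
    score = 0.5 * features["income"] - 0.3 * features["debt"]
    if features["income"] > 5 and features["debt"] < 2:
        score += 1.0
    return score

applicant = {"income": 8.0, "debt": 1.0, "tenure": 3.0}
baseline = {"income": 4.0, "debt": 3.0, "tenure": 3.0}

attributions = shapley_values(toy_loan_score, applicant, baseline)
print(attributions)
```

The attributions add up to the difference between the applicant's score and the baseline score, so each feature's share of an individual decision can be reported, which is the property that makes this style of explanation useful for compliance reviews.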

17
MCQ · medium

A company wants to build a model to forecast monthly sales. The data is a time series with trend and seasonality. Which SageMaker algorithm is most appropriate?

A. XGBoost
B. K-Means
C. Linear Learner
D. DeepAR
Answer: D

DeepAR is a built-in SageMaker algorithm specifically for time series forecasting with seasonality and trend.

Why this answer

DeepAR is the most appropriate algorithm because it is specifically designed for time series forecasting, handling both trend and seasonality through autoregressive recurrent neural networks. It learns from multiple related time series and produces probabilistic forecasts, making it ideal for monthly sales prediction.

Exam trap

The trap here is that candidates often choose XGBoost or Linear Learner because they are familiar with regression tasks, but fail to recognize that time series forecasting requires algorithms that explicitly model temporal dependencies and seasonality, which DeepAR is built for.

How to eliminate wrong answers

Option A is wrong because XGBoost is a gradient boosting algorithm for tabular data, not designed to capture temporal dependencies or seasonality in time series without extensive feature engineering. Option B is wrong because K-Means is an unsupervised clustering algorithm that groups data points by similarity, with no capability for forecasting sequential data. Option C is wrong because Linear Learner is a linear regression model that assumes independence of observations and cannot model complex time series patterns like seasonality or long-term trends.

18
MCQ · hard

A company is deploying a machine learning model for real-time fraud detection. The model must make predictions with latency under 10 milliseconds. The data scientist trained a gradient boosting model that achieves high accuracy but has inference latency of 50 milliseconds. The team has access to a larger instance type with more CPU cores. Which approach should the data scientist take to reduce inference latency while maintaining accuracy?

A. Switch to batch inference and run predictions every 100 milliseconds.
B. Deploy the model on a larger instance with more CPU cores.
C. Reduce the maximum tree depth and retrain the model.
D. Apply post-training pruning to remove redundant trees.
Answer: B

More CPU cores allow parallel computation, reducing inference latency without changing the model.

Why this answer

Option B is correct because increasing the number of CPU cores allows the gradient boosting model to parallelize tree evaluation across multiple cores, reducing inference latency. Since the model is already trained and accurate, this hardware scaling directly addresses the 50 ms bottleneck without altering the model's structure or accuracy.

Exam trap

AWS often tests the misconception that model optimization (pruning or depth reduction) is the only way to reduce latency, ignoring that hardware scaling (more CPU cores) can meet latency requirements without sacrificing accuracy.

How to eliminate wrong answers

Option A is wrong because switching to batch inference every 100 ms violates the real-time requirement of under 10 ms latency per prediction; it introduces a fixed delay that exceeds the threshold. Option C is wrong because reducing maximum tree depth reduces model complexity, which can lower accuracy and may not guarantee latency under 10 ms if the number of trees remains high. Option D is wrong because post-training pruning removes trees, which reduces model size but can degrade accuracy, and the latency improvement may be insufficient if the remaining trees still require sequential evaluation on limited cores.

19
MCQ · medium

A team is training a binary classification model using Amazon SageMaker. They notice that the training accuracy is 99% but the test accuracy is only 70%. Which technique should they apply first to address this?

A. Reduce training data
B. Apply regularization
C. Increase learning rate
D. Increase model complexity
Answer: B

Regularization adds penalty for large weights, helping to reduce overfitting.

Why this answer

The high training accuracy (99%) paired with significantly lower test accuracy (70%) is a classic symptom of overfitting, where the model memorizes the training data instead of learning generalizable patterns. Regularization (Option B) is the first-line technique to combat overfitting by adding a penalty to the loss function (e.g., L1 or L2 regularization), which discourages overly complex decision boundaries. In Amazon SageMaker, this can be implemented via hyperparameters in built-in algorithms, such as `l1` and `wd` in Linear Learner or `alpha` and `lambda` in XGBoost, or by adding dropout layers in a custom framework.

Exam trap

AWS often tests the misconception that overfitting is solved by increasing data or model complexity, when in fact the first step should be regularization to penalize overly complex models.

How to eliminate wrong answers

Option A is wrong because reducing training data would worsen overfitting by providing the model with fewer examples to learn from, making it even more prone to memorization. Option C is wrong because increasing the learning rate can cause the model to overshoot optimal weights during training, leading to divergence or poor convergence, but it does not directly address the variance problem of overfitting. Option D is wrong because increasing model complexity (e.g., adding more layers or parameters) would exacerbate overfitting by giving the model more capacity to memorize noise in the training data.

20
MCQ · medium

A company wants to use Amazon SageMaker to train a model using a custom Docker container that has specific dependencies. The training code is stored in an S3 bucket. Which steps must be taken to run the training job?

A.Install dependencies via SageMaker's lifecycle configuration instead of a custom container
B.Push the custom container to Amazon ECR and create a training job with the container URI
C.Use SageMaker's built-in framework container and override the entry point
D.Upload the container to S3 and reference it in the training job
AnswerB

ECR is the correct registry for Docker images used in SageMaker.

Why this answer

Amazon SageMaker requires custom Docker containers to be stored in Amazon Elastic Container Registry (ECR) to run training jobs. The container URI from ECR is specified in the `AlgorithmSpecification` parameter of the `CreateTrainingJob` API call, allowing SageMaker to pull and execute the container with the training code from S3. Option B correctly describes this mandatory workflow.

Exam trap

AWS often tests the misconception that any S3-uploaded artifact (including Docker images) can be directly referenced in a training job, but SageMaker strictly requires container images to be stored in ECR, not S3.

How to eliminate wrong answers

Option A is wrong because lifecycle configurations are used to customize notebook instances (e.g., install packages on Jupyter kernels), not to provide dependencies for training jobs; training jobs run in ephemeral containers that do not use lifecycle configurations. Option C is wrong because overriding the entry point of a built-in framework container only works if the container already includes the required dependencies; if custom dependencies are needed, a custom container must be built and pushed to ECR. Option D is wrong because SageMaker does not accept Docker containers stored in S3; containers must be registered in ECR and referenced by their URI.
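The workflow in Option B can be sketched as the request body for `CreateTrainingJob`. The helper name `build_training_request` and every value below (account ID, region, image name, bucket, role) are placeholders chosen for illustration. The dictionary is only constructed here; in practice it would be passed to a boto3 SageMaker client as `create_training_job(**request)`.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri):
    """Assemble a CreateTrainingJob request that points at a custom ECR image."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # ECR image URI, not an S3 path
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All identifiers below are hypothetical placeholders.
request = build_training_request(
    job_name="custom-container-demo",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)
print(request["AlgorithmSpecification"]["TrainingImage"])
```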

21
Multi-Selectmedium

A data scientist is evaluating different AWS services for building a machine learning pipeline. Which THREE components are part of Amazon SageMaker? (Select THREE.)

Select 3 answers
A.AWS Glue
B.Notebook instances
C.Ground Truth
D.Model registry
E.Amazon Athena
AnswersB, C, D

SageMaker Notebook Instances are fully managed Jupyter notebooks.

Why this answer

Amazon SageMaker provides fully managed notebook instances that allow data scientists to spin up Jupyter notebooks for data exploration, preprocessing, and model development without managing underlying infrastructure. These instances come pre-installed with common ML frameworks and can be easily scaled.

Exam trap

The trap here is that candidates often confuse AWS Glue (a separate ETL service) as part of SageMaker because both are used in ML pipelines, but Glue is not a SageMaker component.

22
MCQeasy

A company is using Amazon Comprehend for sentiment analysis on customer reviews. They notice that the sentiment is often incorrect for negative reviews with sarcasm. What is the likely cause?

A.The model is not fine-tuned for the domain
B.The pre-trained model cannot handle sarcasm well
C.Insufficient training data
D.The input text is too long
AnswerB

Sarcasm detection is a known limitation of general-purpose sentiment analysis models.

Why this answer

Amazon Comprehend's pre-trained sentiment analysis models are trained on general text corpora and lack the ability to detect sarcasm, which relies on contextual cues, tone, and figurative language. Sarcasm often inverts the literal sentiment (e.g., 'Great job, as always' for a failure), and standard NLP models without explicit sarcasm detection or fine-tuning cannot reliably interpret this inversion. Therefore, the likely cause is that the pre-trained model cannot handle sarcasm well.

Exam trap

AWS often tests the misconception that 'fine-tuning' or 'more data' can fix any NLP issue. The trap here is that sarcasm is a distinct linguistic challenge that general-purpose pre-trained models handle poorly regardless of domain or data volume, unless it is specifically addressed with sarcasm-aware training or custom classifiers.

How to eliminate wrong answers

Option A is wrong because while fine-tuning can improve domain-specific accuracy, the core issue here is not domain mismatch but the model's inability to detect sarcasm, a linguistic phenomenon that even domain-tuned models struggle with unless specifically trained on sarcastic examples. Option C is wrong because insufficient training data is not the primary cause; Amazon Comprehend's pre-trained model is trained on vast datasets, but sarcasm detection requires specialized training data that the base model lacks. Option D is wrong because input text length is not the issue; Comprehend's sentiment API accepts documents of up to 5 KB of UTF-8 text, and sarcasm is a semantic problem, not a truncation or length-related one.

23
MCQmedium

An organization wants to detect anomalies in real-time streaming data from IoT devices. The data includes sensor readings, and the team plans to use a machine learning model. Which AWS service should be used to build and deploy the model with minimal operational overhead?

A.Amazon SageMaker
B.AWS Glue
C.Amazon QuickSight
D.Amazon Kinesis Data Analytics
AnswerA

SageMaker offers end-to-end ML capabilities and can deploy real-time endpoints.

Why this answer

Amazon SageMaker is the correct choice because it provides a fully managed environment for building, training, and deploying machine learning models at scale. For real-time anomaly detection on streaming IoT data, SageMaker can host a trained model as a real-time endpoint that processes incoming sensor readings via Amazon Kinesis Data Streams or AWS Lambda, minimizing operational overhead by handling infrastructure, scaling, and monitoring automatically.

Exam trap

AWS often tests the misconception that Amazon Kinesis Data Analytics can build and deploy custom ML models, when in fact it only supports built-in ML functions for simple anomaly detection and cannot train or host custom models.

How to eliminate wrong answers

Option B (AWS Glue) is wrong because it is a serverless data integration and ETL service for preparing and transforming batch data, not for building or deploying machine learning models for real-time anomaly detection. Option C (Amazon QuickSight) is wrong because it is a business intelligence (BI) service for visualizing and analyzing data, not for building or deploying ML models. Option D (Amazon Kinesis Data Analytics) is wrong because it is designed for real-time stream processing using SQL or Apache Flink, but it does not provide the capability to build, train, or deploy custom machine learning models; it is limited to built-in ML functions like anomaly detection on simple metrics, not custom model deployment.

24
MCQmedium

A data scientist is using Amazon SageMaker to train a deep learning model for image classification. The training job is taking too long. The dataset consists of 100,000 images stored in Amazon S3. Which action can the data scientist take to reduce training time without modifying the model architecture?

A.Convert images to CSV format before training.
B.Use a GPU instance type for training.
C.Enable checkpointing to save intermediate models.
D.Reduce the number of training epochs.
AnswerB

GPUs are optimized for parallel matrix operations common in deep learning, significantly reducing training time.

Why this answer

Option B is correct because GPU instances are specifically designed for parallel processing of matrix operations, which are fundamental to deep learning training. By switching to a GPU instance type (e.g., p3 or p4d families) in SageMaker, the data scientist can significantly accelerate the training of the image classification model without altering the model architecture, as the dataset of 100,000 images benefits from GPU's massive parallelism for forward and backward passes.

Exam trap

The trap here is that candidates may confuse checkpointing (which helps with recovery, not speed) or reducing epochs (which changes training duration but also model performance) with legitimate performance optimizations, while overlooking that GPU acceleration directly addresses the computational bottleneck without altering the model or dataset.

How to eliminate wrong answers

Option A is wrong because converting images to CSV format would increase data size, lose spatial structure, and introduce unnecessary serialization overhead, making training slower, not faster. Option C is wrong because checkpointing saves intermediate model states for fault tolerance or resumption, but it does not reduce training time; it may even add overhead due to I/O operations. Option D is wrong because, although epochs are a hyperparameter rather than part of the architecture, cutting them shortens training only by doing less learning and is likely to degrade accuracy; the question is looking for a way to run the same training work faster.

25
MCQhard

Refer to the exhibit. A data scientist is training a neural network model on SageMaker. The training log shows the loss values per epoch. Which issue is most likely occurring?

A.The number of epochs is insufficient
B.The model is overfitting
C.The dataset is too small
D.The learning rate is too low
AnswerB

Overfitting shows up as validation loss rising while training loss keeps falling.

Why this answer

The training log shows loss values decreasing on the training set but increasing or plateauing on the validation set, which is a classic sign of overfitting. Overfitting occurs when the model learns noise and specific patterns in the training data too well, failing to generalize to unseen data. In SageMaker, monitoring both training and validation loss curves is critical to detect this issue early.

Exam trap

AWS often tests the distinction between overfitting and underfitting by showing loss curves where training loss decreases but validation loss increases, leading candidates to mistakenly attribute the issue to insufficient epochs or a low learning rate.

How to eliminate wrong answers

Option A is wrong because insufficient epochs would typically show both training and validation loss still decreasing at the end of training, not a divergence. Option C is wrong because a dataset that is too small can contribute to overfitting, but the direct symptom shown in the loss curves (training loss decreasing while validation loss increases) is specifically overfitting, not merely a small dataset. Option D is wrong because a learning rate that is too low would cause both training and validation loss to decrease very slowly or plateau at a high value, not diverge.

26
MCQeasy

A data scientist at a retail company is tasked with building a model to predict customer churn. The dataset contains 100,000 records with features such as age, purchase history, customer support interactions, and a binary label indicating whether the customer churned in the past. The team needs a model that can be deployed for real-time inference with low latency. They have limited time and want to use a built-in algorithm from Amazon SageMaker that is optimized for classification tasks. Which approach should they take?

A.Use Amazon SageMaker PCA algorithm
B.Use Amazon SageMaker XGBoost algorithm
C.Use Amazon SageMaker K-Means algorithm
D.Use Amazon SageMaker BlazingText algorithm
AnswerB

XGBoost is a built-in algorithm for classification and works well with tabular data.

Why this answer

Amazon SageMaker's built-in XGBoost algorithm is optimized for classification tasks like binary churn prediction, supports real-time inference with low latency via SageMaker endpoints, and can handle the dataset size of 100,000 records efficiently. It is a supervised learning algorithm that directly uses the binary label for training, making it the correct choice for this scenario.

Exam trap

The trap here is that candidates may confuse unsupervised algorithms (PCA, K-Means) or domain-specific algorithms (BlazingText for text) with general-purpose supervised classification algorithms, overlooking that XGBoost is the only built-in SageMaker algorithm among the options designed for tabular classification with real-time inference needs.

How to eliminate wrong answers

Option A is wrong because PCA (Principal Component Analysis) is an unsupervised dimensionality reduction algorithm, not a classification algorithm, and cannot predict churn from a binary label. Option C is wrong because K-Means is an unsupervised clustering algorithm used for grouping data, not for supervised classification tasks like churn prediction. Option D is wrong because BlazingText is optimized for text classification and word embeddings, not for tabular data with features like age and purchase history.

27
Multi-Selectmedium

Which TWO of the following are examples of supervised learning tasks that can be performed using Amazon SageMaker built-in algorithms?

Select 2 answers
A.Principal Component Analysis (PCA)
B.XGBoost
C.Linear Learner
D.Latent Dirichlet Allocation (LDA)
E.K-Means
AnswersB, C

XGBoost is a supervised gradient boosting algorithm.

Why this answer

XGBoost is a supervised learning algorithm that uses gradient-boosted decision trees for regression, classification, and ranking tasks. Amazon SageMaker's built-in XGBoost algorithm is optimized for distributed training and directly supports labeled training data, making it a correct example of a supervised learning task.

Exam trap

AWS often tests the distinction between supervised and unsupervised learning by listing algorithms like PCA, LDA, and K-Means alongside supervised ones, trapping candidates who recognize the algorithm names but forget their learning paradigm.

28
MCQhard

A company wants to deploy a real-time inference endpoint for a custom model on SageMaker. The model has high latency (100ms) and they need to handle variable traffic with spikes. Which deployment strategy is most cost-effective?

A.Deploy on a SageMaker multi-model endpoint
B.Use batch transform
C.Deploy on a single SageMaker endpoint with automatic scaling
D.Use a single large instance type
AnswerC

Automatic scaling adds instances based on load, providing cost-effective handling of variable traffic.

Why this answer

Option C is correct because a single SageMaker endpoint with automatic scaling allows the endpoint to dynamically adjust the number of instances based on traffic patterns, handling variable traffic and spikes cost-effectively. For a model with 100ms latency, automatic scaling can add instances during spikes and remove them during low traffic, ensuring you only pay for the compute resources you use while maintaining low inference latency.

Exam trap

The trap here is that candidates often confuse multi-model endpoints with cost-effective scaling for a single model, not realizing that multi-model endpoints are designed for hosting many models, not for handling variable traffic for one model with high latency.

How to eliminate wrong answers

Option A is wrong because a multi-model endpoint is designed to host multiple models on a shared instance to reduce costs, but it does not inherently handle high-latency models (100ms) well under variable traffic spikes, as it can lead to resource contention and increased latency. Option B is wrong because batch transform is an asynchronous, offline inference method for processing large datasets in batches, not suitable for real-time inference endpoints that require immediate responses. Option D is wrong because using a single large instance type is not cost-effective for variable traffic with spikes; you would over-provision for peak traffic and pay for idle capacity during low traffic, whereas automatic scaling adjusts resources dynamically.
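The scaling setup in Option C can be sketched as the two requests that Application Auto Scaling expects for a SageMaker endpoint variant. The helper name, endpoint name, capacity limits, and target value are illustrative assumptions. The dictionaries would normally be passed to a boto3 `application-autoscaling` client through `register_scalable_target` and `put_scaling_policy`.

```python
def build_autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                               min_instances=1, max_instances=4,
                               invocations_per_instance=70.0):
    """Build target-tracking auto-scaling requests for one endpoint variant."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"

    scalable_target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,   # floor during quiet periods
        "MaxCapacity": max_instances,   # ceiling during spikes
    }
    scaling_policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": 60,     # react quickly to spikes
            "ScaleInCooldown": 300,     # remove capacity more cautiously
        },
    }
    return scalable_target, scaling_policy


# "churn-endpoint" is a hypothetical endpoint name.
target, policy = build_autoscaling_requests("churn-endpoint")
print(target["ResourceId"])
```

A short scale-out cooldown with a longer scale-in cooldown is one common way to absorb spikes without thrashing; the right values depend on the workload.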

29
MCQmedium

During model training, the loss decreases rapidly for the first few epochs and then plateaus. The validation loss starts increasing after some epochs. What should the team do to improve generalization?

A.Increase learning rate
B.Early stopping
C.Increase training epochs
D.Add more layers
AnswerB

Early stopping stops training before overfitting occurs.

Why this answer

Validation loss that starts rising after the training loss has plateaued is a classic sign of overfitting. Early stopping (Option B) halts training when validation performance stops improving, preventing the model from memorizing noise in the training data and thereby improving generalization.

Exam trap

AWS often tests the misconception that overfitting is solved by increasing model complexity or training longer, when in fact the opposite is true: early stopping or regularization techniques are required to curb overfitting.

How to eliminate wrong answers

Option A is wrong because increasing the learning rate would cause larger weight updates, potentially overshooting minima and destabilizing training, which does not address overfitting. Option C is wrong because increasing training epochs would allow the model to continue fitting the training data even more closely, exacerbating overfitting rather than improving generalization. Option D is wrong because adding more layers increases model capacity, making it more prone to overfitting on the training data, not less.
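Early stopping can be sketched as a patience counter over the per-epoch validation loss. The function name `early_stopping_epoch`, the patience value, and the loss values are hypothetical; real frameworks expose their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; the weights from best_epoch are the ones to keep.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never triggered: training ran to the final epoch.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses: improving, then rising as overfitting sets in.
val_losses = [0.90, 0.60, 0.45, 0.44, 0.47, 0.52, 0.60]
best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(best_epoch, stop_epoch)  # → 3 5
```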

30
Multi-Selectmedium

Which TWO of the following are best practices for data preprocessing in machine learning? (Select TWO.)

Select 2 answers
A.Use cross-validation to evaluate model performance
B.Always split data 80/20 for training and testing
C.One-hot encoding for ordinal categories
D.Feature scaling for gradient-based algorithms
E.Drop duplicate records only if they are manual entry errors
AnswersA, D

Cross-validation provides a more reliable estimate of model generalization.

Why this answer

Cross-validation is a best practice for evaluating model performance because it provides a more robust estimate of how the model will generalize to unseen data by partitioning the data into multiple training and validation sets. This reduces the variance associated with a single train-test split and helps detect overfitting, making it a standard technique in machine learning workflows.

Exam trap

AWS often tests the misconception that one-hot encoding is universally applicable to all categorical data, but the trap here is that candidates forget ordinal categories have a natural order that one-hot encoding discards, leading to loss of information and potentially worse model performance.
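For the feature-scaling option, a minimal sketch of standardization (zero mean, unit variance) using only the standard library is shown below. The function name and the sample columns are hypothetical. In a real pipeline the mean and standard deviation would be computed on the training split only and then reused for validation and test data.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
ages = [22, 35, 47, 58, 63]
incomes = [28_000, 52_000, 61_000, 90_000, 120_000]

scaled_ages = standardize(ages)
scaled_incomes = standardize(incomes)
print(scaled_ages)
print(scaled_incomes)
```

After scaling, both columns contribute gradients of comparable magnitude, which is why gradient-based optimizers tend to converge more reliably.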

31
MCQmedium

A retail company uses a machine learning model to forecast daily product demand. The model is a time series model that uses historical sales data. The model has been performing well, but recently the forecasts have been consistently too low, leading to stockouts. The data scientist notices that the model was trained on data up to last year, and the company has since launched a successful marketing campaign that increased sales by 20%. The data scientist needs to update the model to reflect the new sales patterns. Which approach should the data scientist take?

A.Add a feature for the marketing campaign and continue using the old model.
B.Switch to a different model type, such as ARIMA, without retraining.
C.Multiply the model's predictions by 1.2 to account for the marketing campaign.
D.Retrain the model on the most recent data that includes the sales from the marketing campaign.
AnswerD

Retraining on recent data allows the model to learn the new sales pattern and improve forecasts.

Why this answer

Option D is correct because retraining the model on the most recent data that includes the sales from the marketing campaign allows the model to learn the new underlying pattern in the time series. Since the model is a time series model, it relies on historical patterns to make forecasts; retraining on data that captures the 20% sales lift ensures the model adapts to the new demand level, reducing the persistent underforecasting.

Exam trap

The trap here is that candidates may think a simple multiplicative adjustment (Option C) is sufficient, but AWS tests the understanding that time series models must be retrained on the new distribution to maintain forecast accuracy, as static adjustments ignore changes in the underlying data-generating process.

How to eliminate wrong answers

Option A is wrong because simply adding a feature for the marketing campaign without retraining the model does not update the model's learned parameters; the old model's weights are still based on pre-campaign data, so it cannot properly incorporate the new pattern. Option B is wrong because switching to a different model type like ARIMA without retraining means the new model has no learned parameters from the data, and ARIMA requires fitting to the specific time series to estimate its autoregressive and moving average components. Option C is wrong because multiplying predictions by a constant factor (1.2) assumes a uniform multiplicative effect that may not hold across all products or time periods, and it does not address potential changes in seasonality, trend, or other dynamics introduced by the campaign.

32
MCQhard

A data scientist wants to perform automatic model tuning (hyperparameter optimization) on SageMaker. They need to find the best hyperparameters for a gradient boosting model. Which strategy is BEST for this task?

A.Random search
B.Grid search
C.Exhaustive search
D.Bayesian optimization
AnswerD

Uses a probabilistic model to select hyperparameters, achieving better results with fewer iterations.

Why this answer

Bayesian optimization is the best strategy for automatic model tuning on SageMaker because it builds a probabilistic model of the objective function and uses it to select the most promising hyperparameters to evaluate next. This approach is far more sample-efficient than random or grid search, making it ideal for expensive-to-evaluate models like gradient boosting, where each training run consumes significant time and compute resources.

Exam trap

AWS often tests the misconception that exhaustive or grid search is the most thorough and therefore best approach. The trap is that this ignores the practical constraints of compute cost and time, which make Bayesian optimization the superior choice for automatic model tuning in SageMaker.

How to eliminate wrong answers

Option A is wrong because random search, while better than grid search for high-dimensional spaces, does not use past evaluation results to inform future trials, making it less efficient than Bayesian optimization for finding optimal hyperparameters. Option B is wrong because grid search exhaustively evaluates all combinations of a predefined set of hyperparameter values, which is computationally prohibitive for gradient boosting models with many continuous hyperparameters and does not scale well. Option C is wrong because exhaustive search is essentially a synonym for grid search and suffers from the same curse of dimensionality, making it impractical for hyperparameter optimization in SageMaker's automatic model tuning context.
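As a rough sketch, the Bayesian strategy is selected in the `HyperParameterTuningJobConfig` portion of a tuning job request. The helper name, job limits, and parameter ranges below are illustrative assumptions for an XGBoost-style model, and the dictionary is only built here rather than submitted.

```python
def build_tuning_config(max_jobs=20, max_parallel=2):
    """Build a tuning-job config that uses Bayesian optimization."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:auc",
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # Range bounds are passed as strings in this API.
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.3",
                 "ScalingType": "Logarithmic"},
            ],
            "IntegerParameterRanges": [
                {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
            ],
        },
    }


config = build_tuning_config()

# For comparison: a grid with 10 values for each of 3 hyperparameters
# would need 10 ** 3 training jobs, far more than the budget above.
grid_jobs = 10 ** 3
print(config["Strategy"], config["ResourceLimits"]["MaxNumberOfTrainingJobs"], grid_jobs)
```

Keeping `MaxParallelTrainingJobs` low gives the optimizer more completed results to learn from before it proposes the next candidates.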

33
MCQhard

A financial services company needs to deploy a real-time fraud detection model with sub-100ms inference latency. The model is a large ensemble requiring 8 GB of memory per request. The workload has bursty traffic. Which Amazon SageMaker deployment strategy best meets these requirements?

A.Deploy behind an Application Load Balancer with multiple ml.m5.xlarge EC2 instances running the model
B.Use a single ml.r5.2xlarge instance with an auto-scaling policy based on CPU utilization
C.Use a SageMaker multi-model endpoint with ml.m5.large instances to cache multiple models
D.Use SageMaker asynchronous inference with a large batch size
AnswerB

A real-time endpoint with a large instance and auto-scaling handles bursty traffic and meets latency requirements.

Why this answer

Option B is correct because an ml.r5.2xlarge instance is memory-optimized and provides 64 GiB of memory, comfortably above the 8 GB per request requirement, and SageMaker real-time endpoints with auto-scaling based on CPU utilization can dynamically adjust to bursty traffic while maintaining sub-100ms inference latency. This approach avoids the overhead of load balancers or multi-model caching that could introduce latency.

Exam trap

The trap here is that candidates may assume multi-model endpoints (Option C) are suitable for large models, but they are designed for many small models sharing memory, not for a single large ensemble requiring 8 GB per request.

How to eliminate wrong answers

Option A is wrong because deploying behind an Application Load Balancer with multiple ml.m5.xlarge instances adds network hop latency and does not leverage SageMaker's native endpoint routing, potentially exceeding the sub-100ms requirement; also, ml.m5.xlarge instances have 16 GiB of memory, which leaves little headroom once an 8 GB request is in flight. Option C is wrong because SageMaker multi-model endpoints are designed for serving multiple smaller models from a shared instance, not for a single large ensemble requiring 8 GB per request, and ml.m5.large instances have only 8 GiB of memory, which leaves no headroom for the workload. Option D is wrong because SageMaker asynchronous inference is intended for non-real-time, large payloads with minutes of latency, not sub-100ms real-time fraud detection, and batching would increase latency beyond the requirement.

34
MCQhard

Refer to the exhibit. A user tries to create a training job that reads data from a bucket named 'my-bucket'. The job fails with an access denied error. What is the most likely cause?

A.The s3:GetObject action is restricted to a specific bucket but the bucket ARN is incorrect
B.The sagemaker:CreateTrainingJob action is not allowed for the specific instance type
C.The s3:GetObject permission is missing the bucket ARN for listing objects
D.The IAM role does not have permission to write to the bucket
AnswerC

Without s3:ListBucket permission, the training job may fail when trying to list objects at the bucket level.

Why this answer

Option C is correct because an 'access denied' error when reading training data from S3 typically means the IAM role used by SageMaker lacks a required permission. Object-level actions such as s3:GetObject are evaluated against object ARNs (arn:aws:s3:::my-bucket/*), while listing the bucket's contents requires s3:ListBucket on the bucket ARN itself (arn:aws:s3:::my-bucket). A policy that grants only the object-level permission leaves out the bucket-level resource needed for listing, so the job is denied when it enumerates the input prefix.

Exam trap

AWS often tests the distinction between a missing bucket-level permission and an incorrectly formatted ARN, leading candidates to choose Option A when the real issue is that the bucket ARN needed for listing objects is absent from the policy.

How to eliminate wrong answers

Option A is wrong because the problem is a missing bucket-level permission, not a mistyped ARN on the s3:GetObject statement. Option B is wrong because an instance-type restriction (for example, through the sagemaker:InstanceTypes condition key) would deny the CreateTrainingJob call itself rather than the job's read from S3. Option D is wrong because the failure occurs while reading data; the IAM role does not need write permission on the bucket to read from it.
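The split between bucket-level and object-level resources can be sketched as a policy document with two statements. The helper name `s3_read_policy` is hypothetical, and the policy is a minimal illustration rather than a reconstruction of the exhibit.

```python
import json


def s3_read_policy(bucket):
    """Build a minimal read-only policy for one bucket.

    s3:ListBucket is granted on the bucket ARN, while s3:GetObject is
    granted on the objects inside it (the ARN ending in /*).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy_doc = s3_read_policy("my-bucket")
print(json.dumps(policy_doc, indent=2))
```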

35
MCQeasy

A startup is building a recommendation engine for their e-commerce platform. They need a fully managed service that can generate personalized product recommendations based on user behavior. Which AWS service should they use?

A.Amazon Personalize
B.Amazon Rekognition
C.Amazon Forecast
D.Amazon Comprehend
AnswerA

Personalize is designed specifically for personalization and recommendations.

Why this answer

Amazon Personalize is a fully managed machine learning service specifically designed to generate real-time personalized product recommendations by processing user behavior data (e.g., clicks, purchases, views) and item metadata. It uses the same technology that powers Amazon.com's recommendation engine, making it the correct choice for this e-commerce use case.

Exam trap

AWS often tests the distinction between its AI services by presenting a use case that sounds like 'forecasting' or 'analysis' but actually requires personalization, leading candidates to confuse Amazon Forecast (time-series) with Amazon Personalize (recommendations).

How to eliminate wrong answers

Option B (Amazon Rekognition) is wrong because it is a computer vision service for image and video analysis (e.g., object detection, facial recognition), not for generating product recommendations. Option C (Amazon Forecast) is wrong because it is a time-series forecasting service for predicting future metrics (e.g., demand, sales), not for personalized recommendations based on user behavior. Option D (Amazon Comprehend) is wrong because it is a natural language processing (NLP) service for extracting insights from text (e.g., sentiment, entities), not for recommendation generation.

36
MCQhard

A data scientist needs to preprocess categorical data with high cardinality (e.g., zip code with 50,000 unique values). Which technique is most appropriate?

A.Target encoding
B.Label encoding
C.Ordinal encoding
D.One-hot encoding
AnswerA

Target encoding replaces categories with the mean of the target variable, handling high cardinality effectively.

Why this answer

Option A is correct because target encoding replaces each category with a statistic of the target variable (typically its mean), keeping a single column while capturing predictive power. One-hot encoding (D) would create roughly 50,000 sparse features. Label encoding (B) and ordinal encoding (C) both assign integers that imply an order the zip codes do not have.
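A minimal sketch of target (mean) encoding with optional smoothing is shown below. The function name, the smoothing scheme, and the sample zip codes are illustrative assumptions. To avoid target leakage, the mapping should be computed from training data only.

```python
from collections import defaultdict


def target_encode(categories, targets, smoothing=0.0):
    """Map each category to the (optionally smoothed) mean of its targets.

    With smoothing > 0, rare categories are pulled toward the global mean,
    which reduces overfitting on categories seen only a few times.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }


# Hypothetical training rows: zip code and a binary churn label.
zip_codes = ["10001", "10001", "94105", "94105", "94105", "60601"]
churned = [1, 0, 1, 1, 1, 0]

encoding = target_encode(zip_codes, churned)
print(encoding)
```

However many distinct zip codes appear, the encoded feature remains a single numeric column.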

37
Multi-Selecteasy

Which TWO services can be used to preprocess data for machine learning in AWS? (Choose two.)

Select 2 answers
A.AWS Glue
B.Amazon Athena
C.Amazon SageMaker Data Wrangler
D.Amazon Redshift
E.AWS Lambda
AnswersA, C

Glue provides ETL capabilities suitable for preprocessing.

Why this answer

AWS Glue is a fully managed ETL service that can be used to preprocess data for machine learning by cleaning, transforming, and enriching raw data before feeding it into ML models. It provides built-in transforms and can handle both structured and semi-structured data, making it suitable for preparing large datasets for training.

Exam trap

AWS often tests the distinction between data querying services (like Athena) and data preprocessing services, leading candidates to mistakenly choose Athena because it can 'process' data via SQL, but it lacks the ML-specific transformation capabilities required for preprocessing.

38
MCQhard

A company is training a deep learning model on Amazon SageMaker using a custom Docker container. The training job fails with the error 'CannotStartContainerError: API error (500): failed to create shim task'. The team verifies that the container image is compatible with the selected instance type. What is the most likely cause of this error?

A.The instance type does not have enough memory for the container
B.The training data is stored in the wrong S3 bucket
C.The container image does not have the correct entry point
D.The GPU drivers are outdated
AnswerA

Insufficient memory is a common cause of container startup failures.

Why this answer

The error 'CannotStartContainerError: API error (500): failed to create shim task' typically occurs when the Docker container cannot be initialized due to resource constraints, most commonly insufficient memory on the selected instance type. Even if the container image is compatible with the instance, the container's memory request may exceed the available memory, causing the container runtime (containerd) to fail when creating the shim task. This is a known issue in SageMaker when the training job's resource requirements are not aligned with the instance's capacity.

Exam trap

The trap here is that candidates may attribute the error to image compatibility or entry point issues, but AWS tests the understanding that container runtime errors like 'failed to create shim task' are often resource-related (memory or disk) rather than configuration or driver issues.

How to eliminate wrong answers

Option B is wrong because the training data being in the wrong S3 bucket would cause a data access error (e.g., 'NoSuchBucket' or 'AccessDenied'), not a container runtime error like 'CannotStartContainerError'. Option C is wrong because an incorrect entry point would result in a 'ContainerEntrypointError' or a process exit code error, not a shim task creation failure, which is a lower-level container runtime issue. Option D is wrong because outdated GPU drivers would cause a CUDA or GPU-related error (e.g., 'CUDA_ERROR_NO_DEVICE' or 'Driver/library mismatch'), not a generic container shim task failure, and SageMaker manages driver compatibility for supported instance types.

39
MCQmedium

A company is using Amazon Rekognition to detect objects in images. They need to detect custom objects that are specific to their domain. What should they do?

A.Use Amazon Rekognition's built-in labels
B.Use Amazon SageMaker Object Detection algorithm
C.Use Amazon Rekognition Custom Labels
D.Use Amazon Comprehend
AnswerC

Custom Labels enables training a custom model with labeled images.

Why this answer

Amazon Rekognition Custom Labels allows you to train a custom model using your own labeled images to detect domain-specific objects that are not covered by Rekognition's built-in labels. This is the correct service for custom object detection without needing to build a model from scratch.

Exam trap

The trap here is that candidates may confuse Amazon Rekognition Custom Labels with SageMaker Object Detection, not realizing that Custom Labels is a managed service specifically designed for custom image analysis without requiring ML expertise.

How to eliminate wrong answers

Option A is wrong because built-in labels are pre-trained on general categories and cannot detect custom domain-specific objects. Option B is wrong because Amazon SageMaker Object Detection algorithm requires you to build, train, and deploy a custom model from scratch, which is more complex and not the recommended approach when Rekognition Custom Labels can handle the task with less effort. Option D is wrong because Amazon Comprehend is a natural language processing (NLP) service for text analysis, not for image object detection.

40
MCQeasy

A company needs to store large amounts of unstructured training data (images, videos) in a cost-effective manner while ensuring low-latency retrieval for training jobs running on Amazon SageMaker. Which storage solution should be used?

A.Amazon EFS
B.Amazon S3
C.Amazon RDS
D.Amazon EBS
AnswerB

S3 is the best fit for storing unstructured data with low-latency access via S3 endpoints.

Why this answer

Amazon S3 is the correct choice because it is designed for cost-effective, scalable storage of unstructured data (images, videos) and integrates natively with Amazon SageMaker for low-latency data retrieval during training jobs. S3 provides high throughput and can be accessed directly from SageMaker training instances without the need for file system mounting, making it ideal for large-scale ML workloads.

Exam trap

The trap here is that candidates often confuse the need for low-latency retrieval with the need for a mounted file system (EFS or EBS), not realizing that S3's direct integration with SageMaker provides both low latency and high throughput for training workloads without the cost and complexity of file storage.

How to eliminate wrong answers

Option A is wrong because Amazon EFS is a file system that provides shared access for EC2 instances but is not optimized for the high-throughput, cost-effective storage of large unstructured datasets like images and videos; it also incurs higher costs per GB compared to S3 and can introduce latency overhead when used with SageMaker. Option C is wrong because Amazon RDS is a relational database service designed for structured data with SQL queries, not for storing unstructured training data such as images and videos, and it would be prohibitively expensive and inefficient for large-scale blob storage. Option D is wrong because Amazon EBS provides block-level storage volumes attached to a single EC2 instance, which is not suitable for sharing large datasets across multiple SageMaker training jobs and lacks the cost efficiency and scalability of object storage for unstructured data.

41
MCQmedium

Refer to the exhibit. A data scientist attaches the above IAM policy to a SageMaker notebook instance role. The notebook is in the same AWS account as the S3 bucket. When trying to read a file from 's3://my-bucket/training/data.csv', the data scientist gets an Access Denied error. What is the most likely cause?

A.The file name contains spaces
B.The policy does not grant 's3:ListBucket' permission
C.The S3 bucket is in a different Region
D.The policy allows 's3:PutObject' which is not needed
AnswerB

Without 's3:ListBucket', any call that lists the bucket fails, and S3 returns Access Denied (403) instead of Not Found (404) for a missing key.

Why this answer

The most likely cause is that the IAM policy attached to the SageMaker notebook instance role does not include the 's3:ListBucket' permission. A plain `get_object` call needs only 's3:GetObject', but many higher-level tools list the bucket or prefix before fetching the object, and that listing is authorized by 's3:ListBucket' on the bucket ARN. In addition, when the caller lacks 's3:ListBucket', S3 returns 403 Access Denied rather than 404 Not Found for a key that does not exist, so a mistyped key also surfaces as Access Denied.

Exam trap

AWS often tests the distinction between the object-level permission 's3:GetObject' (granted on 'arn:aws:s3:::my-bucket/*') and the bucket-level permission 's3:ListBucket' (granted on 'arn:aws:s3:::my-bucket'). Reading data in practice usually needs both.

How to eliminate wrong answers

Option A is wrong because spaces in file names are valid in S3 and would not cause an Access Denied error; they would be URL-encoded automatically. Option C is wrong because the S3 bucket is in the same AWS account as the notebook, and cross-Region access does not cause Access Denied errors—it works normally with proper permissions. Option D is wrong because having an unnecessary permission like 's3:PutObject' does not cause an Access Denied error; it simply grants more access than needed.
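
The exhibit is not reproduced here, so the following is only an illustrative sketch of a policy that pairs the bucket-level 's3:ListBucket' action with the object-level 's3:GetObject' action. The bucket name comes from the question's S3 URI; the helper name `build_read_policy` and the `training/` prefix scoping are assumptions made for the example.

```python
import json

# Bucket name taken from the question's S3 URI (s3://my-bucket/training/data.csv).
BUCKET = "my-bucket"


def build_read_policy(bucket: str, prefix: str = "") -> dict:
    """Return an IAM policy document that allows listing a bucket and reading objects under a prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so the resource is the bucket ARN.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # s3:GetObject is an object-level action, so the resource is an object ARN pattern.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = build_read_policy(BUCKET, "training/")
print(json.dumps(policy, indent=2))
```

Note that the two statements target different resources: attaching 's3:ListBucket' to the object ARN pattern would not grant the listing permission.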

42
MCQeasy

A company wants to build a system that automatically categorizes customer support tickets into predefined categories (e.g., billing, technical, account). The team has a large dataset of historical tickets with their category labels. Which type of machine learning problem is this?

A.Regression
B.Binary classification
C.Multi-class classification
D.Clustering
AnswerC

The problem involves predicting one of several discrete categories using labeled training data.

Why this answer

This is a multi-class classification problem because the model must assign each support ticket to one of three or more predefined categories (e.g., billing, technical, account). The dataset provides labeled historical tickets, making it a supervised learning task, and the output is a discrete class label from a set of more than two categories, which distinguishes it from binary classification.

Exam trap

AWS often tests the distinction between binary and multi-class classification by presenting a scenario with multiple categories but implying a simple yes/no decision, leading candidates to mistakenly choose binary classification when the number of classes exceeds two.

How to eliminate wrong answers

Option A is wrong because regression predicts continuous numerical values (e.g., ticket resolution time), not discrete categories. Option B is wrong because binary classification only handles two classes (e.g., spam vs. not spam), whereas this problem involves three or more categories. Option D is wrong because clustering is an unsupervised learning technique that groups data without using predefined labels, while this problem uses labeled historical data for supervised learning.

43
MCQhard

A healthcare company is using Amazon SageMaker to deploy a model that makes predictions on patient data. They need to ensure that the model's predictions are explainable to comply with regulations. Which approach should they take?

A.Use SageMaker Model Monitor to track predictions
B.Use SageMaker Experiments to log model parameters
C.Use SageMaker Clarify to generate feature importance and explanations
D.Use SageMaker Debugger to analyze training gradients
AnswerC

Clarify provides model explainability, including SHAP and partial dependence plots.

Why this answer

SageMaker Clarify is specifically designed to provide model explainability, including feature importance and SHAP-based explanations, which are essential for regulatory compliance in healthcare. It helps stakeholders understand why a model made a particular prediction, addressing transparency requirements.

Exam trap

AWS often tests the distinction between monitoring (Model Monitor), tracking (Experiments), debugging (Debugger), and explainability (Clarify). The trap here is confusing operational monitoring with the interpretable explanations that compliance frameworks require.

How to eliminate wrong answers

Option A is wrong because SageMaker Model Monitor is used for detecting data drift and model quality degradation over time, not for generating per-prediction explanations. Option B is wrong because SageMaker Experiments tracks and organizes model training runs and parameters, but does not produce explainability reports for individual predictions. Option D is wrong because SageMaker Debugger monitors training metrics and gradients to debug training issues, not to explain model predictions post-deployment.

44
MCQmedium

A marketing agency wants to analyze customer feedback from social media posts to gauge sentiment. They have no labeled data and limited ML expertise. The team needs a managed service that provides pre-trained models for sentiment analysis without requiring them to train or manage infrastructure. They also need to process text in multiple languages. Which AWS service should they use?

A.Use Amazon Comprehend with its default sentiment analysis model
B.Use Amazon SageMaker to train a custom sentiment analysis model
C.Use AWS Glue to build a custom NLP pipeline
D.Use Amazon Rekognition for text analysis
AnswerA

Comprehend provides pre-trained models that work out of the box for sentiment analysis.

Why this answer

Amazon Comprehend is a fully managed natural language processing (NLP) service that provides pre-trained models for sentiment analysis, key phrase extraction, and language detection. It requires no labeled data, no model training, and no infrastructure management, making it ideal for teams with limited ML expertise. Comprehend natively supports multiple languages, including Spanish, French, German, and many others, directly addressing the requirement to process text in multiple languages.

Exam trap

The trap here is that candidates may confuse Amazon Rekognition (image/video analysis) with text analysis services, or assume that any AWS ML service (like SageMaker or Glue) can handle NLP tasks without recognizing the specific managed service designed for unstructured text.

How to eliminate wrong answers

Option B is wrong because Amazon SageMaker is a platform for building, training, and deploying custom ML models, which requires labeled data, ML expertise, and infrastructure management—contradicting the requirements for a pre-trained, managed service with no training. Option C is wrong because AWS Glue is a serverless data integration and ETL service, not an NLP service; it cannot perform sentiment analysis or provide pre-trained models. Option D is wrong because Amazon Rekognition is a computer vision service for analyzing images and videos, not for text analysis or sentiment detection.

45
MCQeasy

A company is using Amazon SageMaker to train a model. They want to automatically stop training if the model performance stops improving on a validation dataset. Which SageMaker feature should they enable?

A.Early stopping in hyperparameter tuning
B.SageMaker Experiments
C.SageMaker Debugger
D.SageMaker Model Monitor
AnswerA

Early stopping terminates poorly performing training jobs based on validation metrics.

Why this answer

Option A is correct because Amazon SageMaker's hyperparameter tuning jobs support an 'early stopping' feature that automatically halts a training job when its objective metric on the validation dataset is unlikely to improve on earlier jobs. It is enabled by setting `TrainingJobEarlyStoppingType` to `Auto` in the tuning job configuration (the default, `Off`, disables it). SageMaker then compares each job's objective metric against the median of the running averages of earlier jobs at the same epoch and stops jobs that fall behind, preventing wasted compute.

Exam trap

AWS often tests the distinction between monitoring (Debugger) and automated stopping (early stopping in hyperparameter tuning). Candidates mistakenly choose Debugger because it 'monitors' training, but the feature built to stop jobs whose validation metric has stopped improving is early stopping in hyperparameter tuning.

How to eliminate wrong answers

Option B is wrong because SageMaker Experiments is a feature for organizing, tracking, and comparing ML runs (e.g., trials and components), not for automatically stopping training based on validation performance. Option C is wrong because SageMaker Debugger monitors tensors, training metrics, and system resources in real time to diagnose training problems; acting on its findings depends on configuring rules and rule actions for each job, and it is not the validation-metric early stopping that the question describes. Option D is wrong because SageMaker Model Monitor is designed for detecting data drift and quality issues in deployed models, not for controlling the training lifecycle or stopping training jobs.
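
To make the idea of early stopping concrete, here is a minimal sketch of a median stopping check in plain Python. It is a simplified illustration of the general technique, not SageMaker's implementation; the function name `should_stop` and the sample metric curves are assumptions made for the example.

```python
from statistics import median


def should_stop(current_curve, completed_curves, higher_is_better=True):
    """Simplified median stopping check for one running training job.

    current_curve: validation metric per epoch for the running job so far.
    completed_curves: per-epoch metric lists from earlier, finished jobs.
    """
    step = len(current_curve)
    if step == 0:
        return False  # no metric reported yet
    # Running average of each earlier job's metric up to the same epoch.
    baselines = [sum(c[:step]) / step for c in completed_curves if len(c) >= step]
    if not baselines:
        return False  # nothing to compare against yet
    threshold = median(baselines)
    if higher_is_better:
        return max(current_curve) < threshold
    return min(current_curve) > threshold


# Hypothetical validation accuracies from three finished jobs.
completed = [[0.60, 0.70, 0.78], [0.55, 0.68, 0.75], [0.62, 0.72, 0.80]]

print(should_stop([0.40, 0.45], completed))  # True: best value is below the median baseline
print(should_stop([0.61, 0.74], completed))  # False: the job is keeping up with earlier jobs
```

The check only stops a job once earlier jobs provide a baseline, which is why early stopping pays off most in tuning jobs that launch many training jobs.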

46
Multi-Selectmedium

Which THREE are SageMaker built-in algorithms suitable for regression tasks?

Select 3 answers
A.Linear Learner
B.K-Means
C.PCA
D.DeepAR
E.XGBoost
AnswersA, D, E

Linear Learner supports regression.

Why this answer

Linear Learner is a SageMaker built-in algorithm that supports both regression and classification tasks. For regression, it models the target variable as a linear combination of input features, optimizing for metrics like mean squared error. It is suitable for regression because it directly outputs continuous values.

Exam trap

AWS often tests the distinction between supervised and unsupervised algorithms. The trap here is that candidates may confuse dimensionality reduction (PCA) or clustering (K-Means) with regression, assuming any algorithm that processes numeric data can perform regression.

47
MCQeasy

A data scientist wants to host a pre-trained model on Amazon SageMaker for real-time inference with minimal latency. Which approach should they use?

A.Run inference using AWS Lambda with the model packaged as a container
B.Use SageMaker batch transform
C.Create a SageMaker asynchronous inference endpoint
D.Deploy the model on a SageMaker real-time endpoint
AnswerD

Real-time endpoints are designed for low-latency, synchronous inference.

Why this answer

Option D is correct because SageMaker real-time endpoints are designed for low-latency, synchronous inference. They keep the model loaded and ready to respond to individual requests, making them ideal for real-time applications where minimal latency is critical.

Exam trap

AWS often tests the distinction between synchronous (real-time) and asynchronous inference patterns. The trap here is that candidates may confuse 'asynchronous inference' with 'real-time' because both serve requests through an endpoint, but only real-time endpoints respond synchronously with low latency for individual predictions.

How to eliminate wrong answers

Option A is wrong because AWS Lambda has a maximum execution timeout of 15 minutes and limited memory (up to 10 GB), making it unsuitable for hosting large pre-trained models that require persistent, low-latency inference. Option B is wrong because SageMaker batch transform is an asynchronous, offline process for processing large datasets in batches, not for real-time inference with minimal latency. Option C is wrong because SageMaker asynchronous inference endpoints are designed for requests with large payloads and longer processing times, where immediate response is not required; they introduce queuing and processing delays that are incompatible with minimal latency requirements.

48
Multi-Selecthard

A data engineer is using Amazon SageMaker Data Wrangler to prepare tabular data for ML. Which THREE data transformations are natively supported? (Choose three.)

Select 3 answers
A.One-hot encoding for categorical features
B.Audio feature extraction
C.Text vectorization using TF-IDF
D.Custom Python code via Pandas or Spark
E.Image resizing and normalization
AnswersA, C, D

One-hot encoding is a built-in transform in Data Wrangler.

Why this answer

Option A is correct because Amazon SageMaker Data Wrangler includes built-in support for one-hot encoding as a native transformation for categorical features. This transformation automatically creates binary columns for each category, which is essential for preparing tabular data for machine learning models that require numerical input.

Exam trap

AWS often tests the distinction between natively supported transformations in SageMaker Data Wrangler versus those requiring external services or custom scripts, leading candidates to mistakenly select audio or image processing options that are not part of Data Wrangler's built-in capabilities.
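
As an illustration of what a one-hot encoding transform produces, the following plain-Python sketch builds one binary column per category. It does not use Data Wrangler itself; the function name `one_hot_encode` and the sample category values are assumptions made for the example.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row has a 1 in the column of its category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


cats, encoded = one_hot_encode(["billing", "technical", "billing", "account"])
print(cats)     # ['account', 'billing', 'technical']
print(encoded)  # [[0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```

Each input value becomes a row with exactly one 1, so a feature with many distinct categories produces many columns.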

49
Multi-Selectmedium

Which AWS services can be used to build, train, and deploy custom machine learning models? (Choose two.)

Select 2 answers
A.Amazon Polly
B.Amazon Lex
C.AWS Deep Learning AMIs
D.Amazon Rekognition
E.Amazon SageMaker
AnswersC, E

Deep Learning AMIs provide a customizable environment for building and training models.

Why this answer

AWS Deep Learning AMIs (C) are pre-configured Amazon Machine Images that include popular deep learning frameworks (TensorFlow, PyTorch, MXNet) and GPU drivers, allowing you to build, train, and deploy custom ML models on EC2 instances. Amazon SageMaker (E) is a fully managed service that provides end-to-end capabilities for building, training, and deploying custom ML models at scale, with built-in algorithms, automatic model tuning, and one-click deployment.

Exam trap

The trap here is that candidates confuse pre-built AI services (Polly, Lex, Rekognition) with platforms that allow custom model development, leading them to select services that only consume pre-trained models rather than build and train custom ones.

50
Multi-Selectmedium

Which TWO techniques are commonly used to prevent overfitting in machine learning models? (Select TWO.)

Select 2 answers
A.Add more irrelevant features
B.Use cross-validation
C.Increase model complexity
D.Reduce the amount of training data
E.Use regularization
AnswersB, E

Cross-validation helps assess model generalization and can indicate overfitting.

Why this answer

Cross-validation helps prevent overfitting by partitioning the training data into multiple folds, training the model on different subsets, and validating on held-out portions. This provides a more robust estimate of model performance on unseen data and reduces the risk of memorizing noise in a single train-test split.

Exam trap

AWS often tests the misconception that more features or more model complexity always improve performance. In fact, adding irrelevant features, increasing complexity, or reducing the training data all make overfitting more likely.
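
The fold partitioning behind cross-validation can be sketched in a few lines of plain Python. This is a minimal, non-shuffled illustration; the function name `k_fold_indices` is an assumption, and in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, non-shuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

Every sample lands in exactly one validation fold, so the averaged validation score reflects performance on data the model did not see during that round of training.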

51
MCQhard

An ML engineer wants to store training data in a format optimized for linear data scanning and columnar access in SageMaker. Which format is most appropriate?

A.JSON
B.Image (JPEG/PNG)
C.Parquet
D.CSV
AnswerC

Parquet is columnar and optimized for analytical queries.

Why this answer

Parquet is a columnar storage format optimized for both linear data scanning and columnar access, making it ideal for training data in SageMaker. It reduces I/O by storing data by columns rather than rows, enabling efficient retrieval of specific features during model training.

Exam trap

AWS often tests the misconception that CSV is the most efficient format for training data, but Parquet's columnar storage and compression provide superior performance for linear scanning and columnar access in distributed ML pipelines.

How to eliminate wrong answers

Option A is wrong because JSON is a row-oriented text format that requires full parsing for columnar access, leading to high I/O overhead and slower linear scans. Option B is wrong because image formats like JPEG/PNG are binary and designed for visual data, not structured tabular data, and lack columnar access capabilities. Option D is wrong because CSV is a row-oriented text format that, while simple, requires scanning entire rows to access specific columns and lacks compression and schema optimization.

52
MCQeasy

In a binary classification problem, the model predicts the majority class for all inputs. What is this issue called?

A.High bias
B.Overfitting
C.High variance
D.Underfitting
AnswerA

Predicting the majority class for all inputs indicates the model has high bias and is underfitting.

Why this answer

When a model predicts the majority class for all inputs, it indicates that the model is too simplistic and fails to capture the underlying patterns in the data. This is a classic symptom of high bias, where the model makes strong assumptions about the data distribution, leading to systematic underperformance on the minority class. In machine learning, high bias often results from an overly simple algorithm or insufficient model capacity, causing the model to underfit the training data.

Exam trap

AWS often tests the distinction between 'high bias' and 'underfitting' as separate concepts, where underfitting is the symptom and high bias is the cause, so candidates may incorrectly select underfitting when the question explicitly asks for the name of the issue.

How to eliminate wrong answers

Option B (Overfitting) is wrong because overfitting occurs when the model learns noise and details from the training data too well, resulting in high variance and poor generalization, not a constant prediction of the majority class. Option C (High variance) is wrong because high variance typically leads to models that are overly sensitive to small fluctuations in the training data, producing different predictions for similar inputs, not a uniform majority class output. Option D (Underfitting) is a related concept but is not the specific term for the issue described; underfitting refers to the model's inability to capture the training data's patterns, which can cause high bias, but the question asks for the name of the issue itself, which is high bias.

53
MCQeasy

Which metric is most appropriate for evaluating a classification model when false positives are costly?

A.Precision
B.F1 score
C.Recall
D.Accuracy
AnswerA

Precision is the fraction of true positives among predicted positives, addressing false positives.

Why this answer

Precision is the most appropriate metric when false positives are costly because it measures the proportion of true positive predictions among all positive predictions (TP / (TP + FP)). A high precision indicates that when the model predicts a positive class, it is very likely correct, minimizing the number of false positives. This directly aligns with the business requirement to avoid costly false alarms.

Exam trap

AWS often tests the distinction between precision and recall by framing a cost scenario. The trap here is that candidates confuse 'costly false positives' with 'costly false negatives' and incorrectly choose recall or F1 score without analyzing which error type is being penalized.

How to eliminate wrong answers

Option B (F1 score) is wrong because it is the harmonic mean of precision and recall, balancing both false positives and false negatives; it does not specifically penalize false positives more heavily. Option C (Recall) is wrong because it measures the proportion of actual positives correctly identified (TP / (TP + FN)), which is useful when false negatives are costly, not false positives. Option D (Accuracy) is wrong because it considers overall correct predictions (TP + TN) divided by total predictions, which can be misleading in imbalanced datasets and does not isolate the cost of false positives.
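
The formulas above can be checked with a short worked example. The sketch below computes all four metrics from confusion-matrix counts; the function name `classification_metrics` and the counts are hypothetical values chosen for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical imbalanced dataset: 60 actual positives out of 1,000 samples.
m = classification_metrics(tp=40, fp=10, fn=20, tn=930)
print(" ".join(f"{name}={value:.2f}" for name, value in m.items()))
# precision=0.80 recall=0.67 f1=0.73 accuracy=0.97
```

Accuracy looks excellent here because true negatives dominate, while precision isolates how often a positive prediction is wrong, which is the quantity that matters when false positives are costly.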

54
MCQhard

Refer to the exhibit. A data scientist ran a training job on Amazon SageMaker and it failed. Which action should the data scientist take FIRST to resolve the issue?

A.Request a service limit increase for the instance type
B.Use a different AWS region
C.Enable spot training
D.Use a different instance type that is available in the region
AnswerD

The error indicates that the requested instance type is not available in the Region; switching to an available type resolves it.

Why this answer

Option D is correct because the error indicates that the requested instance type is not available in the current region due to capacity constraints. The first step is to switch to a different instance type that is available in the same region, as this is the quickest and most direct way to resolve the provisioning failure without requiring service limit increases or changing regions.

Exam trap

AWS often tests the distinction between capacity unavailability (which requires switching instance types) and service limits (which require a limit increase), leading candidates to mistakenly request a limit increase when the real issue is temporary capacity constraints.

How to eliminate wrong answers

Option A is wrong because a service limit increase addresses the maximum number of instances you can run, not the immediate unavailability of a specific instance type in the region. Option B is wrong because using a different AWS region is a more drastic step that may introduce latency, data residency issues, or additional costs; the first action should be to try an alternative instance type within the same region. Option C is wrong because enabling spot training does not resolve the unavailability of the instance type; spot instances still require available capacity for the requested instance type.

55
MCQeasy

A company wants to use AI to automatically transcribe customer service calls into text. Which AWS service is most suitable?

A.Amazon Transcribe
B.Amazon Comprehend
C.Amazon Polly
D.Amazon Rekognition
AnswerA

Transcribe is designed for speech-to-text conversion.

Why this answer

Amazon Transcribe is the correct choice because it is a fully managed automatic speech recognition (ASR) service designed specifically to convert speech into text. It can handle real-time streaming or batch processing of audio files, making it ideal for transcribing customer service calls into searchable text.

Exam trap

The trap here is that candidates often confuse Amazon Transcribe (speech-to-text) with Amazon Polly (text-to-speech) or assume Amazon Comprehend can process audio directly, when in fact Comprehend only works on text input.

How to eliminate wrong answers

Option B is wrong because Amazon Comprehend is a natural language processing (NLP) service used for extracting insights like sentiment, entities, and key phrases from text, not for transcribing audio. Option C is wrong because Amazon Polly is a text-to-speech (TTS) service that converts text into lifelike speech, the opposite of the required speech-to-text functionality. Option D is wrong because Amazon Rekognition is a computer vision service for analyzing images and videos, such as object detection and facial recognition, and has no capability to process audio or transcribe speech.

56
Multi-Selecthard

A company is using Amazon Fraud Detector to detect fraudulent transactions. Which TWO actions can be taken to improve model accuracy? (Select TWO.)

Select 2 answers
A.Increase the volume of event data
B.Deploy the model to multiple endpoints
C.Use a different detector type
D.Use a different model version
E.Select event variables that are more predictive
AnswersA, E

More data can help the model learn better patterns.

Why this answer

Increasing the volume of event data provides Amazon Fraud Detector with more examples of both fraudulent and legitimate transactions, which allows the model to learn more robust patterns and reduce overfitting. More data helps the model generalize better to unseen events, directly improving prediction accuracy.

Exam trap

AWS often tests the misconception that changing model versions or detector types alone improves accuracy, when in reality accuracy improvements come from more data or more predictive features.

57
MCQeasy

A company wants to automatically detect anomalies in their AWS CloudTrail logs to identify potential security threats. Which AWS service is specifically designed for this purpose?

A.Amazon Macie
B.AWS Config
C.Amazon GuardDuty
D.Amazon Inspector
AnswerC

GuardDuty uses ML to detect anomalies in CloudTrail logs and other sources.

Why this answer

Amazon GuardDuty is a threat detection service that continuously monitors AWS accounts and workloads using machine learning, anomaly detection, and integrated threat intelligence. It specifically analyzes CloudTrail management and data events, VPC Flow Logs, and DNS logs to identify unauthorized behavior or potential security threats, making it the correct choice for automatically detecting anomalies in CloudTrail logs.

Exam trap

AWS often tests the distinction between services that detect threats (GuardDuty) versus services that protect data (Macie), assess vulnerabilities (Inspector), or track configuration compliance (Config), leading candidates to confuse their primary use cases.

How to eliminate wrong answers

Option A is wrong because Amazon Macie is a data security and data privacy service that uses machine learning to discover, classify, and protect sensitive data stored in Amazon S3, not to analyze CloudTrail logs for security threats. Option B is wrong because AWS Config is a service that evaluates and records resource configurations and compliance against desired policies, not designed for real-time anomaly detection in log data. Option D is wrong because Amazon Inspector is a vulnerability management service that scans EC2 instances and container images for software vulnerabilities and unintended network exposure, not for analyzing CloudTrail logs.

58
Multi-Selecteasy

Which TWO of the following are types of feature scaling?

Select 2 answers
A.One-hot encoding
B.Principal Component Analysis (PCA)
C.Standardization
D.Binning
E.Normalization (Min-Max)
AnswersC, E

Standardization (Z-score) is a common feature scaling method.

Why this answer

Standardization (Z-score scaling) transforms each feature to have a mean of 0 and a standard deviation of 1, making it a valid type of feature scaling. It is useful for algorithms that are sensitive to feature scale, such as SVM, PCA, and regularized linear models, and unlike min-max normalization it does not bound the data to a fixed range.

Exam trap

AWS often tests the distinction between feature scaling (changing the numeric range of features) and data transformation techniques like encoding or dimensionality reduction, leading candidates to confuse one-hot encoding or PCA with scaling methods.
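
The two scaling methods can be compared side by side in plain Python. This is a minimal sketch using the population standard deviation; the function names `standardize` and `min_max` and the sample data are assumptions made for the example.

```python
from statistics import mean, pstdev


def standardize(xs):
    """Z-score scaling: subtract the mean, divide by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0 for _ in xs]  # constant feature carries no scale information
    return [(x - mu) / sigma for x in xs]


def min_max(xs):
    """Min-max normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, population std dev 2
print(standardize(data))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print([round(v, 3) for v in min_max(data)])
```

Standardized values are unbounded and centered on zero, whereas min-max values always fall between 0 and 1.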

59
MCQeasy

A team trained a deep learning model that achieves 99% accuracy on training data but only 70% on validation data. What is the most likely issue?

A.Underfitting
B.Overfitting
C.Data leakage
D.Feature scaling
AnswerB

Overfitting occurs when the model learns training data too well, including noise, failing to generalize to validation data.

Why this answer

The model performs exceptionally well on training data (99% accuracy) but significantly worse on validation data (70% accuracy). This large gap indicates the model has memorized the training data, including noise and irrelevant patterns, rather than learning generalizable features — a classic symptom of overfitting.

Exam trap

AWS often tests the distinction between overfitting and underfitting by presenting a scenario where training accuracy is high but validation accuracy is low, tempting candidates to incorrectly choose underfitting if they focus only on the low validation score.

How to eliminate wrong answers

Option A is wrong because underfitting would show poor performance on both training and validation data, not high training accuracy with low validation accuracy. Option C is wrong because data leakage typically causes both training and validation accuracy to be artificially high, not a large gap between them. Option D is wrong because feature scaling issues would generally affect model convergence or performance uniformly across datasets, not create a specific training-validation accuracy disparity.
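The elimination logic above can be written as a small heuristic. This is only a sketch: the function name and both thresholds (`low` and `max_gap`) are arbitrary assumptions, not standard values.

```python
def diagnose(train_acc, val_acc, low=0.80, max_gap=0.10):
    """Rough fit diagnosis from training and validation accuracy.

    low and max_gap are illustrative thresholds chosen for this example.
    """
    if train_acc < low and val_acc < low:
        return "underfitting"  # poor on both sets
    if train_acc - val_acc > max_gap:
        return "overfitting"   # large train/validation gap
    return "ok"


print(diagnose(0.99, 0.70))  # the scenario in the question
print(diagnose(0.62, 0.60))
```

With the numbers from the question (99% training, 70% validation), the gap check is the one that fires.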

60
Multi-Selectmedium

Which THREE of the following are capabilities of Amazon SageMaker? (Select THREE.)

Select 3 answers
A.Real-time inference endpoints
B.Automatic model tuning (hyperparameter optimization)
C.On-premises training only
D.Built-in algorithms for common tasks
E.Can only deploy models to EC2 instances
AnswersA, B, D

SageMaker offers real-time inference with managed endpoints.

Why this answer

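Amazon SageMaker provides real-time inference endpoints that allow you to deploy trained models to a fully managed HTTPS endpoint for low-latency predictions. These endpoints can be configured to scale automatically with traffic and support A/B testing, making them suitable for production workloads. Automatic model tuning (hyperparameter optimization) and built-in algorithms for common tasks are likewise SageMaker capabilities.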

Exam trap

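AWS often tests the misconception that SageMaker is limited to a single environment or deployment target. SageMaker training runs in the AWS cloud rather than only on premises, and trained models can be deployed in several ways, including real-time endpoints, batch transform, serverless inference, and edge devices.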

61
MCQhard

A company is using Amazon SageMaker to train a large language model with hundreds of billions of parameters. The model does not fit into the memory of a single GPU. Which approach should they use to train the model efficiently?

A.Use a larger instance with more GPU memory, such as p4d.24xlarge
B.Use SageMaker's data parallelism strategy
C.Use SageMaker's model parallelism strategy with the SageMaker distributed training library
D.Reduce the model size by pruning layers until it fits into memory
AnswerC

Model parallelism splits the model across GPUs, enabling training of very large models.

Why this answer

Option C is correct because SageMaker's model parallelism strategy with the SageMaker distributed training library is specifically designed for training large models that do not fit into the memory of a single GPU. It partitions the model layers across multiple GPUs, enabling efficient training of models with hundreds of billions of parameters by overlapping computation and communication.

Exam trap

AWS often tests the distinction between data parallelism and model parallelism, and the trap here is that candidates may confuse data parallelism (which splits data, not the model) as a solution for models that don't fit in memory, when in fact model parallelism is required for such cases.

How to eliminate wrong answers

Option A is wrong because even the largest GPU instances like p4d.24xlarge have limited GPU memory (40 GB per A100 GPU), which is insufficient for a model with hundreds of billions of parameters; scaling vertically is not feasible for such large models. Option B is wrong because SageMaker's data parallelism strategy replicates the entire model on each GPU and splits the data across GPUs, which requires the model to fit into a single GPU's memory; it does not solve the memory constraint issue. Option D is wrong because pruning layers to reduce model size would degrade model quality and is not a practical or efficient approach for training large language models; the goal is to train the full model, not a smaller version.
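A back-of-the-envelope calculation shows why scaling up a single GPU cannot work. The sketch below assumes a hypothetical 175-billion-parameter model stored in 16-bit precision (2 bytes per parameter) and counts the weights only; gradients, optimizer state, and activations would add considerably more. The 40 GB per-GPU figure comes from the explanation above.

```python
import math

GB = 1e9  # decimal gigabyte, to keep the arithmetic simple

# Assumptions for illustration only
num_parameters = 175e9      # hypothetical model size
bytes_per_parameter = 2     # 16-bit weights
gpu_memory_gb = 40          # per A100 GPU, as stated in the explanation

# Memory needed just to hold the weights
weights_gb = num_parameters * bytes_per_parameter / GB

# Lower bound on the number of GPUs the model must be split across
min_gpus = math.ceil(weights_gb / gpu_memory_gb)

print(f"weights: {weights_gb:.0f} GB, at least {min_gpus} GPUs")
```

Even this weights-only lower bound exceeds one GPU many times over, which is why the model itself has to be partitioned rather than replicated.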

62
MCQhard

A SageMaker endpoint is configured with automatic scaling. The model's inference time is 50ms, and traffic increases gradually. What scaling metric should be used to add instances before latency increases?

A.Memory utilization
B.Concurrent requests
C.CPU utilization
D.Invocations per instance
AnswerD

Invocations per instance directly measures the load per instance, allowing proactive scaling before latency rises.

Why this answer

D is correct because 'Invocations per instance' (the predefined metric SageMakerVariantInvocationsPerInstance) measures the average number of inference requests per minute that each instance behind the endpoint is handling. By setting a target value for this metric, a target-tracking scaling policy adds instances as the per-instance request count approaches the threshold, before request queuing pushes latency up. This is the commonly recommended approach for SageMaker endpoints with gradual traffic increases, because it tracks demand directly rather than reacting to latency spikes.

Exam trap

The trap here is that candidates often choose 'Concurrent requests' (Option B) thinking it directly measures load, but the standard predefined metric for target-tracking scaling of a SageMaker endpoint is 'Invocations per instance', which normalizes load per instance and enables proactive scaling.

How to eliminate wrong answers

Option A is wrong because memory utilization is not a reliable indicator of inference load; a model's memory footprint stays roughly constant as traffic grows, so scaling on memory would not prevent latency from increasing due to request queuing. Option B is wrong because 'Concurrent requests' is not the standard predefined metric for this scenario; the expected metric is 'Invocations per instance', which normalizes request load across the number of instances. Option C is wrong because CPU utilization does not map directly to request load: it can spike for reasons unrelated to traffic, and a model served on a GPU can saturate while CPU stays low, so scaling on CPU may add instances too late or unnecessarily.
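To make the target value concrete, the sketch below derives one from the 50 ms inference time and builds a target-tracking policy as a plain dictionary. The endpoint name, variant name, policy name, cooldowns, and safety factor of 0.5 are assumptions for illustration, as is the simplification that an instance serves one request at a time. The keys follow the parameters of the Application Auto Scaling `put_scaling_policy` call; nothing is sent to AWS here.

```python
def invocations_target(max_rps_per_instance, safety_factor=0.5):
    """Per-minute target: sustainable requests/second * safety factor * 60."""
    return max_rps_per_instance * safety_factor * 60


def scaling_policy(endpoint_name, variant_name, target_value):
    """Build target-tracking policy parameters (not submitted anywhere)."""
    return {
        "PolicyName": "invocations-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_value,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": 60,   # seconds; illustrative value
            "ScaleInCooldown": 300,   # seconds; illustrative value
        },
    }


inference_ms = 50                     # from the question
max_rps = 1000 / inference_ms         # 20 requests/second if served one at a time
target = invocations_target(max_rps)  # invocations per instance per minute

policy = scaling_policy("my-endpoint", "AllTraffic", target)
print(target)
print(policy["ResourceId"])
```

In practice the sustainable request rate would come from a load test rather than from the inference time alone, and the variant would need to be registered as a scalable target before a policy like this is attached.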

63
Multi-Selectmedium

A company wants to use Amazon SageMaker Ground Truth to build a labeled dataset for a custom object detection model. Which TWO labeling strategies are available? (Choose two.)

Select 2 answers
A.Private workforce labeling (company employees)
B.Crowd-based labeling using Amazon Mechanical Turk
C.Automated labeling using pre-trained models
D.Active learning with manual verification
E.Fully automated labeling via AWS Lambda
AnswersA, B

Private workforce uses the company's own employees for labeling.

Why this answer

Amazon SageMaker Ground Truth supports private workforce labeling where company employees (e.g., via a corporate directory or invited users) perform manual annotation. This is ideal for sensitive data or domain-specific tasks like custom object detection, where internal expertise ensures high label accuracy. Ground Truth also supports a public crowd workforce through Amazon Mechanical Turk, which suits large volumes of non-sensitive data.

Exam trap

AWS often tests the distinction between labeling workforces (private, vendor, or Amazon Mechanical Turk) and labeling features (like automated data labeling or active learning), causing candidates to pick automated labeling as a workforce option when it is actually a feature that can be enabled on a labeling job.

64
MCQeasy

A data scientist is preparing data for a machine learning model. What is the purpose of splitting the data into training, validation, and test sets?

A.To tune hyperparameters
B.To balance class distributions
C.To prevent overfitting during training
D.To evaluate model generalization
AnswerD

The test set provides an unbiased estimate of performance on new data.

Why this answer

The test set is used to evaluate the final model's generalization performance on unseen data. The validation set is used for hyperparameter tuning during development. The training set is used for fitting the model.
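A minimal sketch of a three-way split follows, using only the standard library. The 70/15/15 proportions, the seed, and the function name are assumptions for illustration; the right proportions depend on the dataset.

```python
import random


def split_dataset(rows, train_pct=70, val_pct=15, seed=42):
    """Shuffle rows and split them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility

    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100

    train_set = shuffled[:n_train]                 # used to fit the model
    val_set = shuffled[n_train:n_train + n_val]    # used to tune hyperparameters
    test_set = shuffled[n_train + n_val:]          # held out for the final evaluation
    return train_set, val_set, test_set


rows = list(range(1000))  # stand-in for 1,000 records
train_set, val_set, test_set = split_dataset(rows)
print(len(train_set), len(val_set), len(test_set))
```

The three sets are disjoint, so the test set stays unseen until the final evaluation.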

65
MCQeasy

A data scientist wants to quickly build a supervised learning model for binary classification on a tabular dataset with 10,000 rows and 200 features. The dataset has some missing values and requires minimal code. Which AWS service should the data scientist use?

A.Amazon SageMaker Studio Lab
B.Amazon SageMaker Clarify
C.Amazon SageMaker Autopilot
D.Amazon SageMaker JumpStart
AnswerC

Autopilot automates model building for tabular data.

Why this answer

Amazon SageMaker Autopilot is the correct choice because it automatically performs data preprocessing (including handling missing values), feature engineering, model selection, and hyperparameter tuning for supervised learning tasks like binary classification. It requires minimal code—users can simply point to a tabular dataset in Amazon S3 and specify the target column, and Autopilot will automatically train and evaluate multiple candidate models, making it ideal for quickly building a binary classifier on a 10,000-row, 200-feature dataset with missing values.

Exam trap

AWS often tests the distinction between automated ML services (Autopilot) and model hosting or development environments (Studio Lab, JumpStart), so the trap here is that candidates may confuse SageMaker Autopilot with SageMaker JumpStart, thinking JumpStart also automates model building, when in fact JumpStart only provides pre-built models and requires manual configuration.

How to eliminate wrong answers

Option A is wrong because Amazon SageMaker Studio Lab is a free ML development environment that provides JupyterLab notebooks and limited compute resources, but it does not automate model building or handle missing values; the user must write all code manually. Option B is wrong because Amazon SageMaker Clarify is designed for bias detection, model explainability, and fairness analysis, not for building or training supervised learning models; it cannot handle missing values or perform automated model selection. Option D is wrong because Amazon SageMaker JumpStart provides pre-built models and solutions for transfer learning and fine-tuning, but it does not automatically preprocess missing values or perform automated model selection for tabular binary classification; the user must select and configure a model manually.

66
MCQmedium

A company wants to automatically detect anomalies in server metrics. Which algorithm is most appropriate?

A.XGBoost
B.One-class SVM
C.Linear SVM
D.K-Means
AnswerB

One-class SVM is commonly used for anomaly detection by learning a boundary around normal data.

Why this answer

One-class SVM is specifically designed for anomaly detection, as it learns a boundary around the normal data points in the feature space and identifies any point falling outside this boundary as an anomaly. This makes it ideal for detecting unusual patterns in server metrics without requiring labeled anomaly examples.

Exam trap

AWS often tests the distinction between supervised and unsupervised learning, and the trap here is that candidates may choose XGBoost or Linear SVM because they are familiar with them for classification, forgetting that anomaly detection typically requires a one-class approach when only normal data is available.

How to eliminate wrong answers

Option A is wrong because XGBoost is a supervised ensemble learning algorithm used for classification and regression, not for unsupervised anomaly detection; it requires labeled training data and is not designed to identify outliers without prior examples. Option C is wrong because Linear SVM is a supervised binary classifier that separates data into two classes using a hyperplane, and it cannot perform one-class anomaly detection without negative samples. Option D is wrong because K-Means is an unsupervised clustering algorithm that partitions data into clusters based on distance, but it does not inherently detect anomalies; while outliers can be inferred from cluster distances, it is not a dedicated anomaly detection method and lacks the statistical boundary learning of one-class SVM.

67
MCQmedium

A company is using Amazon Rekognition to detect objects in images. They find that the service sometimes mislabels objects. What is the best way to improve accuracy for their specific use case?

A.Use a larger image size
B.Contact AWS support
C.Increase the confidence threshold
D.Use Amazon SageMaker to build a custom model
AnswerD

A custom model trained on domain-specific data can significantly improve accuracy.

Why this answer

Amazon Rekognition is a pre-trained service that may not perform optimally for specialized or domain-specific use cases. By using Amazon SageMaker to build a custom model, you can train a model on your own labeled dataset, which directly addresses the mislabeling issue by tailoring the model to your specific images and objects.

Exam trap

The trap here is that candidates often assume increasing the confidence threshold is a universal fix for accuracy issues, but the AIF-C01 exam tests the understanding that pre-trained services have limitations and that custom training (via SageMaker) is required for domain-specific improvements.

How to eliminate wrong answers

Option A is wrong because using a larger image size does not inherently improve Rekognition's detection accuracy; the service already resizes images to a standard input size, and larger images may only increase processing time without correcting mislabeling. Option B is wrong because contacting AWS support will not modify the underlying pre-trained model or improve its accuracy for your specific use case; support can only assist with service configuration or bugs, not model retraining. Option C is wrong because increasing the confidence threshold reduces false positives but does not fix systematic mislabeling; it may cause the service to return fewer results, potentially missing correct detections, without addressing the root cause of incorrect object identification.

68
MCQmedium

A data science team is using Amazon SageMaker to train multiple models with different hyperparameters. They want to track metrics, compare runs, and reproduce the best result. Which SageMaker feature should they use?

A.SageMaker Model Registry
B.SageMaker Debugger
C.SageMaker Autopilot
D.SageMaker Experiments
AnswerD

Experiments provides a framework for tracking and comparing multiple training runs.

Why this answer

SageMaker Experiments is the correct feature because it is specifically designed to track, organize, and compare machine learning training runs (trials) with different hyperparameters and metrics. It allows data scientists to log parameters, metrics, and artifacts for each run, compare results across runs, and retrieve the exact configuration needed to reproduce the best-performing model.

Exam trap

The trap here is that candidates often confuse SageMaker Experiments with SageMaker Model Registry, mistakenly thinking that model versioning and run tracking are the same feature, when in fact Experiments focuses on the iterative training process and Registry focuses on the final model lifecycle.

How to eliminate wrong answers

Option A is wrong because SageMaker Model Registry is a catalog for managing and versioning trained models, not for tracking and comparing individual training runs or hyperparameter experiments. Option B is wrong because SageMaker Debugger monitors training jobs in real time for issues like vanishing gradients or overfitting, but it does not provide a structured way to log, compare, or reproduce runs with different hyperparameters. Option C is wrong because SageMaker Autopilot automatically explores different algorithms and hyperparameters to find the best model, but it does not give the team the ability to manually track, compare, and reproduce their own custom runs with specific hyperparameters.

69
MCQhard

A manufacturing company is deploying IoT sensors to monitor equipment performance. The sensors generate continuous unlabeled time-series data with thousands of dimensions. The goal is to detect anomalies indicating potential failures in real time. The data science team has experience with unsupervised learning and wants to use a SageMaker built-in algorithm that can handle high-dimensional data and identify outliers. They also need to reduce the number of dimensions to improve training speed without losing important information. Which approach should they take?

A.Use Amazon SageMaker Linear Learner algorithm
B.Use Amazon SageMaker Random Cut Forest algorithm
C.Use Amazon SageMaker Image Classification algorithm
D.Use Amazon SageMaker Object Detection algorithm
AnswerB

Random Cut Forest is an unsupervised anomaly detection algorithm suited for high-dimensional data.

Why this answer

Amazon SageMaker Random Cut Forest (RCF) is a built-in unsupervised algorithm specifically designed for anomaly detection on high-dimensional time-series data. It works by constructing an ensemble of random trees to isolate outliers, making it ideal for the unlabeled, continuous sensor data described. Additionally, RCF inherently handles high-dimensional data without requiring explicit dimensionality reduction, as it randomly samples features at each split, effectively reducing the impact of irrelevant dimensions while preserving anomaly detection accuracy.

Exam trap

The trap here is that candidates may confuse Random Cut Forest with a dimensionality reduction technique like PCA, but RCF does not reduce dimensions—it randomly samples features per tree to handle high-dimensional data without explicit reduction, while still identifying outliers effectively.

How to eliminate wrong answers

Option A is wrong because Amazon SageMaker Linear Learner is a supervised algorithm used for regression or classification, requiring labeled data, and it does not natively perform anomaly detection or dimensionality reduction. Option C is wrong because Amazon SageMaker Image Classification is a supervised algorithm designed for classifying images, not for unsupervised anomaly detection on time-series data. Option D is wrong because Amazon SageMaker Object Detection is a supervised algorithm for identifying and localizing objects within images, which is irrelevant to unlabeled time-series sensor data.

70
MCQmedium

A company is training a deep learning model on Amazon SageMaker using a large dataset stored in S3. Training jobs are frequently failing with 'OutOfMemoryError'. The training algorithm uses PyTorch. How should the data scientist solve this without reducing model accuracy?

A.Use SageMaker Pipe mode for data ingestion
B.Reduce the number of layers in the model
C.Increase the batch size
D.Use a smaller instance type with less memory
AnswerA

Pipe mode streams data directly, reducing memory footprint and preventing OutOfMemoryError.

Why this answer

SageMaker Pipe mode streams training data directly from S3 into the algorithm without first downloading it to the local disk, which drastically reduces memory consumption. This allows the model to handle large datasets that would otherwise cause an OutOfMemoryError when using the default File mode, all while preserving the original model architecture and accuracy.

Exam trap

AWS often tests the misconception that reducing model complexity or instance size is the only way to fix memory errors, when in fact data ingestion mode changes (like Pipe mode) can resolve the issue without sacrificing accuracy or performance.

How to eliminate wrong answers

Option B is wrong because reducing the number of layers in the model would decrease model capacity and likely reduce accuracy, which violates the requirement to not reduce model accuracy. Option C is wrong because increasing the batch size would increase memory usage per training step, exacerbating the OutOfMemoryError rather than solving it. Option D is wrong because using a smaller instance type with less memory would make the memory problem worse, not better, and would likely lead to even more frequent failures.

71
MCQhard

During a SageMaker training job, the data scientist observes that the loss is not decreasing after the initial few epochs. The model is a deep neural network with ReLU activations. Which hyperparameter adjustment is most likely to help?

A.Reduce the learning rate
B.Increase the number of epochs
C.Increase the learning rate
D.Decrease the batch size
AnswerA

A lower learning rate can allow the optimizer to find a better minimum.

Why this answer

When loss plateaus after a few epochs with ReLU activations, the model is likely stuck in a region where gradients are small (e.g., near a local minimum or plateau). Reducing the learning rate allows the optimizer to take smaller steps, which can help it navigate out of flat regions and continue decreasing the loss. This is a standard technique to improve convergence when training stalls.

Exam trap

AWS often tests the misconception that increasing the learning rate will accelerate convergence, but in plateau scenarios it can instead cause divergence or oscillation, making the reduction of learning rate the correct adjustment.

How to eliminate wrong answers

Option B is wrong because increasing the number of epochs does not address the underlying issue of the optimizer being unable to escape a plateau; it would simply continue training with no improvement. Option C is wrong because increasing the learning rate would likely cause the optimizer to overshoot the minimum or oscillate, potentially worsening the loss plateau. Option D is wrong because decreasing the batch size introduces more noise into gradient estimates, which can destabilize training and does not directly help when the loss is stuck on a plateau.
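The idea of lowering the learning rate when the loss stalls can be automated. The class below is a simplified, framework-free sketch in the spirit of schedulers such as PyTorch's `ReduceLROnPlateau`; the class name, default values, and the loss sequence are all assumptions for illustration.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` stalled epochs."""

    def __init__(self, lr, factor=0.5, patience=2, min_delta=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_delta = min_delta   # minimum improvement that counts as progress
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, loss):
        """Record one epoch's loss and return the learning rate to use next."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= self.factor
                self.stale_epochs = 0
        return self.lr


scheduler = ReduceOnPlateau(lr=0.1)
losses = [1.0, 0.6, 0.5, 0.5, 0.5, 0.5]  # loss stops improving after epoch 3
history = [scheduler.step(loss) for loss in losses]
print(history)
```

Here the learning rate is halved once the loss has failed to improve for two consecutive epochs, which matches the scenario of a loss that stops decreasing after the first few epochs.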

72
Multi-Selectmedium

A data scientist is preparing data for a classification task. Which TWO techniques are commonly used for handling missing values? (Choose two.)

Select 2 answers
A.Label encoding
B.Normalization
C.Imputing with mean
D.Dropping rows with any missing values
E.One-hot encoding
AnswersC, D

Mean imputation replaces missing values with the mean of the column.

Why this answer

Imputing with the mean is a common technique for handling missing values in numerical features because it preserves the column mean without reducing the dataset size, although it does shrink the column's variance. This method replaces each missing entry with the arithmetic mean of the non-missing values in that column, which is simple to implement and works well when data is missing completely at random (MCAR). Dropping rows with any missing values is the other common approach; it is even simpler but discards data, so it fits best when only a small share of rows is affected.
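Both techniques from the correct answers fit in a few lines. In this sketch missing values are represented as `None`, and the function names and sample data are assumptions for illustration.

```python
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def drop_incomplete_rows(rows):
    """Keep only the rows that have no missing (None) values."""
    return [row for row in rows if None not in row]


# Hypothetical numeric column with one missing entry
cpu_usage = [10.0, None, 30.0, 20.0]
imputed = impute_mean(cpu_usage)

# Hypothetical two-feature dataset with missing entries in two rows
rows = [(1.0, 2.0), (None, 5.0), (3.0, None), (4.0, 6.0)]
complete = drop_incomplete_rows(rows)

print(imputed)
print(complete)
```

Imputation keeps all four entries of the column, while dropping rows cuts the second dataset from four rows to two.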

Exam trap

AWS often tests the distinction between data preprocessing techniques (e.g., encoding, scaling) and missing value handling, so candidates mistakenly select label encoding or normalization because they are common preprocessing steps, even though they do not address missing data.

73
MCQmedium

Refer to the exhibit. A SageMaker training job fails with an 'AccessDenied' error when trying to read files from the S3 bucket 'my-training-data'. The IAM role used by the training job has the policy shown. What is the most likely reason for the failure?

A.The bucket policy requires encryption in transit
B.The training job is using the wrong AWS region
C.The policy does not include the s3:PutObject action
D.The policy does not include the s3:ListBucket action
AnswerD

Without ListBucket, SageMaker cannot list the contents of the bucket to verify object existence.

Why this answer

The IAM policy grants s3:GetObject but not s3:ListBucket. When a SageMaker training job reads files from S3, the SageMaker SDK or framework (e.g., TensorFlow, PyTorch) often performs a ListBucket call first to enumerate objects in the prefix. Without s3:ListBucket, the SDK cannot discover the files, resulting in an AccessDenied error even though GetObject is allowed.

Exam trap

AWS often tests the misconception that only s3:GetObject is needed for reading from S3, but the SDK's underlying ListBucket call is required for object discovery, especially when using prefixes or manifest files.

How to eliminate wrong answers

Option A is wrong because the scenario points at the IAM role's policy, not at a bucket policy condition, and nothing in the question suggests the job is reading from S3 without encryption in transit. Option B is wrong because nothing in the scenario indicates a region mismatch, and the missing permission in the role's policy already explains the AccessDenied error. Option C is wrong because s3:PutObject is not needed for reading files; the training job only requires read permissions (GetObject and ListBucket) to fetch training data.
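For reference, a read-only policy that grants both permissions might look like the sketch below. It is not a reconstruction of the exhibit, which is not reproduced here; only the bucket name comes from the question. Note that s3:ListBucket applies to the bucket ARN, while s3:GetObject applies to the objects inside it.

```python
import json

BUCKET = "my-training-data"  # bucket name from the question

# Illustrative read-only policy for the training job's IAM role
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing is a bucket-level action, so the resource is the bucket itself
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            # Reading is an object-level action, so the resource is the bucket's objects
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```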

74
MCQmedium

A data scientist is using Amazon SageMaker to train a model. The training job is taking longer than expected. Which change would most likely reduce training time?

A.Increase the number of training epochs
B.Use a larger batch size
C.Use a smaller instance type
D.Enable spot training
AnswerB

A larger batch size processes more samples per iteration, reducing the number of steps and overall time, provided the hardware supports it.

Why this answer

Using a larger batch size allows the model to process more training samples per iteration, which reduces the number of weight updates needed per epoch and can improve hardware utilization (e.g., GPU parallelism). This often leads to faster training times, provided the batch size fits within memory constraints and does not degrade model convergence.

Exam trap

AWS often tests the misconception that reducing instance size or enabling spot instances directly improves training speed, when in fact these changes primarily affect cost or resource availability, not performance.

How to eliminate wrong answers

Option A is wrong because increasing the number of training epochs increases the total number of passes over the data, which would lengthen training time, not reduce it. Option C is wrong because using a smaller instance type reduces compute capacity (e.g., fewer vCPUs, less memory), which typically slows down training rather than speeding it up. Option D is wrong because enabling spot training (using Amazon EC2 Spot Instances) reduces cost but does not inherently reduce training time; it may even cause interruptions that delay completion.
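The effect of batch size on the number of iterations is simple arithmetic. The dataset size and batch sizes below are hypothetical values chosen for illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of weight updates needed for one pass over the data."""
    return math.ceil(num_samples / batch_size)


num_samples = 50_000  # hypothetical dataset size

small_batch_steps = steps_per_epoch(num_samples, 32)
large_batch_steps = steps_per_epoch(num_samples, 256)

print(small_batch_steps, large_batch_steps)
```

Fewer steps per epoch only translates into less wall-clock time when the hardware can process the larger batch in parallel and the batch still fits in memory, as the explanation notes.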

75
MCQmedium

A company uses Amazon SageMaker to train a model. The training job fails with 'InsufficientInstanceCapacity' error. What is the most likely cause?

A.The request rate is too high.
B.The dataset size exceeds the instance storage limit.
C.The requested instance type is not available in the specified region.
D.The training image is not compatible with the instance type.
AnswerC

This error occurs when AWS cannot provision the instance due to capacity constraints.

Why this answer

The 'InsufficientInstanceCapacity' error in Amazon SageMaker indicates that AWS does not currently have enough available capacity for the requested instance type in the specified region or Availability Zone. This is a common transient error when demand for a particular instance type exceeds supply, and it is not related to request rate, dataset size, or image compatibility.

Exam trap

AWS often tests the distinction between capacity errors and throttling errors, so the trap here is confusing 'InsufficientInstanceCapacity' with a rate-limiting or quota error, leading candidates to incorrectly select Option A.

How to eliminate wrong answers

Option A is wrong because 'InsufficientInstanceCapacity' is a capacity error, not a throttling error; an excessive request rate would instead surface as a throttling error such as 'ThrottlingException'. Option B is wrong because a dataset that exceeds the instance's storage would fail later, while data is being downloaded or read, after the instance has already been provisioned. Option D is wrong because an incompatible training image would fail when the container is pulled or started, which also happens only after capacity has been allocated.
