PDE Operationalizing machine learning models — All Questions With Answers

Question 1 (medium, multiple choice)

A company deploys a machine learning model to Vertex AI for real-time predictions. After deployment, they notice that prediction latency spikes during peak traffic hours. Which approach should they take to reduce latency without sacrificing accuracy?

Question 2 (hard, multiple choice)

A data science team uses Vertex AI Pipelines to automate retraining. They want to ensure that only models with performance above a threshold are deployed. Which component should they add to the pipeline?

Question 3 (easy, multiple choice)

A company trains a custom model using TensorFlow and wants to deploy it to Vertex AI for low-latency predictions. The model is large (2 GB). Which deployment option should they choose?

Question 4 (medium, multiple choice)

A company uses Vertex AI to serve a model. They notice that some predictions are incorrect due to data drift. What is the best way to detect and retrain the model automatically?

Question 5 (hard, multiple choice)

A financial services company needs to explain predictions from a complex ensemble model for regulatory compliance. Which Vertex AI service should they use?

Question 6 (easy, multiple choice)

A team wants to retrain a model weekly using new data stored in BigQuery. They want to minimize manual effort. Which approach should they use?

Question 7 (medium, multiple choice)

A company deploys a model to Vertex AI Endpoint. They want to run a canary deployment to test a new model version with 10% of traffic. How should they configure this?
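
As a rough illustration of the 10% canary described above, a traffic split can be thought of as a mapping from deployed model IDs to integer percentages that sum to 100. The sketch below is plain Python and does not call Vertex AI; the function name and the IDs `current` and `candidate` are hypothetical.

```python
def canary_split(stable_id: str, canary_id: str, canary_percent: int) -> dict[str, int]:
    """Build a traffic split that sends canary_percent of requests to the canary."""
    if stable_id == canary_id:
        raise ValueError("stable and canary IDs must differ")
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # The stable model keeps whatever share the canary does not take.
    return {stable_id: 100 - canary_percent, canary_id: canary_percent}


split = canary_split("current", "candidate", 10)
print(split)  # {'current': 90, 'candidate': 10}
```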

Question 8 (hard, multiple choice)

A data scientist uses Vertex AI Workbench notebooks for model development. They want to share the environment with team members while maintaining version control. Which approach should they use?

Question 9 (easy, multiple choice)

A company wants to monitor the performance of a deployed model in production. Which metric indicates that the model's predictions are degrading?

Question 10 (medium, multiple choice)

A team uses Vertex AI AutoML Tables to train a model. They need to deploy the model for real-time predictions with high availability. Which deployment configuration should they use?

Question 11 (hard, multiple choice)

A company uses Vertex AI to serve a model that requires GPU for inference. They want to minimize cost while handling variable traffic. Which strategy should they use?

Question 12 (medium, multi select)

Which TWO steps are required to deploy a custom scikit-learn model to Vertex AI for online predictions?

Question 13 (hard, multi select)

Which THREE factors should be considered when designing a Vertex AI Pipeline for continuous training?

Question 14 (easy, multi select)

Which TWO actions can help reduce prediction latency for a Vertex AI endpoint?

Question 15 (medium, multi select)

Which THREE metrics should be monitored for a deployed machine learning model in production?

Question 16 (hard, multiple choice)

A company has a production machine learning model deployed on Vertex AI Endpoint that predicts customer churn. The model is retrained weekly using a Vertex AI Pipeline that pulls new data from BigQuery. Recently, the model's accuracy has been declining. The data science team suspects data drift but is unsure. They have enabled Vertex AI Model Monitoring but have not set up any alerts. The team wants to diagnose and address the issue quickly. The pipeline runs successfully, and no errors are reported. The model endpoint is serving predictions with average latency of 200ms. What should the team do first?

Question 17 (medium, multiple choice)

A retail company uses a Vertex AI endpoint to serve product recommendations. The model is a TensorFlow model deployed with a custom container. Recently, users have reported that recommendations are stale. The model is retrained daily using Vertex AI Pipelines. The pipeline completes successfully, but the endpoint continues to serve the old model. The team checks the pipeline logs and sees that the new model is uploaded to the Vertex AI Model Registry. The endpoint has traffic split set to 100% for the old model. The team needs to update the endpoint to serve the new model version. What should they do?

Question 18 (medium, multiple choice)

A company has deployed a machine learning model on Vertex AI Prediction that serves real-time predictions for a customer-facing application. The model was trained using a custom container and is hosted on a single endpoint with a minimum number of nodes. Recently, the team noticed that during peak traffic, prediction latency increases significantly and some requests time out. The endpoint is configured with a baseline traffic split of 100% on the current model version. Which action should the team take to reduce latency and improve reliability?

Question 19 (hard, multiple choice)

A data science team is operationalizing a batch prediction job using Vertex AI Batch Prediction. The model uses a custom container that requires a specific GPU for inference. The job processes a large dataset stored in Cloud Storage. The team wants to minimize cost while ensuring the job completes within a 2-hour window. Which configuration should they choose?

Question 20 (easy, multi select)

A company is deploying a machine learning model for fraud detection. The model is trained using TensorFlow and will be served on Vertex AI Prediction. The team wants to implement model monitoring to detect prediction drift. Which TWO actions should they take? (Choose 2)

Question 21 (medium, multiple choice)

A data scientist deploys a new version of a fraud detection model (model2) alongside the existing model (model1) on the same Vertex AI endpoint with a 70/30 traffic split. After 24 hours, the team notices that model2's predictions are significantly different from model1's, and the fraud detection rate has increased. What is the most likely explanation for the change in predictions?

Exhibit

Refer to the exhibit.

```
$ gcloud ai endpoints describe my-endpoint
...
trafficSplit:
  model1: 70
  model2: 30
...
$ gcloud ai models describe model1
...
containerSpec:
  imageUri: us-central1-docker.pkg.dev/my-project/my-repo/model1:v1
  env:
  - name: MODEL_NAME
    value: fraud_detection_v1
...
$ gcloud ai models describe model2
...
containerSpec:
  imageUri: us-central1-docker.pkg.dev/my-project/my-repo/model2:v1
  env:
  - name: MODEL_NAME
    value: fraud_detection_v2
...
```

Question 22 (hard, multiple choice)

You are a machine learning engineer at a FinTech company. Your team has developed a credit risk model using XGBoost and deployed it on Vertex AI Prediction using a custom container. The model is used for real-time credit decisions, and the endpoint is configured with a single machine type (n1-standard-4) and min_replica_count = 2, max_replica_count = 10. Recently, the team observed that during a promotional campaign, the endpoint's prediction latency increased from 200ms to over 2 seconds, and some requests resulted in 503 errors. You check the Cloud Monitoring metrics and see that CPU utilization reached 100% on the existing replicas, but the number of replicas never scaled beyond the initial 2. The deployment uses a custom container that runs a TensorFlow Serving-like model server. The container image is stored in Artifact Registry. The Vertex AI endpoint is configured with a traffic split of 100% to this model version. What is the most likely cause of the scaling failure, and what step should you take to resolve it?

Question 23 (hard, multiple choice)

You have deployed a TensorFlow model on Vertex AI Endpoints with autoscaling. The model receives high traffic during peak hours, but you notice that inference latency increases significantly during cold starts. Which strategy would best minimize cold-start latency without incurring unnecessary cost?

Question 24 (medium, multiple choice)

Your team is using Vertex AI Pipelines to orchestrate a model retraining workflow. The pipeline includes a data validation step, a training step, and a model evaluation step. You want to ensure that if the evaluation step fails due to low model performance, the pipeline stops and does not deploy the model. Which approach should you use?
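
The stop-on-low-performance behaviour described above can be sketched in plain Python: an evaluation gate raises an error when accuracy falls below a threshold, and the runner never reaches the deploy step. This is only a toy model of the control flow, not Vertex AI Pipelines code; the step names, the exception class, and the threshold values are hypothetical.

```python
class EvaluationFailed(Exception):
    """Raised when a model does not meet the required metric."""


def evaluation_gate(accuracy: float, threshold: float) -> float:
    """Return the accuracy unchanged, or raise if it is below the threshold."""
    if accuracy < threshold:
        raise EvaluationFailed(
            f"accuracy {accuracy:.3f} is below threshold {threshold:.3f}"
        )
    return accuracy


def run_pipeline(accuracy: float, threshold: float) -> list[str]:
    """Run the steps in order and return the names of the steps that completed."""
    completed = ["validate_data", "train"]
    try:
        evaluation_gate(accuracy, threshold)
    except EvaluationFailed:
        return completed  # stop here: the deploy step is never reached
    completed += ["evaluate", "deploy"]
    return completed


print(run_pipeline(0.80, 0.85))  # ['validate_data', 'train']
print(run_pipeline(0.90, 0.85))  # ['validate_data', 'train', 'evaluate', 'deploy']
```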

Question 25 (easy, multiple choice)

You are using AI Platform Prediction (now Vertex AI) for online predictions. You notice that some requests are failing with a 503 status code. Which is the most likely cause?

Question 26 (medium, multiple choice)

A retail company uses a machine learning model to predict inventory demand. The model is retrained weekly using Vertex AI Pipelines. Recently, the model's accuracy has degraded because the data distribution has shifted. Which action should you take to monitor and detect this drift automatically?

Question 27 (hard, multiple choice)

You are responsible for deploying a PyTorch model for real-time inference. The model requires GPU acceleration. You want to minimize infrastructure management overhead. Which serving option should you choose?

Question 28 (easy, multiple choice)

A data science team has built a model using scikit-learn. They want to operationalize it on Google Cloud without rewriting the code. Which approach should they take?

Question 29 (medium, multiple choice)

You have a batch prediction job on Vertex AI that processes millions of records. The job is failing with an out-of-memory error. What is the best way to resolve this?

Question 30 (hard, multiple choice)

Your MLOps pipeline uses Vertex AI Pipelines. You want to ensure that model training uses a consistent environment with specific Python package versions. Which approach best achieves this?

Question 31 (medium, multi select)

Which TWO are best practices for monitoring a deployed machine learning model in production on Vertex AI?

Question 32 (hard, multi select)

Which THREE considerations are important when designing a batch prediction pipeline for a large dataset on Vertex AI?

Question 33 (medium, multi select)

Which TWO actions can help reduce the latency of a Vertex AI endpoint serving a large neural network model?

Question 34 (medium, multiple choice)

You configured a model deployment monitor on your Vertex AI endpoint as shown. What will happen when the feature 'age' has a skew of 0.4?

Exhibit

Refer to the exhibit.

```
$ gcloud ai endpoints describe my-endpoint
...
modelDeploymentMonitors:
- model: projects/my-project/models/my-model
  objectiveConfig:
    objectiveType: skew
    skewConfig:
      featureSkewThresholds:
        age: 0.3
        income: 0.2
  alertConfig:
    enableAlerting: true
    alertEmailAddresses:
    - admin@example.com
```
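
To make the threshold comparison in the exhibit concrete, the sketch below checks observed skew values against the per-feature thresholds. It assumes an alert is raised when a feature's skew is strictly above its threshold; the function name and the `income` reading of 0.1 are hypothetical, and no Vertex AI API is called.

```python
# Thresholds copied from the exhibit.
SKEW_THRESHOLDS = {"age": 0.3, "income": 0.2}


def features_over_threshold(
    observed: dict[str, float], thresholds: dict[str, float]
) -> list[str]:
    """Return the features whose observed skew is strictly above their threshold."""
    return sorted(
        name
        for name, value in observed.items()
        if name in thresholds and value > thresholds[name]
    )


# 'age' at 0.4 exceeds its 0.3 threshold; the income value is made up for illustration.
print(features_over_threshold({"age": 0.4, "income": 0.1}, SKEW_THRESHOLDS))  # ['age']
```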

Question 35 (hard, multiple choice)

In the Vertex AI Pipeline component YAML exhibit, the component is designed to evaluate a model and produce metrics. If the threshold_accuracy is set to 0.85, what is the expected behavior of this component?

Exhibit

Refer to the exhibit.

```
# Vertex AI Pipeline component YAML
name: model-evaluation
inputs:
  model_path:
    type: String
  test_data_path:
    type: String
  threshold_accuracy:
    type: Float
    default: 0.85
outputs:
  evaluation_metrics:
    type: Metrics
implementation:
  container:
    image: gcr.io/my-project/eval:latest
    args: [
      --model_path, {inputValue: model_path},
      --test_data_path, {inputValue: test_data_path},
      --threshold_accuracy, {inputValue: threshold_accuracy},
      --output_path, {outputPath: evaluation_metrics}
    ]
```
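
The exhibit does not show what the `eval` image actually runs, so the following is only a guess at the shape of such an entrypoint: it parses the four arguments listed in the component's `args` and writes a small metrics file to the output path. The metric field names and the `accuracy` value passed in are hypothetical stand-ins for a real evaluation.

```python
import argparse
import json


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the same flags that the component YAML passes to the container."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--test_data_path", required=True)
    parser.add_argument("--threshold_accuracy", type=float, default=0.85)
    parser.add_argument("--output_path", required=True)
    return parser.parse_args(argv)


def write_metrics(args: argparse.Namespace, accuracy: float) -> dict:
    """Write a metrics file; `accuracy` stands in for a real evaluation run."""
    metrics = {
        "accuracy": accuracy,
        "threshold_accuracy": args.threshold_accuracy,
        "meets_threshold": accuracy >= args.threshold_accuracy,
    }
    with open(args.output_path, "w") as f:
        json.dump(metrics, f)
    return metrics
```

In this sketch the script only records whether the threshold was met. Whether a run then stops or continues depends on how the surrounding pipeline uses that output, which the exhibit does not specify.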

Question 36 (hard, multiple choice)

You are a data engineer at a financial services company. You have deployed a credit risk model on Vertex AI Endpoints using a custom container with a TensorFlow SavedModel. The model expects input features as a JSON object. Recently, the model has been returning high prediction latency and occasional 503 errors. You have enabled autoscaling with minNodes=2 and maxNodes=10. The model is CPU-only and uses n1-standard-4 machines. Monitoring shows that during peak hours, CPU utilization reaches 90% and memory is at 80%. The number of prediction requests per second peaks at 100. You suspect that the model is not scaling fast enough. Which action will most effectively reduce latency and eliminate 503 errors?
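
For intuition about the scaling numbers in this scenario, here is a generic proportional scaling rule in Python. It is not Vertex AI's documented autoscaling algorithm, and the 60% CPU target is an assumption made for the example; only the replica bounds (2 and 10) and the 90% peak utilization come from the question.

```python
import math


def desired_replicas(
    current: int,
    cpu_percent: int,
    target_percent: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Proportional rule: scale replicas so average CPU returns to the target."""
    wanted = math.ceil(current * cpu_percent / target_percent)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 2 replicas at 90% CPU with an assumed 60% target.
print(desired_replicas(2, 90, 60, 2, 10))  # 3
```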

Question 37 (medium, multiple choice)

Your company uses Vertex AI Pipelines to automate model retraining. The pipeline has three steps: data extraction from BigQuery, feature engineering using Dataflow, and model training using a custom container on Vertex AI Training. Recently, the pipeline has been failing intermittently at the Dataflow step with a 'The job encountered a transient error. Please retry.' message. You have enabled pipeline retries with 3 attempts. However, the pipeline still fails after 3 retries. You check the logs and find that the Dataflow job requires more resources than the default worker configuration provides. Which change should you make to reduce the failure rate?

Question 38 (medium, multiple choice)

A financial services company deploys a regression model to predict loan default risk. The model is served using Vertex AI Endpoints with autoscaling. After deployment, latency increases significantly during peak hours, causing timeouts. The model uses scikit-learn and has a large feature set. Which action should the team take to reduce latency while maintaining prediction accuracy?

Question 39 (hard, multiple choice)

A data science team deploys a TensorFlow image classification model to Vertex AI Prediction. The model performs well in offline evaluation but shows a 15% drop in accuracy in production. The production data distribution has shifted compared to the training data. The team needs to continuously monitor and retrain the model. Which solution is most appropriate for detecting drift and triggering retraining?

Question 40 (easy, multi select)

A data engineering team is operationalizing a machine learning model for real-time fraud detection. The model must process transactions with sub-100ms latency and be highly available. Which TWO strategies should the team implement?

Question 41 (medium, multiple choice)

What is the most likely cause of this error?

Exhibit

Refer to the exhibit.

```yaml
# config.yaml for Vertex AI Batch Prediction
project: my-project
modelId: '123456789'
instancesFormat: jsonl
predictionsFormat: jsonl
outputBigQueryTable: my-dataset.predictions
machineType: n1-standard-4
batchSize: 64
maxWorkerCount: 10
```

The batch prediction job completed with the following error: "INVALID_ARGUMENT: The model expects input of shape (-1, 224, 224, 3) but received input of shape (1, 224, 224, 3)."

Question 42 (hard, multiple choice)

A healthcare startup is deploying a natural language processing (NLP) model for extracting medical entities from clinical notes. The model is a fine-tuned BERT model served on Vertex AI Prediction using a custom container. The team observes that prediction latency is around 500ms per request, but they need to handle up to 100 requests per second (QPS) with end-to-end latency under 200ms. The model currently runs on n1-standard-4 machines (4 vCPU, 15 GB memory). During load testing, CPU utilization reaches 90% and memory usage is 12 GB. The team is considering options to meet the requirements. Which action should they take?
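
The load figures in this scenario can be related through Little's law (average concurrency = arrival rate × time in system). The short Python sketch below works through that arithmetic for the current 500 ms latency and the 200 ms target at 100 QPS; the function name is hypothetical, and the result says nothing about which hardware change achieves the target.

```python
def in_flight_requests(qps: float, latency_ms: float) -> float:
    """Little's law: average number of requests being served at any moment."""
    return qps * latency_ms / 1000


# At the current 500 ms per request, 100 QPS keeps about 50 requests in flight.
print(in_flight_requests(100, 500))  # 50.0
# At the 200 ms target, the same load keeps about 20 requests in flight.
print(in_flight_requests(100, 200))  # 20.0
```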

Question 43 (medium, drag order)

Drag and drop the steps to deploy a Cloud Dataflow pipeline from a template into the correct order.

Question 44 (medium, drag order)

Drag and drop the steps to create a Cloud Bigtable instance and table using the CLI into the correct order.

Question 45 (medium, matching)

Match each Google Cloud IAM role to its description.

Matches

Read access to BigQuery datasets and tables

Permission to run BigQuery jobs

Read access to Cloud Storage objects

Permissions for Dataflow worker nodes

Question 46 (medium, matching)

Match each BigQuery feature to its description.

Matches

Sorting data within partitions to improve query performance

Dividing tables into segments based on a date/timestamp column

Unit of computational capacity in BigQuery

Pre-computed query results for faster access

Question 47 (easy, multiple choice)

A data scientist has trained an XGBoost model on Vertex AI and wants to deploy it to an endpoint with automatic scaling based on traffic. What is the recommended deployment approach?

Question 48 (medium, multiple choice)

A retail company is using a machine learning model for inventory forecasting. They observe that the model's predictions become less accurate over time, especially during holiday seasons. Which monitoring metric should they prioritize?

Question 49 (hard, multiple choice)

A financial institution needs to deploy a TensorFlow model for fraud detection with strict latency requirements (<100ms). The model uses custom ops that are not available in standard TF Serving. What is the most appropriate serving solution?

Question 50 (easy, multiple choice)

A team is using Kubeflow Pipelines on Google Kubernetes Engine to orchestrate ML workflows. They need to track parameters, metrics, and artifacts for each run. Which tool should they integrate?

Question 51 (medium, multiple choice)

A company has a trained model stored in Vertex AI Model Registry. They want to automate retraining when new training data arrives in Cloud Storage. Which approach is most efficient?

Question 52 (hard, multiple choice)

An e-commerce company deploys a recommendation model on Vertex AI Endpoints. The endpoint receives a high volume of requests with a large payload. They notice high latency and occasional timeouts. Which action should they take to improve performance without sacrificing accuracy?

Question 53 (easy, multiple choice)

A startup is deploying a PyTorch model on Google Cloud. They need to serve predictions for a mobile app with bursty traffic. Which service is most cost-effective?

Question 54 (medium, multiple choice)

A data scientist is using Vertex AI to train a model and wants to ensure that the training code and environment are reproducible. Which approach should they take?

Question 55 (hard, multiple choice)

A healthcare organization is deploying a model that processes protected health information (PHI). They need to ensure that the inference data is encrypted in transit and at rest, and access is audited. Which combination of services meets these requirements?

Question 56 (medium, multi select)

A team is debugging a sudden increase in prediction latency for a model deployed on Vertex AI Endpoints. Which TWO metrics in Cloud Monitoring should they examine first? (Choose two.)

Question 57 (hard, multi select)

A company is migrating ML workflows to Vertex AI Pipelines. They want to ensure best practices for pipeline reproducibility and debugging. Which THREE actions should they take? (Choose three.)

Question 58 (easy, multi select)

A data engineering team is operationalizing a machine learning model for real-time inference. They need to monitor the model's performance in production. Which THREE types of monitoring should they implement? (Choose three.)

Question 59 (medium, multiple choice)

Refer to the exhibit. An ML engineer sees this error when invoking a Vertex AI endpoint. What is the most likely cause?

Exhibit

Refer to the exhibit.
```json
{
  "error": {
    "code": 400,
    "message": "Prediction failed: Exception during run: Input tensor shape mismatch. Expected: [1, 128, 128, 3]. Got: [1, 256, 256, 3] in model 'resnet50'.",
    "status": "INVALID_ARGUMENT"
  }
}
```
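
The mismatch reported in the exhibit can be reproduced with a small client-side check that compares the shape a model expects with the shape of the request payload, axis by axis. This is a hypothetical helper in plain Python, not part of any Vertex AI SDK.

```python
def check_shape(expected: list[int], got: list[int]) -> list[str]:
    """Return a description of each axis where the two shapes differ."""
    if len(expected) != len(got):
        return [f"rank mismatch: expected {len(expected)} dims, got {len(got)}"]
    return [
        f"axis {i}: expected {e}, got {g}"
        for i, (e, g) in enumerate(zip(expected, got))
        if e != g
    ]


# Shapes taken from the error message in the exhibit.
print(check_shape([1, 128, 128, 3], [1, 256, 256, 3]))
# ['axis 1: expected 128, got 256', 'axis 2: expected 128, got 256']
```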

Question 60 (hard, multiple choice)

Refer to the exhibit. A data scientist notices that the evaluation component rarely passes the threshold, causing the pipeline to fail often. What should they do to improve efficiency?

Question 61 (easy, multiple choice)

Refer to the exhibit. A Cloud Build step fails when pushing a Docker image to Artifact Registry. What is the missing IAM role for the Cloud Build service account?

Exhibit

Refer to the exhibit.
```json
{
  "error": "denied: permission denied for us-central1-docker.pkg.dev/my-project/my-repo/my-model:latest"
}
```

Question 62 (medium, multiple choice)

You have deployed a classification model on Vertex AI Endpoints. The model's training data had a balanced class distribution, but over time, the production data has shifted such that one class appears 90% of the time. The model's overall accuracy remains high, but the recall for the minority class has dropped significantly. What is the best approach to detect and address this issue?

Question 63 (easy, multiple choice)

A data science team has trained a TensorFlow model for image classification and wants to deploy it to production with minimal latency. They have already exported the model as a SavedModel directory. Which service should they use to create an online prediction endpoint?

Question 64 (hard, multiple choice)

Your Vertex AI model deployed on an endpoint is experiencing high tail latency during online predictions. The model uses a large embedding layer, and the input size varies. You have enabled automatic scaling with a minimum of 2 replicas and maximum of 10. What is the most likely cause of the latency spikes and the best first step to diagnose?

Question 65 (medium, multiple choice)

You run batch predictions using Vertex AI Batch Prediction on a tabular dataset. The job processes 1 million rows and takes 6 hours to complete. You need to reduce the processing time to under 2 hours without increasing cost significantly. What should you do?

Question 66 (easy, multiple choice)

Your team uses Vertex AI Feature Store to serve features for online predictions. A feature value changes frequently (e.g., user session clicks). Which type of feature should you use to ensure low-latency writes and reads?

Question 67 (hard, multiple choice)

You have two versions of a classification model (v1 and v2) deployed on a Vertex AI Endpoint. You want to gradually roll out v2 to 10% of traffic, monitor performance, and if metrics are better, increase traffic to 100%. You have set up model monitoring for skew and drift. Which configuration should you use?

Question 68 (medium, multiple choice)

You need to automate retraining of a model when new training data becomes available every week. The training pipeline runs on Vertex AI Pipelines and is triggered by Cloud Composer. After retraining, you want to evaluate the new model against a golden dataset. If the model's accuracy improves by at least 1%, it should be automatically deployed to the staging endpoint. What is the best way to implement the decision logic?
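
The "improves by at least 1%" rule can be written as a small comparison. The sketch below reads the 1% as an absolute gain of 0.01 in accuracy on the golden dataset, which is an assumption (a relative gain would be computed differently); the function name and sample accuracies are hypothetical, and where this logic runs inside the pipeline is left open.

```python
def should_deploy(
    new_accuracy: float, current_accuracy: float, min_gain: float = 0.01
) -> bool:
    """True when the new model beats the current one by at least min_gain (absolute)."""
    return new_accuracy - current_accuracy >= min_gain


print(should_deploy(0.90, 0.85))   # True: gain of 0.05
print(should_deploy(0.852, 0.85))  # False: gain of only 0.002
```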

Question 69 (easy, multiple choice)

Your team wants to continuously monitor a deployed model's performance in production. They need to detect when the model's predictions become unreliable due to changes in the real world (e.g., new customer behavior). Which Vertex AI service should they use?

Question 70 (hard, multiple choice)

A model deployed on Vertex AI Endpoint is making predictions with high accuracy but the business team suspects bias against a certain demographic group. You need to analyze the model's predictions for fairness. What is the most effective approach?

Question 71 (medium, multi select)

Which TWO best practices should be followed when managing multiple model versions on Vertex AI Endpoints for a production system?

Question 72 (hard, multi select)

Which TWO metrics are most important to monitor for a real-time online prediction system to ensure service reliability and model performance?

Question 73 (medium, multi select)

Which THREE components are typically part of a Vertex AI Pipeline for automated model retraining and deployment?

Question 74 (hard, multiple choice)

You run `gcloud ai models describe` and get the error above. The model was created successfully from a training job that completed without errors. The model ID is correct. What is the most likely cause?

Exhibit

Refer to the exhibit.

```
$ gcloud ai models describe --region=us-central1 123456789
Error: (gcloud.ai.models.describe) INVALID_ARGUMENT: The specified model ID '123456789' does not exist in project 'my-project' in region 'us-central1'. Model must be created before describing.
```

The model was created using the API and the response indicated success.

Question 75 (medium, multiple choice)

A user named Charlie needs to deploy a model to a Vertex AI Endpoint and also create training jobs. Which role should be assigned to Charlie?

Exhibit

Refer to the exhibit.

```json
{
  "policy": {
    "bindings": [
      {
        "role": "roles/aiplatform.user",
        "members": [
          "user:alice@example.com"
        ]
      },
      {
        "role": "roles/aiplatform.modelUser",
        "members": [
          "user:bob@example.com"
        ]
      }
    ]
  }
}
```

This IAM policy is applied at the project level. Alice can create models but cannot get predictions from existing models. Bob can only get predictions but cannot create new models.

Question 76 (easy, multiple choice)

A user gets the above error when trying to get online predictions. The model was created and the endpoint exists. What is the most likely reason?

Exhibit

Refer to the exhibit.

Error log from Cloud Logging:

```json
{
  "textPayload": "Prediction failed: Model 'projects/my-project/locations/us-central1/models/12345' is not deployed to endpoint 'projects/my-project/locations/us-central1/endpoints/67890'. Ensure the model is deployed to the endpoint before making predictions.",
  "timestamp": "2024-03-15T10:30:00Z",
  "resource": {
    "type": "aiplatform.googleapis.com/Endpoint"
  }
}
```

Question 77 (medium, multiple choice)

A company has deployed a machine learning model to AI Platform Prediction. The model uses a custom container with a TensorFlow SavedModel. After deployment, the prediction latency is higher than expected. Which action is most likely to reduce latency without significantly impacting model accuracy?

Question 78 (easy, multiple choice)

A data scientist wants to automate retraining of a classification model when new labeled data arrives. The model is deployed on AI Platform Prediction. Which Google Cloud service should be used to orchestrate the retraining pipeline?

Question 79 (hard, multiple choice)

A company runs a real-time fraud detection model using Cloud Dataflow for streaming inference. The model is updated every hour with new training data. The team wants to minimize downtime and ensure that both old and new model versions are available during the update. Which deployment strategy should they use?

Question 80mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A company uses BigQuery ML to create a classification model. The model is used for batch prediction on a weekly basis. After six months, the data distribution shifts, and model accuracy drops. Which approach should the company take to maintain model performance?

Question 81easymultiple choice

Read the full Operationalizing machine learning models explanation →

A team has trained a scikit-learn model and wants to deploy it to AI Platform Prediction for online predictions. What is the required format for the model artifact?

Question 82hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A real-time recommendation system uses a custom container deployed on AI Platform Prediction. The model requires a large in-memory embedding lookup table that is loaded from Cloud Storage at startup. The current startup time is over 5 minutes, causing prediction requests to time out. Which strategy would most effectively reduce startup time?

Question 83mediummulti select

Read the full Operationalizing machine learning models explanation →

A data engineering team is building a CI/CD pipeline for machine learning models using Cloud Build and AI Platform. Which TWO practices are essential for ensuring reproducible and safe model deployments?

Question 84hardmulti select

Read the full Operationalizing machine learning models explanation →

A company trains a model using Cloud TPUs. The model is deployed to AI Platform Prediction using a custom container with TensorFlow. Which THREE considerations are most important when serving this model?

Question 85easymulti select

Read the full Operationalizing machine learning models explanation →

A team is deploying a TensorFlow model for online predictions on AI Platform Prediction. They want to monitor for data drift and model performance degradation. Which TWO Google Cloud services should they use?

Question 86hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company has a batch prediction job that runs daily using AI Platform Batch Prediction. The job uses a TensorFlow model and processes 10 GB of data. Recently, the job started failing with the error 'The replica worker 0 exited with a non-zero exit code: Out of memory'. Which action should the team take to resolve this without rewriting the model?

Question 87easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data science team needs to ensure that a deployed Vertex AI model can handle varying traffic patterns with minimal latency and cost. What should they do?

Question 88mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A team trained a model on a Vertex AI custom training job and wants to deploy it to an endpoint for online predictions. They have the model artifacts stored in Cloud Storage. What steps are required?

Question 89hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company uses Vertex AI Pipelines to orchestrate ML workflows. They want to automatically retrain the model when new data arrives, but only if the model's performance drops below a threshold. Which approach is best?

Question 90easymultiple choice

Read the full Operationalizing machine learning models explanation →

A team deployed a model to Vertex AI Endpoint and notices latency spikes during peak hours. What should they first investigate?

Question 91mediummultiple choice

Read the full Operationalizing machine learning models explanation →

An MLOps team wants to implement continuous deployment of ML models using Cloud Build and Vertex AI. They have a GitHub repository with training code. What should they use?

Question 92hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company has a model that requires GPU for inference and has strict latency requirements. They deployed on Vertex AI Endpoint with autoscaling but observe cold start latency when scaling up. What is the best solution?

Question 93easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data scientist wants to test a new model version on a small percentage of traffic before full rollout. Which Vertex AI feature allows this?

Question 94mediummultiple choice

Read the full Operationalizing machine learning models explanation →

After deploying a model, the team notices that the distribution of the serving data differs significantly from the training data distribution. What should they do?

Question 95hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company uses Vertex AI Feature Store for serving features. They have a high-throughput online serving requirement. Which configuration should they use?

Question 96mediummulti select

Read the full Operationalizing machine learning models explanation →

A company deploys an ML model using Vertex AI Pipelines. They want to ensure reproducibility and traceability. Which TWO practices should they implement?

Question 97mediummulti select

Read the full Operationalizing machine learning models explanation →

A team needs to optimize online prediction cost for a model that has unpredictable traffic spikes. Which TWO strategies are most effective?

Question 98hardmulti select

Read the full Operationalizing machine learning models explanation →

A company wants to implement a robust MLOps lifecycle on Google Cloud. Which THREE components are essential?

Question 99mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. What is the most likely cause of the error?

Exhibit

Error: Vertex AI.Exception: 400 Failed to deploy model to endpoint projects/.../endpoints/1234. Details: The resource 'projects/.../models/5678' is missing an artifact URI. Please upload the model artifact to Cloud Storage and create a new model version.

Question 100hardmultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. The feature store 'my_fs' responds to offline queries but online serving requests fail. What is the most likely cause?

Exhibit

gcloud ai featurestores describe projects/.../locations/us-central1/featurestores/my_fs
Output:
online_serving_config:
  fixed_node_count: 0
  scaling:
    min_node_count: 1
    max_node_count: 10
    cpu_utilization_target: 80
state: STABLE

Question 101easymultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. What is the most likely cause?

Exhibit

$ gcloud ai endpoints predict 1234 --json-request=request.json
Error: (gcloud.ai.endpoints.predict) PREDICTION FAILED: HTTP 400: Model error: The model type is not supported for this prediction method.

Question 102easymultiple choice

Read the full Operationalizing machine learning models explanation →

A startup is deploying a machine learning model for real-time fraud detection. They need low latency and automatic scaling during peak hours. Which Google Cloud service should they use?

Question 103easymultiple choice

Read the full Operationalizing machine learning models explanation →

A team has trained a model using AutoML Tables. They want to deploy it for batch predictions on a schedule. What is the simplest approach?

Question 104easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data engineer needs to monitor model performance over time for drift detection. What tool is specifically designed for this?

Question 105mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A machine learning pipeline uses Vertex AI Pipelines. One component fails intermittently due to resource constraints. What is the best way to handle this?

Question 106mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A company uses a custom container image for model serving. The image is large (10 GB). During deployment, they get timeouts. What should they do?

Question 107mediummultiple choice

Read the full Operationalizing machine learning models explanation →

After deploying a model to Vertex AI Endpoints, the prediction responses include unexpected data. The model returns logits instead of probabilities. What is the most likely cause?

Question 108hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A financial services company must ensure that predictions from a deployed model do not become biased against protected groups. They have a monitoring system in place. Which metric should they track?

Question 109hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A team uses Vertex AI Feature Store for real-time features. They notice that features are frequently missing during prediction serving. What is the best practice to handle missing features?

Question 110hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A data scientist developed a model using custom training on Vertex AI. They want to automate the entire training-to-deployment process. Which service should they use?

Question 111easymulti select

Read the full Operationalizing machine learning models explanation →

A data engineer is setting up CI/CD for a machine learning model using Cloud Build and Vertex AI. Which two components are essential? (Select 2)

Question 112mediummulti select

Read the full Operationalizing machine learning models explanation →

A company wants to implement model monitoring for a deployed classification model. Which three types of monitoring should they set up? (Select 3)

Question 113hardmulti select

Read the full Operationalizing machine learning models explanation →

A team is deploying a complex model with multiple preprocessing steps. They want to ensure consistent preprocessing during training and serving. Which three approaches can achieve this? (Select 3)

Question 114easymultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. An auditor sees the following output from `gcloud ai models list`. What can they conclude about versioning?

Exhibit

MODEL_ID: my_model
VERSION_ID: v1
DISPLAY_NAME: my_model_v1
STATE: READY
VERSION_UPDATE_TIME: 2023-01-10T12:00:00

MODEL_ID: my_model
VERSION_ID: v2
DISPLAY_NAME: my_model_v2
STATE: READY
VERSION_UPDATE_TIME: 2023-01-15T12:00:00

Question 115mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. A developer sees this log entry when trying to get a prediction. What is the most likely cause?

Exhibit

{
 "severity": "ERROR",
 "message": "Prediction failed: Model 'projects/my-project/models/12345/versions/v1' not found.",
 "timestamp": "2024-01-20T10:00:00Z",
 "request": "POST /v1/projects/my-project/models/12345:predict"
}

Question 116hardmultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. A data engineer sees these metrics from Cloud Monitoring for a deployed Vertex AI Endpoint. What is the most effective action to reduce latency?

Exhibit

Metric: CPU Utilization (Model Endpoint)
Current: 90%
Threshold: 80%
Trend: Increasing

Metric: Prediction Latency (p95)
Current: 1500ms
Threshold: 1000ms
Trend: Increasing
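To read the exhibit quantitatively, the two readings can be compared against their thresholds. This is a small sketch in plain Python; the `MetricReading` class and its field names are assumptions made for illustration and do not correspond to a Cloud Monitoring API.

```python
from dataclasses import dataclass


@dataclass
class MetricReading:
    """Hypothetical container for one metric from the exhibit."""
    name: str
    current: float
    threshold: float

    @property
    def breached(self) -> bool:
        # A reading is in breach when it is strictly above its threshold.
        return self.current > self.threshold

    @property
    def overshoot_pct(self) -> float:
        # How far above the threshold the reading is, as a percentage of the threshold.
        return (self.current - self.threshold) / self.threshold * 100


# Values taken from the exhibit.
readings = [
    MetricReading("CPU Utilization (%)", 90, 80),
    MetricReading("Prediction Latency p95 (ms)", 1500, 1000),
]

breached = [r.name for r in readings if r.breached]
for r in readings:
    print(f"{r.name}: breached={r.breached}, overshoot={r.overshoot_pct:.1f}%")
```

Both metrics are above their thresholds and both are trending upward, so the exhibit shows compute saturation and latency degrading together.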

Question 117easymultiple choice

Read the full Operationalizing machine learning models explanation →

A company deploys a machine learning model on Vertex AI for online predictions. The model experiences intermittent spikes in traffic, causing latency increases. Which strategy should the company use to ensure consistent low latency during traffic spikes?

Question 118mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A data engineer deploys a TensorFlow model on Vertex AI using a custom container. After deployment, online prediction requests sometimes fail with a 500 error and the message 'Out of memory'. The model requires significant memory during inference. Which action should the engineer take to resolve this issue?

Question 119hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company is building a continuous training pipeline that retrains a model daily using new data from a feature store. The training data must include features computed up to the timestamp of each training run. Which architecture should be used to ensure time-consistent feature values without label leakage?
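The phrase "time-consistent feature values" refers to reading each feature as it was known at a given timestamp, never later. The sketch below illustrates that idea with an in-memory list; the function `value_as_of` and the sample history are hypothetical and are not a Feature Store API.

```python
from bisect import bisect_right


def value_as_of(history, as_of):
    """Return the latest value whose timestamp is <= as_of, or None.

    history: list of (timestamp, value) tuples sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of)  # count of entries with ts <= as_of
    return history[idx - 1][1] if idx else None


# Hypothetical feature history for one entity: (timestamp, value).
history = [(1, 10.0), (5, 12.0), (9, 20.0)]

print(value_as_of(history, 6))  # value written at t=5; the t=9 value is not visible yet
print(value_as_of(history, 0))  # nothing was written before t=1
```

Values written after the lookup timestamp are never returned, which is what keeps future information out of a training example.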

Question 120easymultiple choice

Read the full Operationalizing machine learning models explanation →

A company has deployed a classification model on Vertex AI. They want to detect data drift in real-time for the model's input features. Which service should they use?

Question 121mediummultiple choice

Review the full routing breakdown →

A machine learning team wants to deploy a new model version for canary testing, where only 5% of traffic is routed to the new version. Which Vertex AI endpoint configuration supports this?

Question 122hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company needs to serve predictions for a model that runs an expensive computation on each request. The model is used by a batch job that processes millions of records each night, and also by a real-time API for a few thousand queries per hour. Which prediction strategy minimizes cost and latency for both use cases?

Question 123easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data scientist has iterated on a model and produced a new version. The organization requires the ability to roll back to the previous version quickly if the new version performs poorly in production. Which approach should be used?

Question 124mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A Cloud Build pipeline is set up to train a model on Vertex AI. The build fails with the error: 'ERROR: (gcloud.ai-platform.jobs.submit.training) NOT_FOUND: The parent project does not exist.' The project ID and the service account are correctly configured. What is the most likely cause?

Question 125hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company serves multiple models using Vertex AI endpoints. Each model has different latency and memory requirements. To minimize cost, the company wants to share underlying compute resources among models. Which approach should they use?

Question 126easymulti select

Read the full Operationalizing machine learning models explanation →

Which TWO are benefits of using Vertex AI Endpoints for model serving?

Question 127mediummulti select

Read the full Operationalizing machine learning models explanation →

Which THREE steps are required to set up a continuous training pipeline on Google Cloud using Vertex AI?

Question 128hardmulti select

Read the full Operationalizing machine learning models explanation →

Which TWO are common causes of prediction bias in a deployed machine learning model in production?

Question 129easymultiple choice

Read the full Operationalizing machine learning models explanation →

A company needs to deploy a trained model for real-time predictions with low latency. Which Vertex AI resource should they use?

Question 130easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data engineer wants to automatically detect when the distribution of input features to a production model has shifted significantly. Which Vertex AI feature should they enable?

Question 131easymultiple choice

Read the full Operationalizing machine learning models explanation →

A team has multiple versions of a model and wants to manage them centrally, including tracking metadata and promoting versions to production. Which tool should they use?

Question 132mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A production model deployed on Vertex AI Endpoint is experiencing high latency during traffic spikes. The current configuration uses a single replica. What is the most efficient solution?

Question 133mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A company wants to automate model retraining and deployment whenever new training data becomes available. Which service should be used to orchestrate the end-to-end workflow?

Question 134mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A data scientist needs to provide explanations for each prediction made by a deployed AutoML model to comply with regulatory requirements. Which Vertex AI feature should they use?

Question 135hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company runs large batch prediction jobs on Vertex AI every day. They want to minimize costs while ensuring the jobs complete within a 4-hour window. The model requires significant memory. What is the most cost-effective approach?

Question 136hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A team is implementing CI/CD for their ML models using Google Cloud. They want to automatically retrain and deploy a new model version when new training data arrives in Cloud Storage. Which combination of services should they use?

Question 137hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A team is training a large model using a custom container with TensorFlow on Vertex AI Training. They need to use multiple GPUs across several machines. Which strategy should they implement to maximize training throughput?

Question 138mediummulti select

Read the full Operationalizing machine learning models explanation →

Which TWO actions should you take to ensure model reliability in a production Vertex AI Endpoint?

Question 139mediummulti select

Read the full Operationalizing machine learning models explanation →

Which THREE Google Cloud services are typically used together in a production ML pipeline?

Question 140hardmulti select

Read the full Operationalizing machine learning models explanation →

Which TWO strategies help reduce prediction latency for a real-time model deployed on Vertex AI Endpoint?

Question 141mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. What is the cause of this error?

Exhibit

Network Topology (diagram not reproduced)

Question 142mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. This log entry was generated by Vertex AI Model Monitoring for a production model. What should the data engineer do to address this issue?

Exhibit

{
  "resource": {"type": "ai_platform_endpoint", "labels": {"endpoint_id": "123"}},
  "severity": "ERROR",
  "jsonPayload": {
    "feature_name": "age",
    "monitoring_type": "prediction_drift",
    "drift_score": 0.85,
    "threshold": 0.7
  }
}
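The entry in the exhibit can be checked mechanically by comparing the reported score with its threshold. This is a minimal standard-library sketch; the function name `drift_alert` and the shape of its return value are assumptions made for illustration.

```python
import json

# Log entry copied verbatim from the exhibit.
LOG_ENTRY = """{
  "resource": {"type": "ai_platform_endpoint", "labels": {"endpoint_id": "123"}},
  "severity": "ERROR",
  "jsonPayload": {
    "feature_name": "age",
    "monitoring_type": "prediction_drift",
    "drift_score": 0.85,
    "threshold": 0.7
  }
}"""


def drift_alert(raw_entry: str) -> dict:
    """Hypothetical helper: summarize whether the reported drift exceeds its threshold."""
    entry = json.loads(raw_entry)
    payload = entry["jsonPayload"]
    return {
        "endpoint_id": entry["resource"]["labels"]["endpoint_id"],
        "feature": payload["feature_name"],
        "type": payload["monitoring_type"],
        "exceeded": payload["drift_score"] > payload["threshold"],
    }


alert = drift_alert(LOG_ENTRY)
print(alert)
```

Here the score for the `age` feature (0.85) is above the configured threshold (0.7), which is why the entry is logged with severity ERROR.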

Question 143hardmultiple choice

Read the full Operationalizing machine learning models explanation →

Refer to the exhibit. A team is trying to run a custom prediction container on Vertex AI Endpoint. They get this error when the container starts. What is the most likely cause?

Exhibit

Log: "Container failed with error: exec format error. Ensure the container has an entry point."

Question 144easymultiple choice

Read the full Operationalizing machine learning models explanation →

Your company has a machine learning model that predicts customer churn. The model is deployed on Vertex AI Endpoints with autoscaling. After a marketing campaign, traffic to the endpoint increases by 10x. Some predictions start failing with 'HTTP 503 Service Unavailable' errors. What is the most likely cause?

Question 145easymultiple choice

Read the full Operationalizing machine learning models explanation →

You are deploying a machine learning model to production using Vertex AI. The model requires GPU acceleration for low-latency predictions. You need to minimize costs while ensuring availability during a defined business hours window (8 AM to 6 PM). Which deployment strategy should you use?

Question 146easymultiple choice

Read the full Operationalizing machine learning models explanation →

You are responsible for monitoring a production ML model on Vertex AI. The model predicts loan approval probability. The business team reports that the model's predictions have become less accurate over the last week. You check the model's monitoring dashboard and see that the prediction distribution has changed significantly. What is the most likely issue?

Question 147mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Your team uses a CI/CD pipeline with Cloud Build to train and deploy ML models on Vertex AI. You want to ensure that only models that pass validation checks (e.g., accuracy threshold, fairness metrics) are promoted to production. What is the best way to implement this?

Question 148mediummultiple choice

Read the full Operationalizing machine learning models explanation →

You deployed a model on Vertex AI Endpoints using a custom container. The model serves predictions but the latency is higher than expected. You suspect the container is not making full use of the CPU resources. What should you do to reduce latency?

Question 149mediummultiple choice

Read the full Operationalizing machine learning models explanation →

Your organization uses Vertex AI Feature Store to serve features for a real-time fraud detection model. The model is deployed on a Vertex AI endpoint. After a data pipeline update, the model's online predictions became inconsistent. What is the most likely cause?

Question 150hardmultiple choice

Review the full routing breakdown →

You manage a team that deploys multiple versions of a computer vision model for A/B testing on Vertex AI Endpoints. You need to route a small percentage of traffic to a canary version while the rest goes to the stable version. You also need to gradually increase the canary traffic over time based on performance metrics. Which approach should you take?
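A gradual canary rollout of the kind described in this scenario can be thought of as a sequence of traffic splits whose percentages always sum to 100. The sketch below only computes that sequence in plain Python; the keys `"canary"` and `"stable"` are hypothetical labels standing in for deployed model IDs, and no Vertex AI SDK call is made.

```python
def canary_schedule(start_pct: int, step_pct: int, max_pct: int = 100) -> list:
    """Return successive traffic splits that ramp the canary from start_pct to max_pct."""
    if not (0 < start_pct <= max_pct <= 100) or step_pct <= 0:
        raise ValueError("percentages must satisfy 0 < start <= max <= 100 and step > 0")
    splits = []
    pct = start_pct
    while True:
        # Each split sums to 100 by construction.
        splits.append({"canary": pct, "stable": 100 - pct})
        if pct >= max_pct:
            break
        pct = min(pct + step_pct, max_pct)
    return splits


schedule = canary_schedule(start_pct=5, step_pct=20, max_pct=50)
for split in schedule:
    print(split)
```

In practice each step would be applied only after the canary's performance metrics have been reviewed, so the schedule is a plan rather than a timer.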

Question 151hardmultiple choice

Read the full Operationalizing machine learning models explanation →

Your company uses Vertex AI Pipelines to automate the ML lifecycle. The pipeline includes training, evaluation, and deployment steps. You want to ensure that if a pipeline run fails due to a transient error (e.g., resource quota shortage), it automatically retries before marking the run as failed. What is the best way to implement this?

Question 152hardmultiple choice

Read the full Operationalizing machine learning models explanation →

You are designing a system to serve predictions from a large language model (LLM) with a latency SLO of 500ms. The model does not fit on a single GPU and requires model parallelism. You are considering using Vertex AI Endpoints with a custom container. What additional setup is required to achieve the latency target?

Question 153mediummulti select

Read the full Operationalizing machine learning models explanation →

Which TWO configurations are required to enable online prediction for a model deployed on Vertex AI Endpoints?

Question 154mediummulti select

Read the full Operationalizing machine learning models explanation →

Which THREE metrics should be monitored to detect model drift in a production ML system?

Question 155hardmulti select

Read the full Operationalizing machine learning models explanation →

Which THREE steps are essential for implementing a continuous training pipeline with Vertex AI?

Question 156hardmultiple choice

Read the full Operationalizing machine learning models explanation →

Your company runs a real-time recommendation system for a popular e-commerce website using a machine learning model deployed on Vertex AI Endpoints. The model takes user features and product catalog data as input and returns top-10 product recommendations. The system uses a feature store to serve user embeddings and product embeddings. Recently, the recommender team retrained the model with a new algorithm and deployed it as a new version. Since the deployment, the latency for recommendation requests has increased from 100ms to 500ms on average, exceeding the 200ms SLO. The model accuracy is acceptable, and there are no errors. The endpoint uses an n1-standard-8 machine with a single GPU. The new model is larger but still fits on the GPU. You investigate and find that the GPU utilization remains low (<20%), but CPU utilization is high (90%). What should you do to reduce latency while maintaining accuracy?

Question 157hardmultiple choice

Read the full Operationalizing machine learning models explanation →

You are a data engineer at a financial services company that uses Vertex AI to train and deploy models for credit risk assessment. The company has strict governance requirements: every model version must be approved by the risk committee before going to production. The approval process can take several days. Currently, the team trains a new model weekly and manually deploys it to a staging endpoint for review, then manually promotes to production after approval. This process is error-prone and slow. You want to automate the pipeline: training should trigger automatically when new data arrives, the model should be automatically deployed to a staging endpoint for review, and after manual approval, it should be promoted to production. Additionally, you need to ensure that if a model in staging performs poorly (e.g., low accuracy), it should not be promoted even if approved. What should you do?

Question 158easymultiple choice

Read the full Operationalizing machine learning models explanation →

A company deploys a scikit-learn model on Vertex AI for online predictions. The model is packaged in a custom container with all dependencies. Users report high latency (over 5 seconds) for predictions. The model size is 2 GB. What is the most likely cause of the high latency?

Question 159easymultiple choice

Read the full Operationalizing machine learning models explanation →

A data scientist trains a TensorFlow model using Vertex AI Training and wants to deploy it for online prediction. Which Vertex AI resource should the data scientist use to create an endpoint for serving predictions?

Question 160mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A company has a production model deployed on Vertex AI that shows declining accuracy over time. The model uses features from a BigQuery feature store. The data science team suspects data drift. What is the most efficient way to monitor and detect drift for this model?

Question 161mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A team uses Vertex AI Pipelines to automate retraining of a model every month. The pipeline includes data preprocessing, training, and deployment steps. After a recent update, the pipeline fails intermittently with a timeout error during the deployment step. What is the most likely cause?

Question 162hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A financial services company deploys a fraud detection model on Vertex AI using a custom prediction container that runs a PyTorch model. The model requires GPU acceleration. The deployment succeeds but predictions return an error: 'CUDA error: out of memory'. What should the team do to resolve this issue?

Question 163hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A company uses Vertex AI Feature Store for serving features to both training and prediction. The team notices that predictions made shortly after training use different feature values, causing a training-serving skew. What is the most effective way to prevent this skew?

Question 164easymultiple choice

Read the full Operationalizing machine learning models explanation →

A company wants to version its ML models and track lineage from training data to deployed model. Which Google Cloud service should they use?

Question 165mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A data science team wants to deploy a model that requires a custom container with a specific NVIDIA CUDA version. They build the image and push it to Artifact Registry. When deploying to Vertex AI, the model fails to load with an error: 'Failed to start container: invalid ELF header'. What is the most likely cause?

Question 166easymulti select

Read the full Operationalizing machine learning models explanation →

A company is designing a CI/CD pipeline for their ML models using Cloud Build and Vertex AI. Which TWO practices should they adopt to ensure reliable and reproducible deployments?

Question 167mediummulti select

Read the full Operationalizing machine learning models explanation →

A team monitors a deployed Vertex AI model and notices an increasing number of prediction errors with status code 413 (Request Entity Too Large). Which TWO actions should they consider to resolve this issue?

Question 168hardmulti select

Read the full Operationalizing machine learning models explanation →

During a Vertex AI training pipeline, the training job fails with an error: 'Out of memory: Killed process'. The model is a large deep learning model using TensorFlow. Which THREE steps should the team take to resolve this issue?

Question 169easymultiple choice

Read the full Operationalizing machine learning models explanation →

Your company deploys a classification model on Vertex AI for online predictions. The model is an XGBoost model trained on tabular data with 500 features. The endpoint uses a single n1-standard-4 node. After deployment, users report that predictions take 8-10 seconds on average, while the required SLA is under 2 seconds. You have already verified that the model is not large (under 100 MB) and the input data size is small. The endpoint does not scale automatically. Which action should you take to reduce latency to meet the SLA?

A) Change the machine type to n1-highcpu-4 to prioritize compute over memory.
B) Enable autoscaling by setting min replicas to 2 and max replicas to 5.
C) Switch to a custom container that preloads the model into memory.
D) Reduce the number of features by half.

Question 170mediummultiple choice

Read the full Operationalizing machine learning models explanation →

A retail company uses Vertex AI Pipelines to automate monthly retraining of a recommendation model. The pipeline consists of three steps: (1) extract data from BigQuery, (2) train a TensorFlow model on Vertex AI Training, (3) upload the model to Vertex AI Model Registry and deploy to an endpoint if performance metrics improve. Recently, the pipeline has been failing at step 2 with the error: 'The job was cancelled by the system because it exceeded the maximum training time of 3600 seconds.' You have confirmed that the training code is correct and the data size has not changed significantly. What should you do to fix this pipeline failure?

A) Reconfigure the pipeline to use a larger machine type for training.
B) Set the training timeout to 7200 seconds in the pipeline configuration.
C) Reduce the training dataset size by sampling fewer rows.
D) Switch from TensorFlow to a simpler model framework.

Question 171hardmultiple choice

Read the full Operationalizing machine learning models explanation →

A healthcare company deploys a model for diagnosing medical images on Vertex AI using a custom container with a TensorFlow model. The model uses a mixture of GPUs (NVIDIA T4) and CPUs. After deployment, you notice that prediction latency is highly variable: sometimes under 100ms, sometimes over 10 seconds. Investigation shows that the variability correlates with the number of concurrent requests. The endpoint has a minimum of 1 replica and a maximum of 3 replicas, with target CPU utilization set to 80%. You also observe that GPU utilization remains low (<20%) even during high load. What is the most likely cause of the latency variability?

A) The model is not fully utilizing GPUs due to inefficient data loading from CPU.
B) The autoscaling metric (CPU utilization) is not appropriate for a GPU-bound workload; the endpoint does not scale based on GPU utilization.
C) The GPU machine type is too small for the model.
D) The container is not configured to use the GPU correctly.

Question 172 (hard, multiple choice)

An e-commerce company uses Vertex AI to serve a real-time personalization model. The model is updated daily via a retraining pipeline that uploads a new version to the same endpoint. Recently, after a model update, the online prediction responses have been returning anomalous results (e.g., recommending irrelevant products). The previous version performed well. The team suspects that the new model is undertrained or has a bug. They have already checked the training code and the pipeline logs, which show no errors. The pipeline deploys the new model version to the endpoint by updating the traffic split to route 100% of traffic to the new version. Which course of action should the team take to quickly mitigate the issue while diagnosing the root cause?

A) Roll back the endpoint to the previous model version by setting traffic split to 0% for the new version.
B) Delete the current endpoint and recreate it with the previous model version.
C) Tweak the training hyperparameters and retrain immediately.
D) Increase the number of replicas on the endpoint to handle load.
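
The traffic split described in this scenario can be pictured as a mapping from deployed model IDs to integer percentages that sum to 100. The following is a minimal sketch in plain Python, not the Vertex AI SDK. The IDs `model_v1` and `model_v2` and the helpers `validate_split` and `route_all_to` are hypothetical, and the sketch assumes the previous version is still deployed on the endpoint.

```python
from typing import Dict


def validate_split(split: Dict[str, int]) -> None:
    """Raise ValueError unless shares are integers in 0..100 that sum to 100."""
    for model_id, share in split.items():
        if not isinstance(share, int) or not 0 <= share <= 100:
            raise ValueError(f"invalid share for {model_id}: {share!r}")
    if sum(split.values()) != 100:
        raise ValueError("shares must sum to 100")


def route_all_to(split: Dict[str, int], target: str) -> Dict[str, int]:
    """Return a new split sending 100% to `target` and 0% to every other model."""
    if target not in split:
        raise KeyError(f"{target} is not deployed on this endpoint")
    new_split = {model_id: 0 for model_id in split}
    new_split[target] = 100
    validate_split(new_split)
    return new_split


# Hypothetical state after the daily update: all traffic goes to the new version.
current = {"model_v1": 0, "model_v2": 100}
rolled_back = route_all_to(current, "model_v1")
print(rolled_back)
```

Because only the percentages change, the new version stays deployed at 0% in this sketch and remains available for inspection.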

Question 173 (medium, multi select)

A data science team has deployed a custom TensorFlow model on Vertex AI Prediction. They notice increasing prediction latency and a growing number of 503 errors during peak traffic hours. The model is served using a single regional endpoint with min replica count of 2 and max replica count of 10. Which TWO actions should the team take to address these issues?

Question 174 (hard, multi select)

An MLOps team manages a pipeline that retrains an XGBoost classifier weekly using BigQuery data. The pipeline is orchestrated with Cloud Composer and deploys the new model to Vertex AI Endpoint if validation metrics (AUC > 0.9) are met. Over the past month, the deployed model's AUC has dropped from 0.95 to 0.88, despite the training pipeline consistently reporting AUC > 0.9. Which THREE steps should the team take to diagnose and fix this issue?
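
The deployment gate in this scenario (deploy only if validation AUC exceeds 0.9) can be sketched in a few lines. The sketch below is only an illustration: `auc` and `passes_gate` are hypothetical helper names, the labels and scores are made up, and a real pipeline would more likely call a library metric. Computing the same metric on labeled serving data is one way to put the offline and online numbers side by side.

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def passes_gate(metric, threshold=0.9):
    """Deployment gate: the metric must be strictly above the threshold."""
    return metric > threshold


# Made-up labels with two sets of scores: validation-time and serving-time.
labels = [1, 1, 1, 0, 0]
validation_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
serving_scores = [0.9, 0.4, 0.35, 0.5, 0.2]

validation_auc = auc(labels, validation_scores)
serving_auc = auc(labels, serving_scores)
print(validation_auc, passes_gate(validation_auc))
print(serving_auc, passes_gate(serving_auc))
```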

Question 175 (easy, multiple choice)

Your company has deployed a machine learning model on Vertex AI Endpoint to serve real-time predictions for a mobile application. The model was trained using TensorFlow and the prediction requests include raw images that are preprocessed by the client before sending. Recently, the application developers reported that the predictions are becoming less accurate over time. They suspect the issue is related to changes in the client-side preprocessing code. You need to verify this hypothesis and monitor for future regressions. What should you do?

Question 176 (medium, multiple choice)

Your team is responsible for operationalizing a series of machine learning models that are trained and deployed using Vertex AI Pipelines. The pipeline consists of several steps including data preprocessing, training with hyperparameter tuning, model evaluation, and deployment to an endpoint. Recently, the pipeline has been failing intermittently at the model evaluation step with an error indicating insufficient memory. The evaluation step uses a custom container with a memory limit of 4 GB. The training step uses 8 GB and completes successfully. You need to resolve the failure without drastically increasing costs. What should you do?

Question 177 (hard, multiple choice)

You manage a large-scale machine learning system that recommends products to users. The model is a deep neural network trained on TensorFlow and deployed on Vertex AI Endpoint with global load balancing. The model receives over 10,000 requests per second. Recently, the team added a new feature: the user's current geographic location (latitude/longitude). After deploying the updated model, you notice that the average prediction latency has doubled, and the error rate has increased, particularly for requests from regions far from the model's primary training data (North America). You suspect the location feature is causing issues. What should you do to diagnose and mitigate the problem?

Question 178 (easy, multiple choice)

A startup is using Cloud Build to automate the training and deployment of their machine learning models. The workflow is defined in cloudbuild.yaml and includes steps to: 1) Run a training job on AI Platform Training, 2) Build a custom prediction container, 3) Deploy the container to Cloud Run for serving. The deployment step fails intermittently with the error: 'Cloud Run service already exists and is not owned by the calling user.' You need to fix this so that deployments are reliable. What should you do?

Question 179 (medium, multiple choice)

Your organization deploys multiple versions of the same model to Vertex AI Endpoint for A/B testing. You have a production model (v1) serving 90% of traffic and a candidate model (v2) serving 10%. After one week, you observe that v2 has a slightly lower AUC but significantly higher business metrics like click-through rate. The product team wants to gradually increase v2's traffic. However, you need to ensure that the overall prediction latency remains under 200 ms. Currently, the endpoint has 10 replicas for v1 and 2 replicas for v2. What is the best approach to roll out v2 while maintaining latency SLO?
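
The numbers in this scenario allow a back-of-envelope capacity estimate. The sketch below assumes, beyond what the question states, that v1 and v2 cost about the same per request and that v1's current load (90% of traffic over 10 replicas, so 9% per replica) is a level at which the 200 ms SLO holds. The rollout steps and the helper `replicas_needed` are hypothetical.

```python
import math

# Assumed safe load per replica, taken from v1's current state: 90% / 10 replicas.
PERCENT_PER_REPLICA = 90 / 10


def replicas_needed(traffic_percent, percent_per_replica=PERCENT_PER_REPLICA):
    """Smallest replica count that keeps each replica at or below the safe load."""
    return math.ceil(traffic_percent / percent_per_replica)


# Hypothetical rollout steps for v2's share of traffic.
steps = [10, 25, 50, 100]
v2_plan = [replicas_needed(share) for share in steps]
v1_plan = [replicas_needed(100 - share) for share in steps]

for share, v2, v1 in zip(steps, v2_plan, v1_plan):
    print(f"v2 at {share}%: v2 replicas >= {v2}, v1 replicas >= {v1}")
```

Under these assumptions the current 2 replicas for v2 cover the 10% step, and each later step needs v2 capacity added before the traffic is shifted.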

Question 180 (hard, multiple choice)

A financial services company uses a custom container on Vertex AI Prediction to serve a fraud detection model. The container runs a Flask app that loads a large feature engineering library (~2 GB) at startup. The model is updated weekly. For the past two weeks, the new model version has been failing health checks and showing 'Container failed to start' errors in the logs. The previous versions worked fine. You inspect the container image and confirm it is built correctly using Cloud Build. The only change in the latest build is an updated version of the feature engineering library. What is the most likely cause and how should you fix it?

Question 181 (medium, multiple choice)

Your team has implemented a CI/CD pipeline using Cloud Composer (Apache Airflow) to retrain a model every day. The pipeline reads new data from BigQuery, trains a model using Vertex AI Training, evaluates it, and if the accuracy improves, deploys it to a Vertex AI Endpoint. For the past week, the pipeline has been running successfully but no new model has been deployed because the evaluation accuracy never exceeds the previous model's accuracy. The training data volume has been consistent. You suspect that the model is not learning from the new data. What should you do?

Question 182 (easy, multiple choice)

Your company runs batch predictions using Vertex AI Batch Prediction on a monthly basis. The predictions are used to generate customer segments for marketing campaigns. This month, the batch prediction job failed with an error: 'The number of rows in the input table does not match the number of rows in the output table.' The input table in BigQuery has 5 million rows, but the output table has only 4.5 million rows. You need to identify and handle the missing predictions. What is the most efficient course of action?
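
Finding which input rows have no prediction is an anti-join between the input and output tables. The sketch below shows the idea in plain Python on a tiny made-up sample. It assumes every row carries a unique key (the `customer_id` values here are hypothetical); at 5 million rows the same comparison would normally be run as a query inside BigQuery rather than in application code.

```python
def missing_keys(input_keys, output_keys):
    """Return the input keys that have no matching prediction, in input order."""
    predicted = set(output_keys)
    return [key for key in input_keys if key not in predicted]


# Made-up sample: ten input rows, nine predictions.
input_ids = [f"customer_{i}" for i in range(10)]
output_ids = [key for key in input_ids if key != "customer_7"]

to_retry = missing_keys(input_ids, output_ids)
print(to_retry)
```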

Question 183 (hard, multiple choice)

Your team manages a multi-model ensemble deployed on Vertex AI Endpoint. The ensemble consists of three models: a neural network (NN), a gradient boosted tree (GBT), and a logistic regression (LR). They are deployed as separate endpoints and traffic is split using a traffic split configuration. Recently, the overall accuracy dropped from 92% to 85%. Monitoring shows that the NN model's latency has increased significantly, causing it to miss timeouts and fall back to default predictions. The other two models are performing normally. The NN model is the most complex and handles the majority of the traffic. You need to restore accuracy quickly. What should you do first?

Question 184 (easy, multiple choice)

A company deploys a new machine learning model for real-time predictions using Vertex AI. The model is stored in a Cloud Storage bucket and deployed to an endpoint. To ensure traceability and rollback capability, which practice should be followed?

Question 185 (medium, multiple choice)

A team notices that the latency for online predictions from a Vertex AI endpoint has increased significantly over the past hour. The model is a large TensorFlow model deployed with automatic scaling (minReplicaCount=2, maxReplicaCount=10). The CPU utilization of the deployed instances is consistently above 85%. What is the most likely cause of the increased latency?
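
To reason about how replica count relates to CPU utilization in this scenario, a proportional scaling rule is a useful mental model. The sketch below borrows the rule used by the Kubernetes Horizontal Pod Autoscaler as a stand-in. The question does not state Vertex AI's exact algorithm or the configured target, so the 60% target and the helper `desired_replicas` are assumptions.

```python
import math


def desired_replicas(current, observed_util, target_util, min_replicas, max_replicas):
    """Proportional rule: scale replicas by observed/target, then clamp to bounds."""
    raw = math.ceil(current * observed_util / target_util)
    return max(min_replicas, min(max_replicas, raw))


# Bounds from the question; the 60% target is an assumption.
MIN_REPLICAS, MAX_REPLICAS, TARGET = 2, 10, 60

from_two = desired_replicas(2, 85, TARGET, MIN_REPLICAS, MAX_REPLICAS)
from_ten = desired_replicas(10, 85, TARGET, MIN_REPLICAS, MAX_REPLICAS)
print(from_two, from_ten)
```

With utilization still at 85% when the count already equals `max_replicas`, the rule asks for more replicas than the bound allows, so the result is clamped and the remaining load has nowhere to go.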

Question 186 (hard, multiple choice)

A financial services company uses Vertex AI to serve a fraud detection model. The model was trained on historical data that is updated daily. The team wants to automate retraining when data drift is detected. Which approach best operationalizes this requirement with minimal manual intervention?
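
Automating retraining on drift needs a drift statistic and a threshold. One simple statistic for a categorical feature is the L-infinity distance between the baseline and recent value distributions, that is, the largest absolute difference in any category's frequency. The sketch below is illustrative only: the feature values, the 0.3 threshold, and the helper names are made up, and a managed monitoring service would compute its own statistics.

```python
from collections import Counter


def distribution(values):
    """Map each category to its relative frequency."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def linf_distance(baseline, recent):
    """Largest absolute frequency difference over all observed categories."""
    p, q = distribution(baseline), distribution(recent)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in set(p) | set(q))


def drift_detected(baseline, recent, threshold=0.3):
    """True when the distance exceeds the (assumed) alerting threshold."""
    return linf_distance(baseline, recent) > threshold


# Made-up payment-method feature: training baseline vs. recent serving traffic.
baseline = ["card"] * 8 + ["wire"] * 2
recent = ["card"] * 4 + ["wire"] * 6

distance = linf_distance(baseline, recent)
should_retrain = drift_detected(baseline, recent)
print(distance, should_retrain)
```

In an automated setup, a `True` result would be the signal that starts the retraining pipeline.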

Question 187 (medium, multiple choice)

A retail company needs to generate product recommendations for millions of users every few hours. The model is a small scikit-learn model. Which prediction method should be used to minimize infrastructure cost while meeting the latency requirements?

Question 188 (medium, multi select)

A company deploys a TensorFlow model on Vertex AI for online predictions. They want to monitor model performance in production to detect degradation. Which TWO practices should they implement? (Choose 2.)

Question 189 (hard, multi select)

A data science team uses Cloud Build and Vertex AI to implement CI/CD for their machine learning models. Which THREE steps are essential for a production-ready operationalization pipeline? (Choose 3.)

Question 190 (medium, multiple choice)

Refer to the exhibit. A data scientist deploys a model using this configuration. Users report that after a few hours of inactivity, the first prediction request takes over 30 seconds. What is the most likely cause?

[Exhibit: Network Topology]

Question 191 (hard, multiple choice)

A healthcare company uses Vertex AI to deploy a medical image classification model. The model is deployed on a private endpoint with automatic scaling (minReplicaCount=2, maxReplicaCount=10). The model uses a custom container with a GPU for inference. Recently, during peak business hours (9 AM - 5 PM), users report that prediction requests frequently time out after 60 seconds, and the error rate increases. The team checks Cloud Monitoring and observes that CPU utilization averages 40%, GPU utilization averages 30%, and the number of replicas stays at 2. There are no errors in the container logs. The model serves a few hundred requests per second during peak. The team suspects the issue is not resource saturation but something else. What should they do to resolve the problem?