PMLE · topic practice

Collaborating to manage data and models practice questions

Practise Google Professional Machine Learning Engineer Collaborating to manage data and models practice questions — original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed byJohnson Ajibi· MSc IT Security
20 questionsDomain: Collaborating to manage data and models

What the exam tests

What to know about Collaborating to manage data and models

Collaborating to manage data and models questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Watch out for

Common Collaborating to manage data and models exam traps

  • Answering from memory before reading the full scenario.
  • Missing a constraint such as cost, availability, security, scope or command context.
  • Choosing a broad answer when the question asks for the most specific fix.
  • Ignoring why the wrong options are tempting.

Practice set

Collaborating to manage data and models questions

20 questions · select your answer, then reveal the explanation

Question 1mediummultiple choice
Read the full NAT/PAT explanation →

A data science team uses BigQuery to store raw data and Vertex AI for model training. They want to ensure that only authorized users can access training data, and that model artifacts are automatically versioned and tracked. Which combination of Google Cloud services should they use?

An ML team uses Vertex AI Pipelines to automate model retraining. The pipeline includes a step that queries BigQuery to create a training dataset. The team notices that the pipeline fails intermittently with a '403 Exceeded rate limits' error. What is the most likely cause and solution?

A company stores training data in Cloud Storage and uses Vertex AI Training for model training. They want to implement a data validation pipeline to detect data drift before retraining. Which service should they use?

A team uses Vertex AI Feature Store to serve features for real-time predictions. They notice that feature values are frequently updated from multiple source systems, leading to inconsistencies. They need to ensure that feature values are consistent across all serving endpoints. What should they do?

An organization uses Cloud Composer to orchestrate ML workflows. A DAG that triggers Vertex AI training jobs fails because the training job exceeds the 7-day maximum runtime. What is the best way to handle long-running training jobs in Cloud Composer?

A team wants to share a trained model with other teams within the organization. They need to provide access to the model artifact in Vertex AI Model Registry and ensure that only authorized teams can deploy the model. What should they do?

A data scientist is using Vertex AI Workbench user-managed notebooks. They need to collaborate with a colleague on the same notebook. The colleague should be able to edit the notebook simultaneously. What should they do?

A team uses Vertex AI Pipelines with CustomJob components that pull training code from a Cloud Source Repository. The pipeline fails with a 'Permission denied' error when trying to access the repository. The service account used by the pipeline has the 'Source Repository Viewer' role. What is the likely issue?

Which TWO statements about Vertex AI Feature Store are correct? (Choose 2)

Which THREE actions are best practices for managing ML models in production on Google Cloud? (Choose 3)

Which TWO factors should you consider when choosing between BigQuery and Cloud Storage for storing training data? (Choose 2)

A financial services company uses Vertex AI to deploy multiple models for fraud detection. The ML team has set up a CI/CD pipeline using Cloud Build and Cloud Deploy. The pipeline builds a custom container with the trained model, pushes it to Artifact Registry, and deploys it to a Vertex AI Endpoint. Recently, a new regulation requires that all model deployments be audited and approved by the compliance team before going live. The compliance team wants to review the model's evaluation metrics and approve the deployment via a ticketing system. Currently, the CI/CD pipeline automatically deploys after the container is built. The team needs to implement a gating process without slowing down the development cycle. What should they do?

Question 13mediummultiple choice
Read the full NAT/PAT explanation →

A healthcare organization is building a machine learning model to predict patient readmission risk. They have sensitive data stored in BigQuery that includes protected health information (PHI). The data science team uses Vertex AI Workbench notebooks to explore the data and develop models. The organization's security policy requires that all PHI data must be encrypted at rest and in transit, and that access to the data is logged and audited. They also need to ensure that the data used for model training is de-identified to remove direct identifiers such as patient names and SSNs. The team wants to automate the de-identification process as part of the data pipeline. Which approach meets these requirements?

Drag and drop the steps to deploy a trained TensorFlow model to Vertex AI Prediction in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Match each regularization technique to its effect.

Drag a concept onto its matching description — or click a concept then click the description.

Concepts
Matches

Adds absolute value of weights to loss, induces sparsity

Adds squared magnitude of weights to loss, prevents overfitting

Randomly drops units during training to prevent co-adaptation

Stops training when validation performance stops improving

Increases training data diversity through transformations

A team of ML engineers is collaborating on a project using Vertex AI. They want to ensure that only approved models are deployed to production. Which approach should they use?

A company uses a Cloud Composer DAG to run a daily ML pipeline that includes Dataflow jobs and model training on Vertex AI. The pipeline frequently fails due to insufficient permissions when the Dataflow worker accesses data in Cloud Storage. What is the most efficient way to resolve this issue?

A data scientist wants to share a trained model with the team for review before deployment. The model is stored in Vertex AI Model Registry. What is the recommended way to grant the team read access to the model?

Your team is using Vertex AI Feature Store for online predictions. You notice that feature values for some entities are missing in production, leading to failed predictions. Upon investigation, you find that the ingestion pipeline has been failing intermittently. What is the best immediate course of action to prevent prediction failures?

A team of ML engineers is building a real-time fraud detection system. They use Cloud Pub/Sub to stream transactions, Dataflow for feature engineering, and Vertex AI to get predictions. They want to ensure that the data used for training matches the data used for serving to avoid training-serving skew. Which approach should they take?

Free account

Track your progress over time

Create a free account to save your results and see which topics improve across sessions.

Focused Collaborating to manage data and models sessions

Start a Collaborating to manage data and models only practice session

Every question in these sessions is drawn from the Collaborating to manage data and models domain — nothing else.

Related practice questions

Related PMLE topic practice pages

Move into related areas when this topic feels solid.

Frequently asked questions

What does the PMLE exam test about Collaborating to manage data and models?
Collaborating to manage data and models questions test whether you can apply the concept in context, not just recognise a definition.
How should I use these practice questions?
Select your answer before revealing the explanation. Then read why each option is right or wrong — this active recall approach builds retention far faster than re-reading notes.
Can I practise just Collaborating to manage data and models questions in a focused session?
Yes — the session launcher on this page draws every question from the Collaborating to manage data and models domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.
Where can I practise other PMLE topics?
Use the topic links above to move to related areas, or go back to the PMLE question bank to see all topics.
Are these real exam questions or dumps?
These are original practice questions written to test the same concepts the PMLE exam covers. They are not copied from any real exam or dump site.