Practice PMLE questions for the Collaborating to manage data and models domain, with full explanations for every answer.
Collaborating to manage data and models — choose a session length
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A data science team uses BigQuery to store raw data and Vertex AI for model training. They want to ensure that only authorized users can access training data, and that model artifacts are automatically versioned and tracked. Which combination of Google Cloud services should they use?
2. An ML team uses Vertex AI Pipelines to automate model retraining. The pipeline includes a step that queries BigQuery to create a training dataset. The team notices that the pipeline fails intermittently with a '403 Exceeded rate limits' error. What is the most likely cause and solution?
3. A company stores training data in Cloud Storage and uses Vertex AI Training for model training. They want to implement a data validation pipeline to detect data drift before retraining. Which service should they use?
4. A team uses Vertex AI Feature Store to serve features for real-time predictions. They notice that feature values are frequently updated from multiple source systems, leading to inconsistencies. They need to ensure that feature values are consistent across all serving endpoints. What should they do?
5. An organization uses Cloud Composer to orchestrate ML workflows. A DAG that triggers Vertex AI training jobs fails because the training job exceeds the 7-day maximum runtime. What is the best way to handle long-running training jobs in Cloud Composer?
6. A team wants to share a trained model with other teams within the organization. They need to provide access to the model artifact in Vertex AI Model Registry and ensure that only authorized teams can deploy the model. What should they do?
7. A data scientist is using Vertex AI Workbench user-managed notebooks. They need to collaborate with a colleague on the same notebook. The colleague should be able to edit the notebook simultaneously. What should they do?
8. A team uses Vertex AI Pipelines with CustomJob components that pull training code from a Cloud Source Repository. The pipeline fails with a 'Permission denied' error when trying to access the repository. The service account used by the pipeline has the 'Source Repository Viewer' role. What is the likely issue?
9. Which TWO statements about Vertex AI Feature Store are correct? (Choose 2)
10. Which THREE actions are best practices for managing ML models in production on Google Cloud? (Choose 3)
11. Which TWO factors should you consider when choosing between BigQuery and Cloud Storage for storing training data? (Choose 2)
12. A financial services company uses Vertex AI to deploy multiple models for fraud detection. The ML team has set up a CI/CD pipeline using Cloud Build and Cloud Deploy. The pipeline builds a custom container with the trained model, pushes it to Artifact Registry, and deploys it to a Vertex AI Endpoint. Recently, a new regulation requires that all model deployments be audited and approved by the compliance team before going live. The compliance team wants to review the model's evaluation metrics and approve the deployment via a ticketing system. Currently, the CI/CD pipeline automatically deploys after the container is built. The team needs to implement a gating process without slowing down the development cycle. What should they do?
13. A healthcare organization is building a machine learning model to predict patient readmission risk. They have sensitive data stored in BigQuery that includes protected health information (PHI). The data science team uses Vertex AI Workbench notebooks to explore the data and develop models. The organization's security policy requires that all PHI data must be encrypted at rest and in transit, and that access to the data is logged and audited. They also need to ensure that the data used for model training is de-identified to remove direct identifiers such as patient names and SSNs. The team wants to automate the de-identification process as part of the data pipeline. Which approach meets these requirements?
14. Drag and drop the steps to deploy a trained TensorFlow model to Vertex AI Prediction in the correct order.
15. Match each regularization technique to its effect.
16. A team of ML engineers is collaborating on a project using Vertex AI. They want to ensure that only approved models are deployed to production. Which approach should they use?
17. A company uses a Cloud Composer DAG to run a daily ML pipeline that includes Dataflow jobs and model training on Vertex AI. The pipeline frequently fails due to insufficient permissions when the Dataflow worker accesses data in Cloud Storage. What is the most efficient way to resolve this issue?
18. A data scientist wants to share a trained model with the team for review before deployment. The model is stored in Vertex AI Model Registry. What is the recommended way to grant the team read access to the model?
19. Your team is using Vertex AI Feature Store for online predictions. You notice that feature values for some entities are missing in production, leading to failed predictions. Upon investigation, you find that the ingestion pipeline has been failing intermittently. What is the best immediate course of action to prevent prediction failures?
20. A team of ML engineers is building a real-time fraud detection system. They use Cloud Pub/Sub to stream transactions, Dataflow for feature engineering, and Vertex AI to get predictions. They want to ensure that the data used for training matches the data used for serving to avoid training-serving skew. Which approach should they take?
21. You are using Cloud Datalab for collaborative data exploration with your team. However, some team members cannot access the Datalab instances. What is the most likely issue?
22. A company trains models using Vertex AI Training and wants to share the resulting model artifacts with a different team in another Google Cloud project. What is the most secure way to grant access?
23. A company uses Cloud Composer to orchestrate an ML pipeline. They notice that the pipeline occasionally fails because the Composer environment runs out of disk space on the worker nodes. The pipeline uses many large dependencies. What is the most effective long-term solution?
24. Your team is using Vertex AI Pipelines to build an automated training pipeline. You need to share the pipeline definition with another team so they can run it in their own project. Which format should you use?
25. Which TWO actions are recommended for collaborating on machine learning models using Vertex AI Model Registry?
26. Which TWO strategies help ensure data consistency when multiple teams are contributing features to a shared Vertex AI Feature Store?
27. Which THREE practices improve collaboration when using Cloud Composer for ML pipelines?
28. Refer to the exhibit. A user receives the error shown when trying to upload a model to Vertex AI. What is the most likely cause?
29. Refer to the exhibit. A user is trying to upload a Vertex AI pipeline definition. The error indicates an invalid dependency order. What should the user do to fix this?
30. Refer to the exhibit. An ML engineer in the team needs to deploy the model to an endpoint. The engineer is assigned the 'roles/aiplatform.user' role at the project level but still cannot deploy. What is the most likely reason?
31. A data science team uses Vertex AI Workbench and wants to share notebooks with version history. Which service should they use?
32. A team uses Vertex AI Pipelines. They need to ensure that only certain team members can deploy models to production. What is the best approach?
33. A company has multiple teams working on different models. They want to enforce consistent data preprocessing steps across all teams. Which approach should they take?
34. A data scientist wants to track the lineage of a dataset used in a training run. Which Vertex AI feature should they use?
35. An MLOps team needs to automatically retrain a model when new training data becomes available. They use Vertex AI Pipelines. What is the recommended way to trigger the pipeline?
36. A large organization uses a multi-project setup with a central data lake. Different teams manage their own models. To enable cross-team sharing of features, they want to use Vertex AI Feature Store. What is the best practice to manage access?
37. When distributing training across multiple workers using Vertex AI Training, how should the team share the training dataset?
38. A team is using Vertex AI Experiments to compare different hyperparameters. They want to automatically record the hyperparameters. What is the correct way?
39. A data engineering team uses Dataflow for preprocessing and wants to integrate with Vertex AI Pipelines. They need to pass the preprocessed data location to the training step. What is the best practice?
40. Which TWO practices help ensure reproducible ML experiments?
41. Which TWO tools can be used to collaborate on feature definitions across teams?
42. Which THREE actions should be taken to manage model versions effectively?
43. Refer to the exhibit. A team member complains they cannot deploy a model to Vertex AI Endpoints. What is the most likely reason?
44. Refer to the exhibit. The team wants to automatically deploy the best-performing model version to production. They have set up a Cloud Function triggered by Model Registry events. Which alias should they use in the function to get the latest champion?
45. Refer to the exhibit. The team notices that the pipeline fails to read data from the specified Cloud Storage path. What is the most likely issue?
46. A team uses Vertex AI Feature Store for storing features. They want to share feature definitions with other teams in a collaborative manner. What is the best way to collaborate on feature definitions?
47. A company uses BigQuery to store feature data for ML training. A data engineer notices that a Vertex AI Training job is failing with 'Access Denied' errors when reading from a BigQuery table. The training job uses a custom service account that has been granted the 'bigquery.dataViewer' role on the dataset. What is the most likely cause of the failure?
48. A data science team uses Vertex AI Pipelines to orchestrate ML training. They notice that some pipeline runs are failing because of inconsistent data schemas. They want to enforce schema validation as a gate before the training step executes. Which approach should they implement?
49. A data scientist wants to share a trained model with colleagues for evaluation. The model is stored as a Vertex AI Model resource. What is the recommended way to share the model without exposing the underlying project?
50. A team uses Vertex AI Feature Store for online serving. They notice high latency during peak hours. They have configured the feature store with Bigtable as the online serving store. What is the most likely cause of the high latency?
51. An organization uses Cloud Dataflow to preprocess training data. Dataflow jobs are often failing because of insufficient quota for certain resources. The team has requested a quota increase, but the jobs still fail with 'quota exceeded' errors for a different resource. They want to proactively monitor and manage quotas to avoid failures. What is the best approach?
52. Which TWO of the following are best practices for managing data in a collaborative machine learning environment on Google Cloud?
53. Which THREE of the following are recommended practices for model governance and lineage in Vertex AI?
54. A financial services company uses Vertex AI to build credit risk models. They have a team of 10 data scientists and 3 ML engineers. They use multiple notebooks in Vertex AI Workbench, storing data in Cloud Storage and BigQuery. The team reports that training jobs sometimes fail with 'Permission denied' errors when reading from certain Cloud Storage buckets. The error occurs intermittently and only for some users. The team uses custom service accounts for each user's notebook instance, but the permissions seem inconsistent. The IT security team has enforced that all service accounts must have least privilege. What is the most effective course of action to resolve the permission issues while maintaining security?
55. A healthcare startup is developing a diagnostic model using sensitive patient data. They use Vertex AI to manage the training pipeline. They need to ensure that the data is encrypted both at rest and in transit. Additionally, they want to prevent the ML engineers from seeing raw data but still allow them to train models. They use Cloud Storage with CMEK and VPC-SC. They plan to use Vertex AI Training with a custom service account. The data stored in Cloud Storage is encrypted with CMEK. What additional step is needed to allow Vertex AI Training to access the encrypted data?
56. A retail company uses Vertex AI AutoML to train a product recommendation model. They have a dataset of past purchases stored in BigQuery. The data science team wants to iteratively train and improve the model. They need to track which dataset version was used for each model and preserve the exact data for reproducibility. They currently export data to CSV files and store them in Cloud Storage. However, the dataset is updated daily, and they want to ensure that models are trained on a consistent snapshot. What should they do?
57. A large e-commerce company deploys multiple ML models on Vertex AI Endpoints. They use Vertex AI Model Registry to manage model versions. Recently, a team accidentally deployed an unvalidated model to production, causing a service outage. They want to implement a governance process where models must pass certain validation checks before deployment. The validation includes unit tests, fairness checks, and performance benchmarks. They use CI/CD pipelines (Cloud Build). They also need to allow manual approval for critical models. Which combination of Vertex AI features and Cloud Build steps would enforce the required governance?
The Collaborating to manage data and models domain covers the key concepts tested in this area of the PMLE exam blueprint published by Google Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PMLE domains — no account required.
The Courseiva PMLE question bank contains 57 questions in the Collaborating to manage data and models domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Collaborating to manage data and models domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.