CCNA Pmle Collaboration Data Questions

75 of 79 questions · Page 1/2 · Pmle Collaboration Data topic · Answers revealed

1

MCQmedium

You are configuring a Vertex AI Feature Store online store for a real-time recommendation system that requires single-digit millisecond latency and high throughput. The feature values are updated frequently. Which online store type should you use?

A.Spanner online store

B.Bigtable online store

C.Firestore online store

D.Optimized online store

AnswerD

The optimized online store is the serving type built for very low-latency, high-throughput reads, which fits a real-time recommendation system.

Why this answer

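The optimized online store is a separate serving type from the Bigtable online store, not a layer on top of it, and it is designed for serving features at very low latency. The Bigtable online store targets large data volumes and is comparable to online serving in the legacy Feature Store. Because the stem leads with the single-digit millisecond latency requirement, the optimized online store is the intended choice.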

2

MCQeasy

A team wants to track the lineage of ML pipeline runs, including which datasets, parameters, and models were used in each execution. Which Vertex AI service should they use?

A.Vertex AI Metadata

B.Vertex AI Feature Store

C.Vertex AI Model Registry

D.Vertex AI Experiments

AnswerA

Metadata store is designed for lineage tracking.

Why this answer

Vertex AI Metadata (Vertex ML Metadata) records artifacts, executions, and contexts, together with the lineage that links them.

3

Multi-Selectmedium

An ML team uses Delta Lake on Dataproc for data versioning. Which THREE benefits does Delta Lake provide?

Select 3 answers

A.Automatic data encryption at rest

B.Time travel for accessing previous versions

C.Schema enforcement and evolution

D.ACID transactions on data lakes

E.Built-in real-time streaming

AnswersB, C, D

Enables reproducibility.

Why this answer

Delta Lake provides ACID transactions, schema enforcement, and time travel.
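The three properties can be illustrated without Spark. What follows is a toy in-memory sketch, not Delta Lake's API: the `VersionedTable` class and its `append` and `read` methods are hypothetical names chosen for this example. It assumes an append-only table where each successful write produces a new numbered version.

```python
from typing import Dict, List

Row = Dict[str, object]


class VersionedTable:
    """Toy append-only table: every commit produces a new immutable version."""

    def __init__(self, schema: Dict[str, type]) -> None:
        self.schema = schema
        self._versions: List[List[Row]] = [[]]  # version 0 is the empty table

    def append(self, rows: List[Row]) -> int:
        # Schema enforcement: reject the whole batch if any row mismatches.
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"schema mismatch: {sorted(row)}")
            for column, expected in self.schema.items():
                if not isinstance(row[column], expected):
                    raise ValueError(f"bad type for column {column!r}")
        # Atomic commit: the new version appears only after validation passes.
        self._versions.append(self._versions[-1] + rows)
        return len(self._versions) - 1

    def read(self, version: int = -1) -> List[Row]:
        # Time travel: any earlier version stays readable.
        return list(self._versions[version])


table = VersionedTable({"user_id": int, "clicks": int})
v1 = table.append([{"user_id": 1, "clicks": 3}])
v2 = table.append([{"user_id": 2, "clicks": 7}])
```

Reading `table.read(v1)` after the second append still returns only the first row, which is the behavior that makes a training run reproducible against a fixed data version.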

4

MCQeasy

A company uses Vertex AI Model Registry to manage multiple model versions. They want to designate a model version as 'champion' for production deployment and another as 'challenger' for A/B testing. Which feature of the registry should they use?

A.Model version labels

B.Model lineage

C.Model aliases

D.Model evaluation metrics

AnswerC

Aliases like 'champion' and 'challenger' can be assigned to versions and used for deployment.

Why this answer

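Model Registry aliases are named references such as 'champion' or 'challenger' that point to a specific model version. Moving an alias to another version changes what it resolves to, which makes promotion and comparison straightforward. Aliases are distinct from labels, which are key-value metadata.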
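The alias idea can be sketched as a small lookup table. This is an illustration only, not the Model Registry API: `AliasTable`, `assign`, and `resolve` are hypothetical names, and the version IDs are made up.

```python
from typing import Dict


class AliasTable:
    """Hypothetical helper: maps alias names to version IDs for one model."""

    def __init__(self) -> None:
        self._alias_to_version: Dict[str, str] = {}

    def assign(self, alias: str, version_id: str) -> None:
        # Reassigning an alias moves it; an alias names one version at a time.
        self._alias_to_version[alias] = version_id

    def resolve(self, alias: str) -> str:
        return self._alias_to_version[alias]


aliases = AliasTable()
aliases.assign("champion", "1")
aliases.assign("challenger", "2")

# Promotion: the challenger's version becomes the champion, and callers that
# look up "champion" need no change.
aliases.assign("champion", aliases.resolve("challenger"))
```

The design point is indirection: deployment code refers to the alias, so promoting a version is a metadata change and not a code change.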

5

MCQmedium

A data engineer wants to create a BigQuery table snapshot for point-in-time recovery of a critical dataset. The snapshot should be created daily and retained for 30 days. What should they use?

A.BigQuery copy job

B.BigQuery time travel

C.BigQuery scheduled queries with CREATE SNAPSHOT

D.BigQuery export to Cloud Storage

AnswerC

Scheduled queries can create snapshots daily; retention can be set in the snapshot definition.

Why this answer

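BigQuery table snapshots are created with the CREATE SNAPSHOT TABLE statement, which can run inside a scheduled query for the daily cadence. Setting an expiration timestamp in the statement's options provides the 30-day retention.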
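A scheduled job needs a fresh snapshot name on each run. The sketch below only builds the DDL text; it does not call BigQuery. The helper name `snapshot_ddl`, the table names, and the date-suffix naming scheme are assumptions made for illustration.

```python
from datetime import date


def snapshot_ddl(source_table: str, snapshot_prefix: str,
                 run_date: date, retention_days: int = 30) -> str:
    """Build a CREATE SNAPSHOT TABLE statement with an expiration."""
    # Suffix the snapshot name with the run date so daily runs do not collide.
    snapshot_name = f"{snapshot_prefix}_{run_date:%Y%m%d}"
    return (
        f"CREATE SNAPSHOT TABLE `{snapshot_name}`\n"
        f"CLONE `{source_table}`\n"
        f"OPTIONS (expiration_timestamp = "
        f"TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {retention_days} DAY))"
    )


# Hypothetical project, dataset, and table names.
ddl = snapshot_ddl("my_project.sales.orders",
                   "my_project.backups.orders_snapshot",
                   date(2024, 1, 15))
print(ddl)
```

With the expiration set at creation time, old snapshots are removed without a separate cleanup job.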

6

MCQmedium

A team wants to share feature definitions across multiple projects in their organization using Vertex AI Feature Store. What is the recommended approach?

A.Export features to BigQuery datasets in each project

B.Use Vertex AI Feature Store's feature view for cross-project access

C.Create separate feature stores in each project and synchronize them with Dataflow

D.Use a centralized feature store in a shared project and grant access to other projects via IAM

AnswerD

Centralized feature store with IAM enables controlled sharing.

Why this answer

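Hosting the feature store in one shared project and granting IAM roles on it to users and service accounts from the other projects lets every team read the same feature definitions. Features are defined once, and access is controlled centrally instead of being copied or synchronized per project.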

7

MCQhard

A machine learning pipeline in Vertex AI produces a dataset artifact, a trained model, and evaluation metrics. The team wants to query the lineage to find all downstream artifacts that depend on a particular dataset. Which Vertex AI service should they use?

A.Vertex AI Feature Store

B.Vertex AI Experiments

C.Vertex AI Model Registry

D.Vertex AI Metadata

AnswerD

Metadata stores lineage between artifacts and executions, enabling queries for upstream/downstream dependencies.

Why this answer

Vertex AI Metadata tracks artifacts, executions, and their lineage relationships. It supports lineage queries to find upstream and downstream dependencies.
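A downstream lineage query amounts to a graph traversal. The sketch below shows the idea on a hand-built graph; it does not use the Vertex AI Metadata API, and the node names, the `EDGES` mapping, and the `downstream_artifacts` helper are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: each key feeds into the nodes in its list.
EDGES: Dict[str, List[str]] = {
    "dataset:raw": ["execution:train"],
    "execution:train": ["model:v1"],
    "model:v1": ["execution:evaluate"],
    "execution:evaluate": ["metrics:v1"],
}


def downstream_artifacts(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk collecting every artifact reachable from `start`."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    # Executions are steps, not artifacts, so filter them out of the result.
    return {n for n in seen if not n.startswith("execution:")}


result = downstream_artifacts("dataset:raw", EDGES)
```

Reversing the edge direction gives the upstream query, which answers which datasets and runs produced a given model.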

8

MCQeasy

An ML team wants to automatically track training runs, including hyperparameters and metrics, with minimal code changes. Which Vertex AI service should they use?

A.Vertex AI Prediction

B.Vertex AI Workbench

C.Vertex AI Metadata

D.Vertex AI Experiments with autologging

AnswerD

Autologging automatically records runs with minimal code.

Why this answer

Vertex AI Experiments with autologging captures parameters and metrics automatically when using the Vertex AI SDK or MLflow.

9

MCQmedium

An organization uses Vertex AI Pipelines and wants to track the lineage of datasets, models, and metrics across pipeline runs. They need to query upstream and downstream dependencies of an artifact. Which service should they use?

A.Vertex AI Feature Store

B.Vertex AI Experiments

C.Vertex AI Model Registry

D.Vertex AI Metadata

AnswerD

Vertex AI Metadata provides a metadata store and lineage queries for artifacts, executions, and contexts.

Why this answer

Vertex AI Metadata stores ML metadata and supports lineage queries to track the provenance of artifacts across pipeline executions.

10

Multi-Selectmedium

A company wants to implement a centralized model registry for governance. Which two features should they use? (Choose two.)

Select 2 answers

A.Vertex AI Feature Store

B.Vertex AI Model Registry

C.Vertex AI Experiments

D.Model versioning and aliases

E.Vertex AI Metadata

AnswersB, D

Central registry for model versioning and governance.

11

MCQhard

A company uses Vertex AI Feature Store with an online store for low-latency serving. They observe high latency during peak hours. The feature values are small (< 1 KB each) and the workload is read-heavy. Which change would most effectively reduce latency?

A.Enable caching on the client side

B.Switch from Bigtable online store to Optimized online store

C.Use a larger machine type for Bigtable

D.Increase the number of Bigtable nodes

AnswerB

Optimized store provides lower latency for read-heavy patterns.

Why this answer

Switching from Bigtable online store to Optimized online store is recommended for read-heavy workloads with small feature values, offering lower latency at high QPS.

12

MCQeasy

A team wants to use Vertex AI Workbench for collaborative notebook development. They need a persistent environment that can be stopped and restarted without losing installed packages and data. Which instance type should they choose?

A.User-managed notebooks

B.Managed notebooks

C.Colab Enterprise notebooks

D.Vertex AI Pipelines

AnswerA

User-managed notebooks are persistent and retain packages and data across stops/starts.

Why this answer

User-managed notebooks are persistent and retain customizations even when stopped, while managed notebooks are not persistent and lose customizations after stop.

13

MCQeasy

A machine learning team wants to share features across multiple models to reduce training-serving skew and ensure consistency. Which Vertex AI service should they use?

A.Vertex AI Workbench

B.Vertex AI Model Registry

C.Vertex AI Feature Store

D.Vertex AI Experiments

AnswerC

Feature Store is designed for sharing and serving features consistently across training and serving.

Why this answer

Vertex AI Feature Store centralizes feature storage, ensuring the same features are used for training and serving, reducing training-serving skew.

14

MCQeasy

A data scientist is using Vertex AI Workbench notebooks and wants to collaborate with team members in real-time on the same notebook. Which notebook type supports real-time collaboration?

A.Managed notebooks

B.JupyterLab on Compute Engine

C.Deep Learning VMs

D.User-managed notebooks

AnswerA

Managed notebooks provide real-time collaboration.

Why this answer

Managed notebooks in Vertex AI Workbench support real-time collaboration similar to Google Docs, while user-managed instances do not.

15

Multi-Selecthard

A machine learning team needs to ensure that the same features used for training are used for serving in production to avoid training-serving skew. They use Vertex AI Feature Store. Which THREE actions should they take?

Select 3 answers

A.Enable point-in-time correct retrieval when creating training datasets

B.Use different feature views for training and serving to compare performance

C.Use the same feature view for both training data export and online serving

D.Export training data from the online store directly

E.Set up feature monitoring to detect drift in feature distributions

AnswersA, C, E

Avoids data leakage and ensures temporal consistency.

Why this answer

Using the same feature view for training and serving ensures consistency. Point-in-time correct retrieval prevents leakage. Feature monitoring detects drift that could indicate skew.

16

MCQmedium

A data science team needs to share features across multiple ML models while ensuring consistency between training and serving. Which approach best achieves this?

A.Store features in a shared BigQuery dataset without versioning

B.Export features to CSV files shared via Cloud Storage

C.Use Vertex AI Feature Store to define and serve features for both training and online prediction

D.Each team maintains its own feature engineering code in separate pipelines

AnswerC

Centralised, consistent feature management.

Why this answer

Vertex AI Feature Store provides a central repository where features are defined once and reused across models, reducing training-serving skew.

17

MCQmedium

A company uses Delta Lake on Dataproc for their data lake. They need to ensure ACID transactions and schema enforcement for data ingested from streaming sources. Which Delta Lake feature should they enable?

A.Delta Lake time travel

B.Schema enforcement

C.Delta Lake change data feed

D.Optimized write

AnswerB

Schema enforcement rejects writes with mismatched schemas, ensuring data quality.

Why this answer

Delta Lake provides ACID transactions, schema enforcement, and time travel. Schema enforcement is key for streaming data to prevent data quality issues.

18

Multi-Selecteasy

A company wants to use DVC for data versioning alongside their ML code in Git. Which TWO statements about DVC are correct? (Select 2)

Select 2 answers

A.DVC uses a separate .dvc file to track data versions.

B.DVC can push data to remote storage like Google Cloud Storage.

C.DVC stores the actual data files in Git.

D.DVC only works with AWS S3 as remote storage.

E.DVC replaces Git for code versioning.

AnswersA, B

Each data file has a corresponding .dvc file.

Why this answer

DVC tracks data files by storing their hashes in Git and uses a remote storage for the actual data. It can integrate with cloud storage like GCS.

19

Multi-Selectmedium

A company uses Vertex AI Pipelines for ML workflows. They want to standardize pipeline templates across teams to ensure consistency. Which TWO approaches should they use?

Select 2 answers

A.Use only pre-built components from Google's public repository

B.Share notebooks with pipeline code via Google Drive

C.Create reusable pipeline components using the Vertex AI SDK and store them in a shared repository

D.Define pipelines using YAML templates and store them in a version-controlled Git repository

E.Publish pipeline components as custom containers in Google Cloud Marketplace

AnswersC, D

Reusable components promote consistency.

Why this answer

Using the Vertex AI SDK to create reusable pipeline components and storing them in a shared repository ensures consistency. Publishing components to the Google Cloud Marketplace is not a standard approach.

20

MCQhard

A team is training a model using historical data and wants to avoid data leakage when joining feature values from a feature store. The features include time-varying data like user activity counts. Which retrieval method should they use when creating a training dataset?

A.Retrieve the latest feature values for each entity

B.Aggregate features over all historical data

C.Use random sampling of feature values

D.Use point-in-time correct retrieval with timestamp matching

AnswerD

Point-in-time correct retrieval ensures features are fetched as of the timestamp of each training example, avoiding leakage.

Why this answer

Point-in-time correct retrieval joins each training row with the feature values that were available at or before that row's timestamp, so no future data is used and leakage is prevented. Retrieval methods that ignore timestamps can pull in values recorded after the label event.
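The as-of lookup behind a point-in-time join can be sketched in a few lines. This is a generic illustration, not the Feature Store API: the `HISTORY` data, the integer timestamps, and the `value_as_of` helper are made up for the example.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical feature history per entity: (timestamp, value), sorted by time.
HISTORY: Dict[str, List[Tuple[int, int]]] = {
    "user_1": [(100, 2), (200, 5), (300, 9)],
}


def value_as_of(entity: str, ts: int) -> Optional[int]:
    """Return the latest feature value recorded at or before `ts`."""
    rows = HISTORY.get(entity, [])
    timestamps = [t for t, _ in rows]
    idx = bisect_right(timestamps, ts)  # number of rows with timestamp <= ts
    return rows[idx - 1][1] if idx else None


# Training examples: (entity, timestamp of the training row).
examples = [("user_1", 250), ("user_1", 50), ("user_1", 300)]
joined = [value_as_of(entity, ts) for entity, ts in examples]
# joined == [5, None, 9]
```

The row at timestamp 250 receives the value 5, not the later value 9, and the row at timestamp 50 receives no value because nothing had been recorded yet. Taking the latest value for every row would have assigned 9 to all three.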

21

Multi-Selectmedium

A company wants to monitor features in Vertex AI Feature Store for drift over time. Which two services should they use? (Choose two.)

Select 2 answers

A.Vertex AI Feature Store monitoring

B.Cloud Logging

C.Vertex AI Model Monitoring

D.Vertex AI Experiments

E.Cloud Monitoring

AnswersA, E

Built-in monitoring calculates drift statistics.

22

MCQmedium

An ML team wants to share feature definitions across multiple projects to reduce training-serving skew and ensure consistency. They currently store features in Cloud Storage and manually coordinate updates, leading to errors. Which Google Cloud service should they use to centrally manage and serve features for both training and online inference?

A.Cloud Data Catalog

B.Vertex AI Model Registry

C.Vertex AI Feature Store

D.Cloud Storage with versioning

AnswerC

Feature Store is designed for feature management and serving.

Why this answer

Vertex AI Feature Store centralizes feature management, providing an online store for low-latency serving and an offline store for training data retrieval, reducing training-serving skew.

23

MCQhard

A team uses Vertex AI Feature Store with an online store for low-latency serving. They need to support frequent updates to features (e.g., every minute) and require high write throughput (thousands of writes per second). Which online store type should they choose?

A.Optimized online store

B.Firestore online store

C.Bigtable online store

D.Cloud SQL online store

AnswerC

Bigtable supports high write throughput and low-latency reads, ideal for frequent updates.

Why this answer

Bigtable online store is optimized for high write throughput and low-latency serving, suitable for frequently updated features. Optimized online store is better for read-heavy, static features.

24

MCQeasy

An ML team wants to monitor feature drift in their production model. Which Vertex AI Feature Store capability should they use?

A.Feature views

B.Online store

C.Point-in-time retrieval

D.Feature monitoring (drift detection)

AnswerD

Feature monitoring tracks distribution changes.

Why this answer

Vertex AI Feature Store includes feature monitoring that can detect drift between training and serving data.

25

MCQeasy

An ML team uses Vertex AI Pipelines and wants to automatically generate model cards documenting model purpose, evaluation results, and intended use. Which approach should they take?

A.Manually create a Google Doc and share it with the team.

B.Use Cloud Data Catalog to annotate the model artifact.

C.Write a custom Kubeflow Pipelines component that creates a BigQuery table with model metadata.

D.Use Vertex AI Model Registry to generate model cards automatically from model metadata.

AnswerD

Model Registry provides model card generation.

Why this answer

Vertex AI Model Registry supports model cards that can be populated programmatically via the SDK or Cloud Console.

26

MCQhard

A data scientist is training a model using Vertex AI Experiments and wants to automatically log model parameters, metrics, and artifacts without modifying their training script. Which approach should they use?

A.Use the gcloud beta ai experiments autolog command before running the training script.

B.Wrap the training code in a Kubeflow Pipelines component that calls the Vertex AI Experiments API.

C.Build a custom container with the Vertex AI SDK and MLflow installed, then set the VERTEX_AI_AUTOLOG environment variable to 1.

D.Use the Vertex AI Python SDK to create a custom training job and manually log each parameter and metric in the training script.

AnswerC

Vertex AI supports autologging via MLflow when the environment variable is set.

Why this answer

Vertex AI Experiments supports autologging via the Vertex AI SDK when using MLflow or Keras callbacks. Specifying a custom container with the SDK installed and enabling autologging allows automatic logging without code changes.

27

MCQeasy

A data scientist wants to automatically generate model documentation that includes model purpose, training data, evaluation results, and intended use. Which tool should they use?

A.Vertex AI Workbench

B.Vertex AI Experiments

C.Cloud Datalab

D.Model Cards in Vertex AI

AnswerD

Model Cards generate automated documentation from model metadata.

Why this answer

Model Cards in Vertex AI provide a standardized framework for documenting models, automatically populated with metadata from Model Registry.

28

MCQmedium

A team is using Vertex AI Feature Store with an online store for low-latency serving. They notice increasing latency during peak hours. The feature data is updated frequently and requires strong consistency. Which online store type should they use?

A.Bigtable online store

B.Optimized online store

C.Cloud Spanner online store

D.Firestore online store

AnswerA

Bigtable online store is designed for high-throughput, low-latency serving with strong consistency, making it suitable for peak-hour loads.

Why this answer

Bigtable online store is recommended for high-throughput, low-latency, and strong consistency requirements.

Exam trap

Candidates often confuse the consistency model of the optimized online store. Contrary to a common misconception, the optimized online store actually provides strong consistency, not eventual consistency. The key differentiator for Bigtable is its ability to handle high throughput and low latency under peak loads, not consistency.

29

MCQeasy

A data scientist wants to track machine learning experiments, including parameters, metrics, and artifacts, and compare runs. Which Vertex AI service should they use?

A.Vertex AI Metadata

B.Vertex AI Experiments

C.Vertex AI Feature Store

D.Vertex AI Model Registry

AnswerB

Correct service for experiment tracking and comparison.

Why this answer

Vertex AI Experiments is designed for tracking and comparing ML experiment runs, capturing parameters, metrics, and artifacts. It integrates with MLflow for autologging.

30

Multi-Selecthard

An ML team uses Vertex AI Workbench managed notebooks and wants to version their notebook code and collaborate using Git. Which THREE steps are required to set up Git integration? (Select 3)

Select 3 answers

A.Configure Git credentials (username/email) using git config.

B.Create a Cloud Source Repositories mirror of the GitHub repo.

C.Install git in the notebook environment if not already present.

D.Enable the Vertex AI Notebooks API for Git sync.

E.Clone the repository using git clone in a notebook cell.

AnswersA, C, E

Required for committing.

Why this answer

To use Git in managed notebooks, you need to install git, configure credentials, and clone the repository.

31

MCQmedium

A data science team wants to share a set of engineered features across multiple projects and teams to reduce training-serving skew and ensure consistency. They need low-latency serving (single-digit milliseconds) for online predictions and also need to retrieve historical feature values for training. Which approach should they take?

A.Use Vertex AI Feature Store to define features once, serve online predictions from the online store, and retrieve historical features from the offline store for training.

B.Create a shared BigQuery dataset where each team writes features; serve predictions by querying BigQuery synchronously.

C.Store features in Cloud Storage Parquet files and load them into BigQuery for training; serve predictions from a custom microservice that reads from Cloud Storage.

D.Use Cloud Memorystore (Redis) to store the latest feature values for low-latency serving; each team independently computes features and pushes to Redis.

AnswerA

Vertex AI Feature Store provides both online (low-latency) and offline serving, centralizes features, and supports point-in-time queries to avoid leakage.

Why this answer

Vertex AI Feature Store is designed exactly for this purpose: it centralizes feature definitions and serves features online (using an online store) and offline (using BigQuery). The online store provides low-latency serving, and point-in-time queries allow training data creation without data leakage. This reduces training-serving skew across teams.

32

MCQeasy

Which Vertex AI service is used to track the lineage of ML pipeline components, artefacts, and executions?

A.Vertex AI Metadata

B.Vertex AI Model Registry

C.Vertex AI Feature Store

D.Vertex AI Experiments

AnswerA

Metadata is the ML metadata store for lineage.

Why this answer

Vertex AI Metadata (ML Metadata) stores and queries lineage information.

33

MCQmedium

A machine learning engineer needs to deploy a model to an endpoint for real-time predictions. The model is registered in Vertex AI Model Registry. Which command should they use to create an endpoint and deploy the model with the alias 'champion'?

A.gcloud ai models upload --model=my-model --alias=champion --endpoint=my-endpoint

B.gcloud ai endpoints deploy-model --endpoint=my-endpoint --model=my-model --alias=champion

C.gcloud ai endpoints predict --model=my-model --alias=champion

D.gcloud ai endpoints create --model=my-model --alias=champion

AnswerB

This deploys the model version with alias 'champion' to the endpoint.

Why this answer

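`gcloud ai endpoints deploy-model` is the command that deploys a registered model to an endpoint. Of the other options, `gcloud ai models upload` registers a model, `gcloud ai endpoints predict` sends a prediction request, and `gcloud ai endpoints create` only creates an empty endpoint.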

34

MCQmedium

A data engineer needs to version large datasets (multiple TB) in a Data Lake on Google Cloud. They require ACID transactions to ensure consistency when multiple jobs read/write concurrently. Which solution should they use?

A.Delta Lake on Dataproc

B.BigQuery table snapshots

C.DVC (Data Version Control)

D.Vertex AI Feature Store

AnswerA

Delta Lake provides ACID transactions on cloud storage, ideal for concurrent reads/writes on data lakes.

Why this answer

Delta Lake on Dataproc provides ACID transactions on cloud storage data lakes, enabling concurrent reads/writes with consistency.

35

Multi-Selectmedium

A data science team collaborates using Vertex AI Workbench user-managed notebooks. They want to version control their notebook code and share it with team members. Which TWO tools should they use? (Choose 2)

Select 2 answers

A.Git integration in Vertex AI Workbench

B.Cloud Functions

C.Vertex AI Model Registry

D.Vertex AI Experiments

E.Cloud Source Repositories

AnswersA, E

Allows version control of notebooks directly in Workbench.

Why this answer

Git integration in Workbench allows version control, and sharing via Cloud Source Repositories or GitHub provides collaboration.

36

MCQmedium

You are using DVC for data versioning in an ML project on Google Cloud. Your training data is stored in Cloud Storage. You want to track a new version of the dataset after preprocessing. Which DVC command should you use to register the changes?

A.dvc add data/processed

B.dvc push

C.dvc run -n preprocess

D.dvc commit

AnswerA

dvc add tracks the dataset and creates a .dvc file, versioning the data.

Why this answer

'dvc add' registers a new data version: it writes a .dvc pointer file containing the content hash, and that small file is committed to Git. The data itself stays in the DVC cache until 'dvc push' uploads it to the Cloud Storage remote. 'dvc run' defines pipeline stages, and 'dvc commit' records changes to files that DVC already tracks.
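The pointer-file idea can be shown with a plain content hash. This sketch does not invoke DVC; it only mimics the kind of record a pointer file holds, assuming an MD5 digest of the file content. The `content_hash` helper and the sample data are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """MD5 hex digest of the content, the kind of value a pointer file stores."""
    return hashlib.md5(data).hexdigest()


# Two states of the same dataset path, before and after preprocessing.
raw = b"id,clicks\n1,3\n2,7\n"
processed = b"id,clicks\n1,3\n"

pointer_v1 = {"path": "data/processed", "md5": content_hash(raw)}
pointer_v2 = {"path": "data/processed", "md5": content_hash(processed)}
```

Because the two pointers differ only in their hash, Git sees a small text change while the large data files stay out of the repository.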

37

Multi-Selectmedium

A data science team uses Vertex AI Experiments to compare multiple model training runs. They want to capture and compare hyperparameters, metrics, and code versions for each run. Which TWO steps should they take?

Select 2 answers

A.Use Cloud Logging to capture all training outputs

B.Store code versions in Cloud Storage and link them to experiments manually

C.Log hyperparameters and metrics using the Vertex AI SDK's experiment logging functions

D.Export experiment data to BigQuery for comparison

E.Integrate the training code with Git and use the commit hash as a run parameter

AnswersC, E

SDK functions allow logging to Experiments for comparison.

Why this answer

Using the Vertex AI SDK to log parameters and metrics enables comparison. Integrating with a version control system like Git ensures code version tracking.
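One way to attach the code version to a run is to merge the commit hash into the parameter dictionary before logging it. The sketch below stops short of calling the Vertex AI SDK; `current_commit` and `run_params` are hypothetical helpers, and the fallback value `"unknown"` is an assumption for environments without Git.

```python
import subprocess
from typing import Dict, Optional, Union


def current_commit() -> Optional[str]:
    """Return the current Git commit hash, or None outside a repository."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Git is not installed, or the working directory is not a repository.
        return None
    return out.stdout.strip()


def run_params(hyperparams: Dict[str, float]) -> Dict[str, Union[float, str]]:
    """Merge hyperparameters with the commit hash so every run records both."""
    params: Dict[str, Union[float, str]] = dict(hyperparams)
    params["git_commit"] = current_commit() or "unknown"
    return params


params = run_params({"learning_rate": 0.01, "epochs": 5})
```

The resulting dictionary would then be passed to the SDK's parameter-logging call, so the commit hash appears next to the hyperparameters when runs are compared.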

38

MCQmedium

An ML engineer needs to deploy a model to an endpoint and gradually shift traffic from the previous version (champion) to a new version (challenger) for A/B testing. How should they configure the endpoint?

A.Use a canary deployment with Cloud Run

B.Manually update the endpoint to point to the challenger after testing

C.Create a new endpoint for the challenger and route traffic via load balancer

D.Deploy both versions to the same endpoint and set traffic splitting

AnswerD

Vertex AI allows splitting traffic across deployed models.

Why this answer

Vertex AI endpoints support traffic splitting by assigning percentages to different model versions.
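A gradual shift is a sequence of splits that each sum to 100. The sketch below models that arithmetic only and does not call the endpoint API. The keys `"champion"` and `"challenger"` stand in for deployed model IDs, and both helper names are hypothetical.

```python
from typing import Dict, List


def validate_split(split: Dict[str, int]) -> None:
    """Percentages must be non-negative and sum to 100."""
    if any(p < 0 for p in split.values()):
        raise ValueError("negative percentage")
    if sum(split.values()) != 100:
        raise ValueError("traffic split must sum to 100")


def rollout_steps(challenger_shares: List[int]) -> List[Dict[str, int]]:
    """Turn challenger percentages into full champion/challenger splits."""
    steps = []
    for share in challenger_shares:
        split = {"champion": 100 - share, "challenger": share}
        validate_split(split)
        steps.append(split)
    return steps


# Shift 10%, then 50%, then all traffic to the challenger.
plan = rollout_steps([10, 50, 100])
```

Each step would be applied to the endpoint only after the challenger's metrics at the previous step look acceptable.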

39

MCQhard

A company uses Vertex AI Feature Store for feature engineering. They need to ensure point-in-time correctness to avoid data leakage during training. Which feature retrieval method should they use?

A.Use the `get_features` API without specifying a timestamp.

B.Use BigQuery to manually join features with a sliding window.

C.Use the offline store with point-in-time join using the `feature_view` with a timestamp column.

D.Use the online store to retrieve the latest feature values.

AnswerC

Point-in-time join ensures correct historical context, avoiding leakage.

Why this answer

Point-in-time retrieval in Vertex AI Feature Store allows fetching feature values as they existed at a specific timestamp, preventing leakage.

40

Multi-Selectmedium

A company is implementing MLOps on Google Cloud and needs to manage model versions, assign aliases (e.g., 'champion' for production, 'challenger' for staging), store evaluation metrics alongside each model version, and deploy models to endpoints. Which services should they use? (Choose the THREE that are part of the solution.)

Select 3 answers

A.Vertex AI Model Registry

B.Vertex AI Feature Store

C.Vertex AI Endpoints

D.Vertex AI Pipelines

E.Vertex AI Experiments

AnswersA, C, D

Provides model versioning, aliases (champion/challenger), and evaluation metrics storage.

Why this answer

Vertex AI Model Registry manages model versions, aliases, and evaluation metrics, and models are deployed from it. Vertex AI Endpoints is the deployment target that serves the model. Vertex AI Pipelines automates promotion and deployment as part of the MLOps workflow. Vertex AI Feature Store and Vertex AI Experiments are not needed for versioning, aliasing, or deployment, and the stem does not ask for lineage tracking.

41

MCQmedium

A machine learning team wants to implement champion/challenger model deployment. They have two model versions: v1 (champion) and v2 (challenger). They deploy both to the same endpoint with traffic splitting. How should they manage model versions in Vertex AI Model Registry to reflect this?

A.Upload both models without aliases. Use endpoint traffic splitting by model version ID.

B.Upload v1 with alias 'champion' and v2 with alias 'challenger'. Then deploy both to the endpoint with traffic split.

C.Use Vertex AI Experiments to designate champion/challenger.

D.Create two separate endpoints: one for champion and one for challenger.

AnswerB

Aliases like 'champion' and 'challenger' are used to identify models and manage traffic splitting.

Why this answer

Aliases in Model Registry allow labeling models as 'champion' and 'challenger' for easy identification and traffic routing.

42

MCQhard

A team monitors features in Vertex AI Feature Store for drift. They want to set up automated alerts when a feature's distribution deviates significantly from the baseline. Which feature monitoring configuration should they use?

A.Enable feature monitoring on the feature group with drift threshold and notification channel.

B.Use Cloud Monitoring custom metrics and log-based alerts manually.

C.Use Vertex AI Experiments to compare distributions.

D.Export features to BigQuery and set up scheduled queries with alerts.

AnswerA

Feature monitoring in Feature Store supports drift detection and alerting.

Why this answer

Feature monitoring in Vertex AI Feature Store allows defining drift thresholds and alerting via Cloud Monitoring.
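A drift threshold is compared against a distance between the baseline and current distributions. The sketch below uses the L-infinity distance on a categorical feature as one possible statistic; which statistic Feature Store computes is not stated here, and the threshold value, the sample data, and the helper names are assumptions. Both input lists are assumed to be non-empty.

```python
from collections import Counter
from typing import Dict, List


def distribution(values: List[str]) -> Dict[str, float]:
    """Relative frequency of each category."""
    counts = Counter(values)
    total = len(values)
    return {k: c / total for k, c in counts.items()}


def l_infinity_distance(baseline: List[str], current: List[str]) -> float:
    """Largest absolute difference in category frequency."""
    p, q = distribution(baseline), distribution(current)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


DRIFT_THRESHOLD = 0.3  # hypothetical value, chosen only for illustration

baseline = ["web"] * 8 + ["app"] * 2
current = ["web"] * 4 + ["app"] * 6
score = l_infinity_distance(baseline, current)
should_alert = score > DRIFT_THRESHOLD
```

Here the share of "web" drops from 0.8 to 0.4, so the score exceeds the threshold and an alert would fire.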

43

MCQhard

A company uses BigQuery as their data warehouse. They want to version datasets for ML experiments and be able to query snapshots at specific points in time. Which approach is most cost-effective and requires minimal operational overhead?

A.Use BigQuery table clones to create copies of the data.

B.Export the table to Cloud Storage as Parquet and use DVC to version the files.

C.Use BigQuery time-travel (7-day window) to query historical data without snapshots.

D.Create BigQuery table snapshots at key milestones.

AnswerD

Snapshots are incremental and cost-effective.

Why this answer

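BigQuery table snapshots are a built-in, read-only point-in-time copy of a table. Storage is billed only for snapshot data that has since been changed or deleted in the base table, so taking snapshots at key milestones is cheap and needs little maintenance.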

Practice this question →

44

Multi-Selecthard

A team uses Vertex AI Feature Store with an online store for real-time predictions. They notice that the online store queries are taking longer than expected. Which TWO actions could improve online store performance? (Choose 2)

Select 2 answers

A.Use the offline store for serving predictions.

B.Reduce the number of features served by the online store.

C.Switch from Bigtable to Firestore online store.

D.Enable caching on the online store.

E.Increase the number of Bigtable nodes if using Bigtable online store.

AnswersB, E

Fewer features reduce data volume and improve latency.

Why this answer

For a Bigtable online store, increasing the number of nodes improves throughput; for an optimized online store, serving fewer features reduces load. Caching is not a built-in feature.

Practice this question →

45

MCQmedium

An organization uses Vertex AI Workbench user-managed notebooks and wants to enable collaboration where multiple data scientists can edit the same notebook simultaneously. Which configuration should they use?

A.Use Git to branch and merge notebooks, with a CI/CD pipeline to resolve conflicts.

B.Share the user-managed notebook instance by giving multiple users IAM roles to access the same JupyterLab instance.

C.Store the notebook in Cloud Storage and ask users to edit sequentially using gsutil cp.

D.Use a managed notebook instance (Colab Enterprise) and share the notebook via a link.

AnswerD

Managed notebooks support real-time collaboration.

Why this answer

Vertex AI Workbench user-managed notebooks do not support real-time collaboration. Managed notebooks (now upgraded to Colab Enterprise) support simultaneous editing via integration with Colab.

Practice this question →

46

MCQhard

You need to create a reproducible snapshot of a BigQuery table as of a specific timestamp for ML model training. The snapshot should be queryable without copying the entire dataset. Which BigQuery feature should you use?

A.BigQuery export to Cloud Storage as Parquet

B.BigQuery time travel (FOR SYSTEM_TIME AS OF)

C.CREATE TABLE AS SELECT with WHERE clause

D.BigQuery table snapshots

AnswerD

Snapshots provide a point-in-time, queryable copy that persists beyond 7 days.

Why this answer

BigQuery table snapshots create a read-only copy of a table at a specific point in time. The snapshot is queryable like a regular table, and storage is charged only for data that later changes or is deleted in the base table, not for a full copy. Time travel queries access historical data but are limited to 7 days, and copying to a new table duplicates storage.

Practice this question →

47

MCQmedium

You are setting up feature monitoring in Vertex AI Feature Store to detect drift in a numerical feature. The monitoring job should run daily and alert if the Jensen-Shannon divergence exceeds 0.1. Which configuration should you use?

A.Configure feature monitoring in the feature view with a drift threshold of 0.1 using Jensen-Shannon divergence

B.Use BigQuery scheduled queries to compare distributions and send alerts

C.Set up a Cloud Composer DAG to compute drift and publish to Cloud Monitoring

D.Enable model monitoring on the Vertex AI endpoint to detect drift

AnswerA

Vertex AI Feature Store allows setting drift thresholds per feature view, using methods like JS divergence.

Why this answer

Feature monitoring in Vertex AI Feature Store uses monitoring_config with drift detection via statistical tests like Jensen-Shannon divergence. The correct approach is to set the drift threshold in the feature view's monitoring configuration.
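To make the 0.1 threshold concrete, here is a minimal sketch of Jensen-Shannon divergence over binned feature counts, in plain Python. It illustrates the statistic only and is not how the Feature Store computes it internally; the function names, the base-2 logarithm, and the sample histograms are assumptions made for this example.

```python
import math


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base-2 logs) between two histograms."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must have the same number of bins")
    # Normalise raw counts into probabilities.
    p_total, q_total = sum(p_counts), sum(q_counts)
    p = [x / p_total for x in p_counts]
    q = [x / q_total for x in q_counts]
    # Mixture distribution used by both KL terms.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Bins with zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


DRIFT_THRESHOLD = 0.1  # threshold from the question


def has_drifted(baseline_counts, current_counts, threshold=DRIFT_THRESHOLD):
    """Return True when the divergence exceeds the alerting threshold."""
    return js_divergence(baseline_counts, current_counts) > threshold


baseline = [50, 30, 20]  # hypothetical binned counts at training time
shifted = [10, 20, 70]   # hypothetical binned counts from today's data
```

With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (no overlap), which is what makes a fixed threshold such as 0.1 meaningful across features.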

Practice this question →

48

MCQmedium

A team uses Vertex AI Workbench managed notebooks. They want to version control their notebook files and collaborate using Git. What is the best way to integrate Git?

A.Use Cloud Source Repositories only

B.Use the built-in Git integration in Vertex AI Workbench managed notebooks

C.Use gcloud source repos clone inside the terminal

D.Manually download notebooks and upload to GitHub via browser

AnswerB

Direct integration simplifies version control.

Why this answer

Vertex AI Workbench managed notebooks support direct Git integration via the user interface, allowing clone, commit, and push operations.

Practice this question →

49

MCQmedium

A data scientist needs to retrieve training data from Vertex AI Feature Store that exactly matches the feature values as they were at a specific historical timestamp to avoid label leakage. Which feature view configuration should they use?

A.Enable point-in-time retrieval on the feature view.

B.Use the offline store without point-in-time and rely on data ordering.

C.Use the online store with a timestamp filter.

D.Create a new feature view with only historical data.

AnswerA

Point-in-time retrieval ensures no future data leaks.

Why this answer

Point-in-time retrieval is a feature of Vertex AI Feature Store that returns feature values as of a specified timestamp.

Practice this question →

50

MCQmedium

A team is using Delta Lake on Dataproc for their data lake with ACID transactions. They want to version data for ML experiments and roll back to a previous version if needed. Which Delta Lake feature should they use?

A.Delta Lake streaming

B.Delta Lake schema enforcement

C.Delta Lake time travel

D.Delta Lake optimization (Z-order)

AnswerC

Time travel allows querying historical versions.

Why this answer

Delta Lake provides time travel via VERSION AS OF or TIMESTAMP AS OF.
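Time travel works because Delta Lake keeps an ordered transaction log of file additions and removals, so any earlier version can be rebuilt by replaying the log up to that commit. The sketch below is a toy model of that idea in plain Python, not the Delta Lake API; the log layout, file names, and `files_at_version` are hypothetical.

```python
def files_at_version(log, version):
    """Replay commits 0..version and return the set of live data files."""
    if not 0 <= version < len(log):
        raise ValueError(f"version {version} is not in the log")
    live = set()
    for commit in log[: version + 1]:
        # Each commit may remove old files and add new ones.
        live -= set(commit.get("remove", []))
        live |= set(commit.get("add", []))
    return live


# Hypothetical log: version 2 rewrites part-000 into part-002.
log = [
    {"add": ["part-000.parquet"]},
    {"add": ["part-001.parquet"]},
    {"add": ["part-002.parquet"], "remove": ["part-000.parquet"]},
]

version_1_files = files_at_version(log, 1)
latest_files = files_at_version(log, len(log) - 1)
```

Reading `version_1_files` corresponds to a VERSION AS OF 1 query: the removed file is still part of that older view because the log, not the current directory listing, decides what is visible.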

Practice this question →

51

MCQeasy

What is the primary benefit of using a centralised model registry in MLOps?

A.Governance and version control of models

B.Better hyperparameter tuning

C.Faster model training

D.Automatic model deployment

AnswerA

Registry manages model versions, metadata, and aliases.

Why this answer

A centralised model registry provides governance, versioning, and lineage tracking, enabling collaboration and auditability.

Practice this question →

52

MCQmedium

A data science team wants to version control their datasets along with code using Git. They need a tool that integrates with Git and tracks changes to large data files. Which tool should they use?

A.BigQuery table snapshots

B.Delta Lake

C.Git LFS

D.DVC

AnswerD

DVC integrates with Git and manages data versioning, pipelines, and experiments.

Why this answer

DVC (Data Version Control) is designed to version large datasets and models alongside Git, storing data in remote storage and metadata in Git.
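The split between "data in remote storage" and "metadata in Git" can be sketched as content-addressed storage: the file is stored under its hash, and only a small pointer is committed. This is a simplified illustration in plain Python, not DVC's actual implementation or file format; `snapshot`, `restore`, the cache layout, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(data_path, cache_dir):
    """Copy a file into a hash-addressed cache and return a small pointer."""
    data = Path(data_path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    target = Path(cache_dir) / digest[:2] / digest[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        target.write_bytes(data)
    # Only this pointer would be committed to Git.
    return {"path": Path(data_path).name, "sha256": digest}


def restore(pointer, cache_dir):
    """Return the bytes a pointer refers to."""
    digest = pointer["sha256"]
    return (Path(cache_dir) / digest[:2] / digest[2:]).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "train.csv"
    cache = Path(tmp) / "cache"
    data_file.write_bytes(b"id,label\n1,0\n")
    pointer_v1 = snapshot(data_file, cache)
    data_file.write_bytes(b"id,label\n1,0\n2,1\n")
    pointer_v2 = snapshot(data_file, cache)
    restored_v1 = restore(pointer_v1, cache)
```

Checking out an older Git commit brings back the older pointer, and the pointer is enough to fetch the matching data, which is why the dataset and the code stay in step.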

Practice this question →

53

MCQhard

An organization needs to implement MLOps with standardized pipeline templates across multiple teams. Which Vertex AI feature should they use to create reusable pipeline components?

A.Vertex AI Pipelines

B.Vertex AI Experiments

C.Vertex AI Metadata

D.Vertex AI Workbench

AnswerA

Pipelines support reusable components and templates.

Why this answer

Vertex AI Pipelines allows creation of reusable pipeline components and templates that can be shared across teams, enabling standardization.

Practice this question →

54

MCQmedium

A team uses Vertex AI Workbench notebooks for collaborative model development. They want to ensure that code changes are version-controlled, that multiple data scientists can work on the same notebook without conflicts, and that the environment is reproducible across team members. Which approach should they take?

A.Use a shared JupyterLab instance launched on a single VM; data scientists connect simultaneously.

B.Use Vertex AI Workbench managed notebooks with Git integration and a custom container image for environment reproducibility.

C.Store notebooks in Cloud Storage and share the bucket; each user edits their own copy.

D.Use Vertex AI Pipelines to run all code as pipelines; data scientists only view results in notebooks.

AnswerB

Git integration provides version control; custom containers ensure consistent environments across team members.

Why this answer

Vertex AI Workbench managed notebooks support Git integration for version control, and a custom container image (Docker) gives every team member the same environment. Each data scientist clones the same repository, makes changes in isolation, and merges code through Git.

User-managed notebooks also support Git and custom containers, so what matters in this answer is the combination of Git and a shared container image.

Practice this question →

55

MCQmedium

An ML team wants to implement data versioning for large datasets stored in Google Cloud Storage. They need to track changes over time and reproduce previous data states. Which tool is most appropriate?

A.Cloud Storage Object Versioning

B.BigQuery table snapshots

C.Git LFS

D.DVC

AnswerD

DVC provides data versioning and pipeline reproducibility.

Why this answer

DVC (Data Version Control) is designed specifically for versioning large datasets and ML models, working with cloud storage.

Practice this question →

56

Multi-Selecteasy

Which THREE are valid uses of Vertex AI Metadata?

Select 3 answers

A.Record the execution of a pipeline step

B.Track which dataset was used to train a model

C.Query upstream sources of a model

D.Deploy a model to an endpoint

E.Store hyperparameter values of an experiment run

AnswersA, B, C

Executions are tracked in Metadata.

Why this answer

Vertex AI Metadata tracks lineage of artefacts, executions, and contexts across pipelines.

Practice this question →

57

MCQmedium

A team is building a fraud detection model that requires joining real-time transaction features with historical user features. They need to ensure that the training data does not use future information (data leakage). Which Vertex AI Feature Store capability should they use?

A.Online store serving with Bigtable

B.Feature store time travel

C.Point-in-time correct join

D.Feature monitoring for drift

AnswerC

This ensures historical consistency between features and labels.

Why this answer

Point-in-time correct join ensures that at training time, each row is joined with the feature value as it existed at that specific point in time, preventing leakage from future feature values.
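The join rule can be sketched in a few lines: for each label row, take the most recent feature value whose timestamp is not later than the label's timestamp. This is a plain-Python illustration of the idea, not the Feature Store API; the function names and sample data are hypothetical, and timestamps are simple integers for brevity.

```python
from bisect import bisect_right


def point_in_time_value(history, as_of):
    """Return the latest value with timestamp <= as_of, or None.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx > 0 else None


def build_training_rows(labels, feature_history):
    """Join each (entity, label_ts, label) with the feature value at label_ts."""
    rows = []
    for entity_id, label_ts, label in labels:
        history = feature_history.get(entity_id, [])
        rows.append((entity_id, label_ts, point_in_time_value(history, label_ts), label))
    return rows


# Hypothetical feature history: the value written at t=9 lies in the future
# of a label observed at t=6 and must not be used for it.
feature_history = {"user_1": [(1, 10.0), (5, 12.5), (9, 99.0)]}
labels = [("user_1", 6, 0), ("user_1", 0, 1)]
training_rows = build_training_rows(labels, feature_history)
```

A naive join on entity ID alone would attach the latest value (99.0) to every row, which is exactly the leakage the question describes.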

Practice this question →

58

Multi-Selecthard

A team uses Vertex AI Pipelines and wants to track lineage of artifacts and executions. Which three resources should they use? (Choose three.)

Select 3 answers

A.Artifacts

B.Vertex AI Experiments

C.Vertex AI Metadata

D.Model Registry

E.Executions

AnswersA, C, E

Artifacts represent data or model outputs in the lineage graph.

Practice this question →

59

MCQeasy

A team of data scientists is collaborating on notebooks in Vertex AI Workbench. They need to use Git for version control and share notebooks with real-time editing. Which type of Workbench instance should they choose?

A.User-managed notebooks

B.Vertex AI Pipelines

C.Managed notebooks

D.Dataproc Jupyter notebooks

AnswerA

User-managed notebooks are JupyterLab instances with Git integration, which is the requirement this answer key prioritizes.

Why this answer

User-managed notebooks are full JupyterLab instances that support Git integration for version control. They are single-user instances, however, and do not provide real-time collaborative editing, so they meet only the Git half of the requirement. Managed notebooks support multi-user collaboration, but the answer key assumes user-managed notebooks are required for Git integration.

Practice this question →

60

MCQmedium

A data science team uses Vertex AI Experiments to track training runs. They want to automatically log parameters, metrics, and artifacts for all runs with minimal code changes. Which approach should they take?

A.Manually log each parameter and metric using `aiplatform.log_metrics()` after each training step.

B.Use MLflow autologging by calling `mlflow.autolog()` before the training code and wrap the training script with `mlflow.start_run()`.

C.Enable Vertex AI Experiments autologging by setting `autolog=True` in the experiment run context.

D.Use TensorBoard with tf.keras.callbacks.TensorBoard to log metrics.

AnswerB

MLflow autologging captures parameters, metrics, and artifacts automatically when used with Vertex AI Experiments.

Why this answer

Vertex AI Experiments supports autologging via the MLflow library. By wrapping the training code with mlflow.start_run() and enabling autolog, all parameters, metrics, and artifacts are captured automatically.

Practice this question →

61

MCQhard

An ML engineer trained a model and registered it in Vertex AI Model Registry. They want to assign the alias 'champion' to the best-performing version for production deployment. Which gcloud command should they use?

A.gcloud ai models versions describe --model=MODEL_ID --version=VERSION_ID

B.gcloud ai models upload --model-id=MODEL_ID --display-name=champion

C.gcloud ai endpoints deploy-model --model=MODEL_ID --alias=champion

D.gcloud ai models versions update --model=MODEL_ID --version=VERSION_ID --update-aliases=champion

AnswerD

This updates the version's aliases.

Why this answer

The correct command is 'gcloud ai models versions update' with the --update-aliases flag. This command updates a specific model version's aliases, allowing you to assign or remove aliases like 'champion' to denote the best-performing version for production. Option A only describes a version, option B uploads a new model, and option C deploys a model to an endpoint, none of which assign aliases to a version.

Practice this question →

62

MCQhard

A company uses Vertex AI Pipelines to orchestrate ML workflows. After a pipeline run, they want to query the lineage of a particular model artifact to find out which dataset and hyperparameters were used to produce it. Which API method should they use?

A.projects.locations.metadataStores.artifacts.queryArtifactLineageSubgraph

B.projects.locations.metadataStores.artifacts.get

C.projects.locations.metadataStores.contexts.addContextArtifactsAndExecutions

D.projects.locations.metadataStores.executions.queryExecutionInputsAndOutputs

AnswerA

Correct method to get lineage subgraph.

Why this answer

Vertex AI Metadata exposes lineage through the queryArtifactLineageSubgraph method, which returns the artifacts, executions, and events connected to a given artifact, both upstream and downstream.
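Conceptually, a lineage query is a graph traversal: start at the model artifact and walk input edges backwards through executions to the datasets and parameters that fed them. The sketch below shows that traversal in plain Python; it does not call the Metadata API, and the graph contents and `upstream_lineage` are hypothetical.

```python
from collections import deque


def upstream_lineage(node, inputs_of):
    """Return every node reachable by walking input edges backwards from `node`."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in inputs_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical lineage: artifacts and executions alternate along each path.
inputs_of = {
    "model": ["train_execution"],
    "train_execution": ["dataset", "hyperparameters"],
    "dataset": ["ingest_execution"],
    "ingest_execution": ["raw_data"],
}
lineage = upstream_lineage("model", inputs_of)
```

The result contains both the dataset and the hyperparameters that produced the model, which is the information the question asks the API to return.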

Practice this question →

63

Multi-Selecthard

A team is operationalizing a machine learning pipeline using Vertex AI. They want to automatically track experiment runs, log model parameters and metrics, and store model artifacts for reproducibility. They also need to capture lineage between pipeline components (e.g., which dataset and hyperparameter tuning job produced a model). Which TWO services should they use together to achieve this? (Choose two.)

Select 2 answers

A.Vertex AI Model Registry

B.Vertex AI Feature Store

C.Vertex AI Metadata

D.Vertex AI Experiments

E.Vertex AI Workbench

AnswersC, D

Captures lineage between pipeline components, artifacts, and executions.

Why this answer

Vertex AI Experiments provides experiment tracking (parameters, metrics, artifacts). Vertex AI Metadata captures lineage between pipeline components, artifacts, and executions. Together, they enable full reproducibility and traceability.

Practice this question →

64

MCQmedium

An ML engineer has a model trained in Vertex AI and wants to deploy it to an endpoint with autoscaling and traffic splitting for canary testing. They have the model artifact stored in Vertex AI Model Registry with alias 'champion'. What is the correct sequence of steps?

A.Upload model to registry, create endpoint, then deploy model to endpoint with traffic split.

B.Create endpoint, upload model to registry, then deploy model to endpoint with traffic split.

C.Create endpoint, deploy model directly from Cloud Storage, then add traffic split.

D.Upload model to registry, then create endpoint and deploy in one command using gcloud ai endpoints deploy-model.

AnswerA

Correct order: model first, then endpoint, then deploy.

Why this answer

The standard sequence: upload model to registry, create endpoint, deploy model to endpoint with traffic split.

Practice this question →

65

MCQmedium

A team is building ML pipelines with Vertex AI. They want to reuse standard pipeline components across teams and enforce governance. What approach should they take?

A.Use Vertex AI Pipelines with pre-built and custom components organized in a component registry.

B.Store pipeline definitions in a shared Cloud Storage bucket and copy them manually.

C.Use Cloud Composer to orchestrate ad-hoc scripts.

D.Have each team build their own pipelines independently.

AnswerA

Vertex AI component registry enables sharing and governance of reusable components.

Why this answer

Standardized pipeline templates (components) in Vertex AI Pipelines allow reuse and governance across teams.

Practice this question →

66

MCQhard

A company trains a model using features from Vertex AI Feature Store. They notice training-serving skew because the feature values used at training time differ from those served online. How should they address this?

A.Use the same online store for both training and serving

B.Disable caching in the online store

C.Enable feature monitoring to detect drift

D.Use point-in-time correct retrieval from the offline store for training data

AnswerD

This ensures training uses the exact feature values as of the label time, eliminating skew.

Why this answer

Point-in-time correct retrieval ensures that the feature values used in training correspond to the exact same point in time as the label, preventing data leakage and skew.

Practice this question →

67

Multi-Selectmedium

A company is implementing MLOps with Vertex AI. They need to ensure that only approved models can be deployed to production. Which TWO practices should they adopt?

Select 2 answers

A.Use a single endpoint for all models without versioning

B.Allow any user with Vertex AI User role to deploy models

C.Use Vertex AI Model Registry aliases (e.g., 'champion') to mark production-ready models

D.Store all models in Cloud Storage without registry

E.Enable Vertex AI Continuous Evaluation with manual approval gate

AnswersC, E

Aliases provide a clear designation of production models.

Why this answer

Using aliases to designate champion and enabling manual approval in Continuous Evaluation ensures governance.

Practice this question →

68

MCQmedium

A data scientist is using Vertex AI Experiments to track training runs. They want to automatically log all hyperparameters, metrics, and model artifacts without modifying their training code. Which approach should they use?

A.Use Cloud Logging to capture training logs and parse them into experiments

B.Manually log parameters and metrics using the Vertex AI SDK in the training script

C.Run training in Vertex AI Workbench and manually save metrics to a BigQuery table

D.Enable autologging with MLflow and use Vertex AI Experiments as the tracking server

AnswerD

Autologging automatically captures parameters, metrics, and artifacts without modifying training code.

Why this answer

Vertex AI Experiments supports autologging with MLflow integration. By using the MLflow tracking API or Vertex AI's autologging, parameters and metrics are captured without code changes.

Practice this question →

69

Multi-Selectmedium

A company wants to implement a central model governance strategy using Vertex AI. They need to track model lineage, store evaluation metrics, and manage model versions across teams. Which THREE Vertex AI services should they use? (Choose 3)

Select 3 answers

A.Vertex AI Metadata

B.Vertex AI Model Registry

C.Vertex AI Workbench

D.Vertex AI Experiments

E.Vertex AI Feature Store

AnswersA, B, D

Tracks artifact lineage across the ML lifecycle.

Why this answer

Vertex AI Metadata tracks lineage, Model Registry stores models and versions with evaluation metrics, and Experiments captures run parameters and metrics.

Practice this question →

70

MCQeasy

An ML engineer needs to deploy a model from Vertex AI Model Registry to an endpoint. The model has multiple versions. They want to designate one version as the 'champion' for production traffic. How should they do this?

A.Use the 'champion' alias

B.Set the version as default in the registry

C.Use the 'production' alias

D.Deploy the model directly to an endpoint without alias

AnswerA

The champion alias is used to designate the production version.

Why this answer

Vertex AI Model Registry supports aliases; assigning the 'champion' alias to a model version allows routing production traffic to that version.

Practice this question →

71

MCQhard

A team uses Vertex AI Metadata to track pipeline runs. They need to identify all artifacts that were generated by a particular pipeline execution. Which API method should they use?

A.List executions and then list artifacts separately

B.Use the lineage query API with the execution ID

C.Create a context and query executions

D.Query artifacts by filter on execution ID

AnswerB

The lineage query returns upstream/downstream artifacts for the given execution.

Why this answer

The lineage query API can retrieve upstream and downstream artifacts for a given execution, allowing tracing of artifacts from a specific pipeline run.

Practice this question →

72

Multi-Selecthard

A team wants to monitor features in Vertex AI Feature Store for drift. Which TWO configurations are required?

Select 2 answers

A.Enable feature monitoring on the feature view

B.Set up a BigQuery sink for monitoring results

C.Configure an alerting channel (e.g., email) for drift notifications

D.Create a Cloud Scheduler job to trigger monitoring

E.Deploy a monitoring model to an endpoint

AnswersA, C

Required to compute statistics and detect drift.

Why this answer

Enable feature monitoring on the feature view and configure an alerting channel (e.g., email, Pub/Sub) for drift notifications.

Practice this question →

73

MCQmedium

An organisation uses Delta Lake on Dataproc to manage a data lake for ML training. They need ACID transactions for concurrent reads and writes. Which file format does Delta Lake use as the underlying storage?

A.Apache Parquet

B.Apache ORC

C.CSV

D.Apache Avro

AnswerA

Delta Lake stores data in Parquet files with a transaction log for ACID transactions.

Why this answer

Delta Lake uses Parquet as the base file format, adding a transaction log for ACID properties. Avro and ORC are not used by Delta Lake, and CSV does not support ACID.

Practice this question →

74

MCQeasy

A team wants to enforce governance and compliance for all ML models across the organisation. They need a centralised repository that tracks model versions, deployment history, and evaluation metrics. Which service should they use?

A.Cloud Storage

B.Vertex AI Feature Store

C.Vertex AI Experiments

D.Vertex AI Model Registry

AnswerD

Model Registry provides centralised model governance with versioning, metrics, and deployment tracking.

Why this answer

Vertex AI Model Registry is the centralised repository for model versioning, lineage, metrics, and deployment management, providing governance and compliance.

Practice this question →

75

MCQmedium

A company uses Vertex AI Pipelines to train and deploy models. They want to automatically generate model documentation that includes model details, intended use, and evaluation results. What should they use?

A.Vertex AI Explanations

B.Vertex AI Metadata

C.Model Cards

D.Vertex AI Model Registry with custom metadata

AnswerC

Model Cards provide automated, standardized documentation.

Why this answer

Model Cards are a standardized format for model documentation, and Vertex AI supports automated generation of model cards.

Practice this question →

Page 1 of 2 · 79 questions total

CCNA Pmle Collaboration Data Questions — Page 1 of 2 | Courseiva