Free PMLE Automating and Orchestrating ML Pipelines Practice Questions (2026)

Q: How many Automating and Orchestrating ML Pipelines questions are on the PMLE exam?

The Automating and Orchestrating ML Pipelines domain is one of the weighted domains on the PMLE exam. The Courseiva question bank has 89 practice questions for this domain.

Q: How can I practice Automating and Orchestrating ML Pipelines questions for PMLE?

Click any of the 89 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30 or 50 questions drawn only from the Automating and Orchestrating ML Pipelines domain.

Practice Automating and Orchestrating ML Pipelines questions

All PMLE Automating and Orchestrating ML Pipelines questions (89)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A data scientist creates a custom Python function component for a Vertex AI pipeline using the Kubeflow Pipelines SDK v2. The component takes a string parameter 'input_text' and outputs a Metrics artifact. The scientist wants to include a lightweight Python function without building a container. Which code snippet correctly defines this component?

A machine learning engineer is building a Vertex AI pipeline that uses a pre-built Google Cloud Pipeline Components (GCPC) component to train a custom model. Which component should the engineer use to submit a custom training job to Vertex AI?

A team has a Vertex AI pipeline that includes a container component for data preprocessing. The team notices that the component is re-executed every time the pipeline runs, even when the inputs and code haven't changed. They want to leverage pipeline caching to avoid redundant executions. What should they do to enable caching for this component?

A machine learning team uses Vertex AI Pipelines to orchestrate their training pipeline. They want to trigger the pipeline automatically in response to new data arriving in a Cloud Storage bucket, and also support a scheduled run every day at 6 AM. Which combination of services should they use to achieve both event-driven and schedule-based triggers?

A company is using Vertex AI Pipelines to automate model retraining. They have a component that creates a BigQuery table with training data. To ensure idempotency, the component should check if the table already exists and recreate it if necessary. What is the best practice for passing data between pipeline components?

A data engineer wants to orchestrate a complex workflow that includes running a Vertex AI pipeline, then a BigQuery job, and finally a Dataflow pipeline. The workflow must handle dependencies, retries, and monitoring. Which Google Cloud service is most suitable for this orchestration?

A machine learning engineer is building a Vertex AI pipeline that uses a pre-built AutoML Tables component to train a classification model. The pipeline also includes a conditional step that deploys the model to an endpoint only if the evaluation metrics exceed a threshold. Which KFP feature should be used to implement the conditional deployment?

A team uses Cloud Build to automatically trigger a Vertex AI pipeline when changes are pushed to the model code repository. They have a cloudbuild.yaml file that builds a container image and submits the pipeline. However, they want to run the pipeline only if the commit includes changes to the 'training/' directory. Which Cloud Build configuration option should be used to filter the trigger?

A data scientist wants to create a Vertex AI pipeline component that uses a custom container image stored in Artifact Registry. The component should accept a dataset artifact as input and output a model artifact. Which component type should they use?

A machine learning engineer needs to pass a large dataset between two components in a Vertex AI pipeline. What is the recommended way to pass this data?

A team is using Vertex AI Pipelines to deploy a model. They have a component that evaluates the model and produces a ClassificationMetrics artifact. The pipeline should deploy the model only if the precision is greater than 0.9. They use dsl.If to check the metric. However, the condition always evaluates to False. What is the most likely cause?

A company wants to implement continuous training for their ML model. The pipeline should be triggered when new training data arrives in Cloud Storage, and after training, the model should be automatically deployed to a staging endpoint if evaluation metrics pass a threshold. They also need to detect skew between training data and serving data. Which two services should they use for skew detection?

A machine learning team uses Vertex AI Pipelines to run a multi-step training pipeline. They want to implement a continuous delivery (CD) process where a model is automatically promoted from staging to production only if it passes an evaluation gate. Which TWO actions should they include in their CI/CD pipeline? (Choose two.)

A company runs a Vertex AI pipeline that uses a container component to preprocess data. The component downloads a large file from a public URL and saves the output to Cloud Storage. The pipeline fails intermittently with a 'timeout' error. Which THREE steps should the team take to improve reliability? (Choose three.)

A data scientist is creating a Vertex AI pipeline using the Kubeflow Pipelines SDK v2. Which TWO statements about pipeline parameters are correct? (Choose two.)

A data science team wants to build a Vertex AI pipeline that trains a model, evaluates it, and conditionally deploys it if the accuracy exceeds 0.9. They want to use the Kubeflow Pipelines SDK v2. Which construct allows them to conditionally execute the deployment step based on the evaluation metric?

A machine learning engineer needs to schedule a Vertex AI pipeline to run daily at midnight. Which approach should they use?

You are building a Vertex AI pipeline using the KFP SDK v2. One component processes a large dataset and outputs a metrics artifact. You notice that the component is being cached even when the dataset changes, because the component code and image remain the same. How can you force the component to always re-execute when the dataset changes?

A team wants to implement continuous training for their ML model. The pipeline should be triggered when new training data arrives in a Cloud Storage bucket. Which combination of services should they use?

You are using KFP SDK v2 to define a pipeline. You need to pass a large dataset between components. What is the best practice for passing data?

A machine learning engineer wants to use a pre-built Google Cloud Pipeline Components (GCPC) component to train a model using Vertex AI. Which component should they use?

You are developing a Vertex AI pipeline that runs multiple parallel training jobs with different hyperparameters, then collects their results and selects the best model. Which KFP SDK v2 construct should you use to run the parallel training tasks?

A company uses Cloud Composer to orchestrate their ML workflows. They have an Airflow DAG that runs a Vertex AI pipeline, then a BigQuery query, then a Dataflow job. The DAG is failing because the Vertex AI pipeline takes longer than the Airflow task timeout. What is the best way to handle this?

You are defining a Python function component in KFP SDK v2. Which decorator should you use?

A team wants to implement continuous delivery for their ML models. They have a pipeline that trains a model and evaluates it. If the evaluation metrics exceed a threshold, the model should be deployed to a staging endpoint, and after manual approval, to production. Which approach should they use?

You are designing a Vertex AI pipeline that includes a container component. The component needs to use a custom container image that is stored in Artifact Registry. How should you specify the container image in the component definition?

You have a Vertex AI pipeline that trains a model and outputs a Model artifact. You want to register this model in the Vertex AI Model Registry. Which pre-built Google Cloud Pipeline Components component should you use?

You are building a CI/CD pipeline for an ML model using Cloud Build. When code is pushed to the main branch, you want to automatically build a training image, run a Vertex AI pipeline, and if the model evaluation passes, deploy it to a staging endpoint. Which two components are essential for this CI/CD pipeline?

You are tasked with building a robust ML pipeline that must be idempotent and handle data skew between training and serving. Which three practices should you implement?

You need to orchestrate a complex ML workflow that involves multiple Vertex AI pipelines, BigQuery jobs, and Dataflow pipelines. The workflow must handle dependencies, retries, and monitoring. Which two services are best suited for this orchestration?

A data science team wants to deploy an ML pipeline on Vertex AI Pipelines that includes a component to train a model using a custom container. The component should be reusable across different pipelines and accept hyperparameters as inputs. Which approach should they take?

An ML engineer is building a pipeline on Vertex AI Pipelines and wants to pass a dataset artifact from one component to another without incurring additional cost for intermediate storage. How should they define the input and output types?

A team runs a Vertex AI pipeline daily. They notice that a component that downloads a file from a public URL always executes even when the URL and parameters haven't changed. They want to avoid unnecessary re-execution and reduce costs. What should they do?

An organization wants to trigger a Vertex AI pipeline whenever new data arrives in a Cloud Storage bucket. Which approach should they use?

An ML engineer is using Cloud Composer (Airflow) to orchestrate an ML workflow. They need to run a Vertex AI pipeline as one of the tasks in the DAG. Which Airflow operator should they use?

A team has a pipeline that trains a model and then evaluates it. They want to conditionally deploy the model to a staging endpoint only if evaluation metrics exceed a threshold. Which KFP feature should they use?

What is the primary benefit of using pipeline caching in Vertex AI Pipelines?

An engineer needs to compile a Kubeflow Pipeline defined in Python to a JSON format that can be run on Vertex AI Pipelines. Which command should they use?

A company has a CI/CD pipeline that retrains a model every time new training data is available. They want to automatically deploy the new model to production only if it passes a set of evaluation tests on a staging environment. Which approach best implements this?

Which of the following is a best practice when designing idempotent pipeline components in Vertex AI?

An ML team wants to run a hyperparameter tuning job on Vertex AI using a pre-built pipeline component. Which component should they use?

What is the purpose of the 'importer' component in Vertex AI Pipelines?

Your organization wants to automate the retraining of a model when new data is available and also on a weekly schedule. Which TWO services would you use together to achieve this? (Choose two.)

A team is designing an ML pipeline that includes training, evaluation, and conditional deployment. They want to use Vertex AI Pipelines. Which THREE concepts should they use? (Choose three.)

An ML engineer is building a continuous training pipeline that retrains a model when new data arrives. The pipeline should also detect skew between training and serving data. Which TWO Google Cloud services should they use? (Choose two.)

A data scientist wants to define a lightweight Python function component in Vertex AI Pipelines using Kubeflow Pipelines SDK v2. Which decorator should be applied to the function to make it a pipeline component?

An ML engineer wants to containerize a custom training script and use it as a component in a Vertex AI Pipeline. The component should accept a dataset URI and a learning rate parameter, and output a trained model artifact. Which approach should the engineer use to define the component?

A team notices that a Vertex AI Pipeline step re-executes every time the pipeline runs, even though its inputs and code have not changed. They want to enable caching for this component to avoid redundant computation. However, caching is currently disabled globally. Which configuration change will enable caching for that specific component?

An ML engineer needs to trigger a Vertex AI Pipeline on a recurring schedule, every 24 hours, to retrain a model with the latest data. Which approach should they use to set up this schedule?

A company wants to implement continuous delivery (CD) for ML models, where a model is automatically deployed to a staging environment and only promoted to production after passing an evaluation gate. Which combination of GCP services is BEST suited for orchestrating this CD pipeline?

An ML engineer is building a pipeline that includes a step to run a BigQuery query and pass the results to the next step. They want to use a pre-built Google Cloud Pipeline Component for BigQuery. Which component should they use to execute a query and output the results to a destination table?

A Vertex AI Pipeline contains a task that may produce outputs that are not always needed. The engineer wants to conditionally execute downstream tasks only if a specific artifact is produced. Which KFP SDK v2 construct allows the engineer to implement this conditional execution?

An organization wants to use Cloud Composer (Airflow) to orchestrate a machine learning workflow that includes running a Vertex AI Pipeline, followed by a BigQuery job, and then a Dataflow pipeline. What is the primary advantage of using Cloud Composer for this orchestration?

An ML engineer is designing a pipeline that should run only when new training data arrives in a Cloud Storage bucket. Which event-driven approach should they use to trigger the Vertex AI Pipeline?

A team is implementing CI/CD for ML using Cloud Build. They want to trigger a training pipeline in Vertex AI whenever new model code is pushed to the main branch of the repository. Which Cloud Build configuration should they use to achieve this?

In a Vertex AI Pipeline, a component produces a Metrics artifact that includes an evaluation metric. The engineer wants to use this metric value as a condition to decide whether to deploy the model. However, the metric value is stored in the artifact's metadata and not directly as a pipeline parameter. How can the engineer pass the metric value to a downstream conditional task?

An ML engineer is building a pipeline component that takes a dataset URI and a model URI as inputs, and outputs a classification metrics artifact. Which KFP SDK v2 type should the output artifact be annotated with?

An organization is building a continuous training (CT) pipeline that retrains a model whenever one of the following conditions is met: (1) new training data is available in Cloud Storage, (2) it's the first day of the month, or (3) the model's performance degrades below a threshold. Which TWO mechanisms should they combine to trigger the pipeline?

A company is using Vertex AI Pipelines for ML workflows. They want to implement best practices for idempotent components and data passing. Which THREE practices should they adopt?

An ML engineer is creating a Vertex AI Pipeline that includes a loop to train multiple models in parallel on different hyperparameter sets. Which TWO KFP SDK v2 constructs can be used to implement this parallel execution?

A data science team wants to build a machine learning pipeline on Vertex AI Pipelines that preprocesses data, trains a model, and evaluates it. They need to ensure that components can be reused across multiple pipelines and that outputs from one component can be passed as inputs to another. Which approach should they take?

A machine learning engineer has a Vertex AI pipeline that trains a model. The pipeline uses caching to avoid re-running components that have not changed. After updating the training code, the engineer notices that the pipeline still uses cached outputs from the previous run. What could be the reason?

A team is building a CI/CD pipeline for an ML model. They want to automatically trigger a Vertex AI pipeline for retraining whenever new training data arrives in a Cloud Storage bucket, but only if a specific Pub/Sub notification is published by a data ingestion process. Which approach meets these requirements with minimal operational overhead?

A machine learning engineer is using Vertex AI Pipelines and wants to run a custom Python function as a component. They need to pass a dataset artifact from a previous component and output a model artifact. Which decorator should they use to define the component in the Kubeflow Pipelines SDK v2?

A company uses Cloud Composer to orchestrate a nightly ML workflow that includes running a Vertex AI pipeline, querying BigQuery, and running a Dataflow job. The Airflow DAG must run only if the previous day's Dataflow job succeeded. Which Airflow concept should they use to implement this dependency?

A machine learning team wants to implement a continuous delivery pipeline for their ML models using Vertex AI Pipelines. The pipeline should automatically deploy a model to a staging endpoint after evaluation passes, and then after manual approval, promote it to production. Which strategy should they use to manage model versions in the Vertex AI Model Registry?

A data scientist is defining a Vertex AI pipeline and needs to include a step that imports a pre-existing model from Cloud Storage into the pipeline as an artifact. Which Kubeflow Pipelines SDK v2 component should they use?

A company uses Vertex AI Pipelines to train models on a daily schedule. The pipeline includes a component that runs a BigQuery query to extract features. The team wants to ensure that if the BigQuery component fails due to transient network errors, the pipeline automatically retries it. How can they configure retries in Vertex AI Pipelines?

A machine learning engineer needs to create a pipeline that runs a custom container component on Vertex AI. The container expects a Cloud Storage path as input and outputs a model artifact. Which component type should they define using the Kubeflow Pipelines SDK v2?

A team is building a continuous training pipeline that retrains a model when new data arrives. They want to detect data drift between the training dataset and the serving data. Which approach should they integrate into the pipeline to compare the distributions of the two datasets?

An organization wants to trigger a Vertex AI pipeline whenever a new commit is pushed to the main branch of their Cloud Source Repository. The pipeline should retrain and evaluate the model. Which service should they use to detect the push event and start the pipeline?

A machine learning engineer is building a pipeline with Vertex AI Pipelines and wants to pass a large dataset between components without copying it to the container's memory. What is the best practice for passing data between pipeline components?

A data science team is designing a Vertex AI pipeline that includes a loop over a list of hyperparameter sets. They want to run training jobs in parallel for each hyperparameter set and then collect the results for comparison. Which two Kubeflow Pipelines SDK v2 features should they use? (Choose two.)

A company wants to implement a CI/CD pipeline for their ML models using Vertex AI. They need to automatically retrain the model when new data arrives, but only if the model performance on a validation set has degraded by more than 5% compared to the current production model. Which three services or components should they incorporate into the automated pipeline? (Choose three.)

A machine learning team uses Vertex AI Pipelines for model training. They want to implement a conditional step that runs additional evaluation if the model accuracy exceeds 0.9, otherwise it runs a data augmentation component. Which two Kubeflow Pipelines SDK v2 constructs can they use to achieve this? (Choose two.)

A machine learning engineer wants to define a lightweight pipeline component that runs custom Python code without building a container image. Which KFP SDK feature should they use?

An organization runs a Vertex AI pipeline that includes a model evaluation step. Team members want to reuse previously computed evaluation metrics when re-running the pipeline with unchanged code and hyperparameters. Which feature should they enable?

A data science team uses Cloud Composer to orchestrate ML workflows. They need to trigger a Vertex AI pipeline after a BigQuery data load completes, and then run a Dataflow job. Which Airflow operator should they use to launch the Vertex AI pipeline?

A machine learning pipeline includes a conditional branch: if model accuracy exceeds 0.95, deploy to production; otherwise, send a notification. Which KFP SDK feature allows implementing this logic within the pipeline definition?

A company wants to automatically retrain their model every night at 2 AM using Vertex AI Pipelines. Which approach should they use to trigger the pipeline on a schedule?

A team develops a pipeline that trains a model and evaluates it. They want to pass the test accuracy (a float) from the evaluation component to a subsequent deployment component. Which KFP SDK type should the evaluation component output be annotated with?

An ML pipeline runs on Vertex AI and includes a component that uses a third-party library not available in the default Python environment. The team wants to avoid building a custom container image. Which approach should they use?

A company uses Vertex AI Pipelines for ML training. They want to implement continuous training triggered by new data arrival. Which two Google Cloud services should they use to achieve this? (Choose two.)

A team's pipeline uses the Google Cloud Pipeline Components (GCPC) library to perform AutoML training and batch prediction. Which two components from the GCPC library should they use? (Choose two.)

An ML pipeline must run a set of preprocessing tasks for each data shard in parallel. Which KFP SDK features should the team use to implement this? (Choose two.)

A team wants to implement CI/CD for their ML pipeline using Cloud Build. They want to automatically compile and deploy the pipeline when code is pushed to the main branch. Which three steps should they include in the Cloud Build configuration? (Choose three.)

A data science team uses Cloud Composer to orchestrate a complex ML workflow. They need to run a Vertex AI pipeline and then a BigQuery query conditionally based on the pipeline's output. Which Airflow features should they use? (Choose two.)

A pipeline includes a component that produces a model artifact. The team wants to automatically detect skew between the training data distribution and the serving data distribution. Which three best practices should they implement? (Choose three.)

An organization wants to implement continuous delivery for their ML model. After a new model is trained and evaluated, they want to automatically deploy it to a staging endpoint, run validation tests, and if passed, promote to production. Which two components should they include in their delivery pipeline? (Choose two.)

Practice all 89 Automating and Orchestrating ML Pipelines questions

Other PMLE exam domains

Collaborating Within and Across Teams to Manage Data and Models

Serving and Scaling Models

Monitoring ML Solutions

Architecting Low-Code ML Solutions

Scaling Prototypes into ML Models

Collaborating to manage data and models

Solving business challenges with ML

Frequently asked questions

What does the Automating and Orchestrating ML Pipelines domain cover on the PMLE exam?

The Automating and Orchestrating ML Pipelines domain covers the key concepts tested in this area of the PMLE exam blueprint published by Google Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PMLE domains — no account required.

How many Automating and Orchestrating ML Pipelines questions are in the PMLE question bank?

The Courseiva PMLE question bank contains 89 questions in the Automating and Orchestrating ML Pipelines domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Automating and Orchestrating ML Pipelines for PMLE?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Automating and Orchestrating ML Pipelines questions for PMLE?

Yes — the session launcher on this page draws questions exclusively from the Automating and Orchestrating ML Pipelines domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
