PMLE · topic practice

Automating and Orchestrating ML Pipelines practice questions

Practise Google Professional Machine Learning Engineer (PMLE) questions on Automating and Orchestrating ML Pipelines: original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security
20 questions · Domain: Automating and Orchestrating ML Pipelines

What the exam tests

What to know about Automating and Orchestrating ML Pipelines

Automating and Orchestrating ML Pipelines questions test whether you can apply the concept in context, not just recognise a definition.

  • How the topic appears in realistic exam-style scenarios.
  • Which detail in the question changes the correct answer.
  • How to eliminate plausible but wrong options.
  • How to connect the question back to the wider exam objective.

Watch out for

Common Automating and Orchestrating ML Pipelines exam traps

  • Answering from memory before reading the full scenario.
  • Missing a constraint such as cost, availability, security, scope or command context.
  • Choosing a broad answer when the question asks for the most specific fix.
  • Ignoring why the wrong options are tempting.

Practice set

Automating and Orchestrating ML Pipelines questions

20 questions · select your answer, then reveal the explanation

Question 1 · medium · multiple choice
A data scientist creates a custom Python function component for a Vertex AI pipeline using the Kubeflow Pipelines SDK v2. The component takes a string parameter 'input_text' and outputs a Metrics artifact. The scientist wants to include a lightweight Python function without building a container. Which code snippet correctly defines this component?

A machine learning engineer is building a Vertex AI pipeline that uses a pre-built Google Cloud Pipeline Components (GCPC) component to train a custom model. Which component should the engineer use to submit a custom training job to Vertex AI?

A team has a Vertex AI pipeline that includes a container component for data preprocessing. The team notices that the component is re-executed every time the pipeline runs, even when the inputs and code haven't changed. They want to leverage pipeline caching to avoid redundant executions. What should they do to enable caching for this component?

A machine learning team uses Vertex AI Pipelines to orchestrate their training pipeline. They want to trigger the pipeline automatically in response to new data arriving in a Cloud Storage bucket, and also support a scheduled run every day at 6 AM. Which combination of services should they use to achieve both event-driven and schedule-based triggers?

A company is using Vertex AI Pipelines to automate model retraining. They have a component that creates a BigQuery table with training data. To ensure idempotency, the component should check if the table already exists and recreate it if necessary. What is the best practice for passing data between pipeline components?

A data engineer wants to orchestrate a complex workflow that includes running a Vertex AI pipeline, then a BigQuery job, and finally a Dataflow pipeline. The workflow must handle dependencies, retries, and monitoring. Which Google Cloud service is most suitable for this orchestration?

A machine learning engineer is building a Vertex AI pipeline that uses a pre-built AutoML Tables component to train a classification model. The pipeline also includes a conditional step that deploys the model to an endpoint only if the evaluation metrics exceed a threshold. Which KFP feature should be used to implement the conditional deployment?

A team uses Cloud Build to automatically trigger a Vertex AI pipeline when changes are pushed to the model code repository. They have a cloudbuild.yaml file that builds a container image and submits the pipeline. However, they want to run the pipeline only if the commit includes changes to the 'training/' directory. Which Cloud Build configuration option should be used to filter the trigger?

A data scientist wants to create a Vertex AI pipeline component that uses a custom container image stored in Artifact Registry. The component should accept a dataset artifact as input and output a model artifact. Which component type should they use?

A machine learning engineer needs to pass a large dataset between two components in a Vertex AI pipeline. What is the recommended way to pass this data?

A team is using Vertex AI Pipelines to deploy a model. They have a component that evaluates the model and produces a ClassificationMetrics artifact. The pipeline should deploy the model only if the precision is greater than 0.9. They use dsl.If to check the metric. However, the condition always evaluates to False. What is the most likely cause?

A company wants to implement continuous training for their ML model. The pipeline should be triggered when new training data arrives in Cloud Storage, and after training, the model should be automatically deployed to a staging endpoint if evaluation metrics pass a threshold. They also need to detect skew between training data and serving data. Which two services should they use for skew detection?

A machine learning team uses Vertex AI Pipelines to run a multi-step training pipeline. They want to implement a continuous delivery (CD) process where a model is automatically promoted from staging to production only if it passes an evaluation gate. Which TWO actions should they include in their CI/CD pipeline? (Choose two.)

A company runs a Vertex AI pipeline that uses a container component to preprocess data. The component downloads a large file from a public URL and saves the output to Cloud Storage. The pipeline fails intermittently with a 'timeout' error. Which THREE steps should the team take to improve reliability? (Choose three.)

A data scientist is creating a Vertex AI pipeline using the Kubeflow Pipelines SDK v2. Which TWO statements about pipeline parameters are correct? (Choose two.)

A data science team wants to build a Vertex AI pipeline that trains a model, evaluates it, and conditionally deploys it if the accuracy exceeds 0.9. They want to use the Kubeflow Pipelines SDK v2. Which construct allows them to conditionally execute the deployment step based on the evaluation metric?

A machine learning engineer needs to schedule a Vertex AI pipeline to run daily at midnight. Which approach should they use?

You are building a Vertex AI pipeline using the KFP SDK v2. One component processes a large dataset and outputs a metrics artifact. You notice that the component is being cached even when the dataset changes, because the component code and image remain the same. How can you force the component to always re-execute when the dataset changes?

A team wants to implement continuous training for their ML model. The pipeline should be triggered when new training data arrives in a Cloud Storage bucket. Which combination of services should they use?

You are using KFP SDK v2 to define a pipeline. You need to pass a large dataset between components. What is the best practice for passing data?
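
Several of the scenarios above turn on when a pipeline step is reused from cache and when it runs again. The sketch below is a toy in-memory model of that idea in plain Python, assuming a cache key derived from the component specification plus its resolved inputs. It is not the Vertex AI Pipelines implementation, and `StepCache`, `cache_key`, the image name, and the bucket paths are all hypothetical names chosen for illustration.

```python
import hashlib
import json


def cache_key(component_spec: dict, inputs: dict) -> str:
    """Hash the component spec together with its resolved inputs (assumed model)."""
    payload = json.dumps({"spec": component_spec, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StepCache:
    """Toy stand-in for a step-level cache; counts real executions."""

    def __init__(self):
        self._store = {}
        self.executions = 0

    def run(self, component_spec, inputs, fn, enable_caching=True):
        key = cache_key(component_spec, inputs)
        if enable_caching and key in self._store:
            return self._store[key]  # cache hit: the step body is skipped
        self.executions += 1
        result = fn(**inputs)
        self._store[key] = result
        return result


def preprocess(dataset_uri: str) -> str:
    # Hypothetical step body.
    return f"processed:{dataset_uri}"


# Hypothetical component spec: same image and command on every run.
spec = {"image": "example/preprocess:1.0", "command": ["python", "preprocess.py"]}

cache = StepCache()
cache.run(spec, {"dataset_uri": "gs://example-bucket/data/v1"}, preprocess)  # executes
cache.run(spec, {"dataset_uri": "gs://example-bucket/data/v1"}, preprocess)  # reused
cache.run(spec, {"dataset_uri": "gs://example-bucket/data/v2"}, preprocess)  # executes
print(cache.executions)
```

In this model, a step whose spec and inputs are unchanged is reused, and a step whose input value changes gets a new key and runs again. If the data behind an unchanged path is overwritten, the inputs look identical and the key stays the same, which is the situation the caching scenarios describe. Passing `enable_caching=False` in this sketch forces the step to run regardless of the key.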

Focused Automating and Orchestrating ML Pipelines sessions

Start an Automating and Orchestrating ML Pipelines-only practice session

Every question in these sessions is drawn from the Automating and Orchestrating ML Pipelines domain — nothing else.

Related practice questions

Related PMLE topic practice pages

Move into related areas when this topic feels solid.

Frequently asked questions

What does the PMLE exam test about Automating and Orchestrating ML Pipelines?
Automating and Orchestrating ML Pipelines questions test whether you can apply the concept in context, not just recognise a definition.
How should I use these practice questions?
Select your answer before revealing the explanation. Then read why each option is right or wrong — this active recall approach builds retention far faster than re-reading notes.
Can I practise just Automating and Orchestrating ML Pipelines questions in a focused session?
Yes — the session launcher on this page draws every question from the Automating and Orchestrating ML Pipelines domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.
Where can I practise other PMLE topics?
Use the topic links above to move to related areas, or go back to the PMLE question bank to see all topics.
Are these real exam questions or dumps?
These are original practice questions written to test the same concepts the PMLE exam covers. They are not copied from any real exam or dump site.