Scenario-based practice

Hard Difficulty Questions

Practise Google Cloud Generative AI Leader questions: original exam-style scenarios covering every exam domain, with detailed explanations, wrong-answer analysis, and common exam traps.

20 scenario questions
Exam code: Generative AI Leader
Vendor: Google Cloud

Scenario guide

How to approach hard difficulty questions

These are the questions most candidates get wrong. They require connecting multiple concepts, reading tricky output, or knowing edge-case behaviour that isn't on most study cards. Practising them trains you to operate under uncertainty, a necessary skill on the real exam.

Quick answer

Hard difficulty questions test whether you can apply a concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Practice set

Practice scenarios

Question 1 · hard · multi select
Full question →

Which THREE considerations are critical when deploying a generative AI model using Vertex AI Endpoints for a latency-sensitive application? (Choose THREE.)

Question 2 · hard · multiple choice
Full question →

A company is deploying a generative AI model for customer support. They want to reduce hallucinations while maintaining fluency. They have a large dataset of previous support conversations. Which strategy should they prioritize?

Question 3 · hard · multi select
Full question →

A company is considering monetizing a generative AI-powered product. Which TWO business models are most common and viable?

Question 4 · hard · multiple choice
Full question →

An organization uses a fine-tuned model for medical diagnosis and must comply with HIPAA. Which measure is essential when deploying the model on Vertex AI?

Question 5 · hard · multiple choice
Full question →

Refer to the exhibit. A user with this IAM role tries to deploy a model to a Vertex AI Endpoint but fails. What is the most likely reason?

Exhibit

{
  "bindings": [
    {
      "role": "roles/aiplatform.user",
      "members": [
        "user:user@example.com"
      ]
    }
  ]
}

Question 6 · hard · multi select
Full question →

A company is fine-tuning a Gemma model using Vertex AI. They observe that the model overfits. Which TWO actions should they take to mitigate overfitting?

Question 7 · hard · multi select
Full question →

Which THREE of the following are potential risks when deploying generative AI?

Question 8 · hard · multiple choice
Full question →

A company is deploying a chatbot that uses a foundation model. They want to minimize latency for user queries. Which action is most effective?

Question 9 · hard · multiple choice
Full question →

A research team is training a large language model from scratch using TPUs on Google Cloud. Which storage solution provides the highest throughput for training data?

Question 10 · hard · multiple choice
Full question →

A company has a large dataset of proprietary documents and wants to build a Q&A system using a foundation model without exposing the documents to the model. Which approach is most appropriate?

Question 11 · hard · multiple choice
Full question →

Refer to the exhibit. An administrator creates this IAM policy for a Vertex AI project. What is the effect of this policy?

Exhibit

{
  "bindings": [
    {
      "role": "roles/aiplatform.user",
      "members": ["user:alice@example.com"]
    },
    {
      "role": "roles/aiplatform.customCodeModelAdmin",
      "members": ["user:bob@example.com"]
    }
  ]
}
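
Reading an IAM exhibit like this one usually starts with working out which member holds which role. The following is a minimal sketch of that step only: it inverts the exhibit's bindings into a member-to-roles map. The helper name `roles_by_member` is hypothetical, and the sketch says nothing about which permissions each role contains.

```python
import json
from collections import defaultdict
from typing import Dict, List

# The policy from the exhibit, copied as-is.
POLICY_JSON = """
{
  "bindings": [
    {"role": "roles/aiplatform.user",
     "members": ["user:alice@example.com"]},
    {"role": "roles/aiplatform.customCodeModelAdmin",
     "members": ["user:bob@example.com"]}
  ]
}
"""


def roles_by_member(policy: dict) -> Dict[str, List[str]]:
    """Invert a policy's bindings into member -> sorted list of roles.

    Hypothetical helper for illustration; it only reorganises the JSON.
    """
    collected = defaultdict(set)
    for binding in policy.get("bindings", []):
        for member in binding.get("members", []):
            collected[member].add(binding["role"])
    return {member: sorted(roles) for member, roles in collected.items()}


grants = roles_by_member(json.loads(POLICY_JSON))
for member, roles in sorted(grants.items()):
    print(member, "->", ", ".join(roles))
```

Each member appears in exactly one binding here, so the question turns on what each role permits rather than on any overlap between bindings.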

Question 12 · hard · multiple choice
Full question →

Refer to the exhibit. This JSON describes a Vertex AI endpoint with a deployed model. Which statement about scaling is true?

Exhibit

{
  "name": "projects/my-project/locations/us-central1/endpoints/123456",
  "displayName": "my-endpoint",
  "deployedModels": [
    {
      "id": "789",
      "model": "projects/my-project/locations/us-central1/models/456",
      "dedicatedResources": {
        "machineSpec": {
          "machineType": "n1-standard-2",
          "acceleratorType": "NVIDIA_TESLA_T4",
          "acceleratorCount": 1
        },
        "minReplicaCount": 1,
        "maxReplicaCount": 3
      },
      "automaticResources": null
    }
  ]
}
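
The scaling-related fields in this exhibit are `minReplicaCount` and `maxReplicaCount` under `dedicatedResources`. The sketch below simply extracts those bounds from the exhibit. The helper name `replica_bounds` is hypothetical, and treating a missing `maxReplicaCount` as equal to the minimum is an assumption made for this sketch.

```python
import json
from typing import Optional, Tuple

# The endpoint description from the exhibit, copied as-is.
ENDPOINT_JSON = """
{
  "name": "projects/my-project/locations/us-central1/endpoints/123456",
  "displayName": "my-endpoint",
  "deployedModels": [
    {
      "id": "789",
      "model": "projects/my-project/locations/us-central1/models/456",
      "dedicatedResources": {
        "machineSpec": {
          "machineType": "n1-standard-2",
          "acceleratorType": "NVIDIA_TESLA_T4",
          "acceleratorCount": 1
        },
        "minReplicaCount": 1,
        "maxReplicaCount": 3
      },
      "automaticResources": null
    }
  ]
}
"""


def replica_bounds(deployed_model: dict) -> Optional[Tuple[int, int]]:
    """Return (min, max) replicas if dedicatedResources is set, else None."""
    dedicated = deployed_model.get("dedicatedResources")
    if not dedicated:
        return None
    min_replicas = dedicated["minReplicaCount"]
    # Assumption for this sketch: a missing maximum means "same as minimum".
    max_replicas = dedicated.get("maxReplicaCount", min_replicas)
    return min_replicas, max_replicas


endpoint = json.loads(ENDPOINT_JSON)
for model in endpoint["deployedModels"]:
    print(model["id"], replica_bounds(model))
```

For the exhibit this yields a lower bound of 1 and an upper bound of 3 replicas, with `automaticResources` unset.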

Question 13 · hard · multiple choice
Full question →

A large enterprise runs a generative AI solution serving millions of daily inference requests. To reduce costs, they propose using serverless endpoints (Vertex AI Prediction) with a custom container, but they notice high latency during cold starts. Which strategy best addresses this problem while minimizing cost?

Question 14 · hard · multiple choice
Full question →

A company is evaluating the ROI of a generative AI project. Which metric is most appropriate?

Question 15 · hard · multiple choice
Full question →

A financial services firm is developing a GenAI application for investment advice. They need to ensure regulatory compliance. Which business strategy should they prioritize?

Question 16 · hard · multiple choice
Full question →

A company is deploying a Gemini 1.0 Ultra model for a code generation assistant. They have set up Vertex AI Model Evaluation with a custom evaluation dataset to measure pass@1 accuracy. The initial evaluation shows 65% pass@1. They want to improve to 80% without collecting more training data. They have already attempted basic prompt engineering (e.g., 'write correct code') with limited improvement. Which approach is most likely to achieve the desired improvement?
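
To make the metric in this scenario concrete, here is a minimal sketch of pass@1, assuming one generated solution per problem and a boolean "passed all tests" result for each. The 100-problem evaluation set is a hypothetical size chosen only so the percentages map to whole problems; the scenario does not state the real size.

```python
from typing import List


def pass_at_1(results: List[bool]) -> float:
    """Fraction of problems whose single generated solution passed its tests."""
    if not results:
        raise ValueError("need at least one evaluated problem")
    return sum(results) / len(results)


# Hypothetical evaluation set of 100 problems matching the 65% baseline.
baseline = [True] * 65 + [False] * 35
target = 0.80

current = pass_at_1(baseline)
# How many additional problems must pass to reach the target on this set.
extra_needed = round(target * len(baseline)) - sum(baseline)

print(f"current pass@1: {current:.0%}")
print(f"additional problems that must pass: {extra_needed}")
```

On a set of this assumed size, moving from 65% to 80% means fixing 15 of the 35 currently failing problems, which is a useful way to judge how large a change each answer option would need to deliver.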

Question 17 · hard · multiple choice
Full question →

A global e-commerce company is using Vertex AI to build a generative AI chatbot for customer support. The chatbot is powered by the Gemini 1.5 Pro model and uses a vector search index for retrieval-augmented generation (RAG) over product documentation. The company has deployed the application in four regions (us-central1, europe-west4, asia-east1, and australia-southeast1) using a multi-region deployment with a global endpoint. The application is critical and requires high availability with a target latency of under 500ms for the RAG pipeline. Recently, users in Australia have been experiencing inconsistent latency spikes, with response times exceeding 2 seconds during peak hours. The team suspects that the issue is related to the vector search index's replication and serving configuration. The index has 10 million embeddings with a dimension of 768. It is stored in a single regional bucket in us-central1, and the vector search index endpoint is deployed in all four regions with the same deployed index ID. The team is using the default configuration for index updates and serving. Which action should the team take to resolve the latency issue for Australian users?
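
The index size in this scenario can be turned into a rough data volume, which helps when reasoning about where the index lives relative to its users. The arithmetic below is a sketch that assumes each embedding component is stored as a 4-byte float32; the scenario does not state the storage format, and index overhead is ignored.

```python
NUM_EMBEDDINGS = 10_000_000   # from the scenario
DIMENSIONS = 768              # from the scenario
BYTES_PER_COMPONENT = 4       # assumption: float32 components

raw_bytes = NUM_EMBEDDINGS * DIMENSIONS * BYTES_PER_COMPONENT
raw_gb = raw_bytes / 1e9      # decimal gigabytes
raw_gib = raw_bytes / 2**30   # binary gibibytes

print(f"raw embedding data: {raw_bytes:,} bytes")
print(f"about {raw_gb:.2f} GB ({raw_gib:.2f} GiB) before index overhead")
```

Under that assumption the raw vectors come to roughly 30.72 GB.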

Question 18 · hard · multiple choice
Full question →

A large enterprise is deploying a generative AI-powered code assistant for their developers. The solution uses Vertex AI with a fine-tuned Codey model. The security team requires that all prompts and responses be logged for audit purposes, but the logs must not contain sensitive information such as API keys or passwords. The operations team is concerned about high latency during peak usage. You need to design a solution that meets security requirements without compromising performance. Which approach should you take?

Question 19 · hard · multiple choice
Full question →

A large enterprise runs a production application that uses the Gemini API on Vertex AI for real-time content moderation. They are experiencing occasional 429 (Too Many Requests) errors during peak hours. Their current quota is 1000 requests per minute (RPM) and they are hitting around 950 RPM on average, with spikes up to 1050. They have already implemented exponential backoff and retry logic. They need to reduce the error rate without reducing the quality of moderation. Which additional measure should they take?
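
The numbers in this scenario are worth working through before looking at the options. The sketch below computes average utilisation and peak overflow from the stated figures, and shows one common form of the backoff the team already has ("full jitter" exponential backoff). The function name `backoff_delays` and its `base` and `cap` defaults are assumptions for illustration, not the team's actual implementation.

```python
import random
from typing import List, Optional

QUOTA_RPM = 1000    # quota stated in the scenario
AVERAGE_RPM = 950   # average load stated in the scenario
PEAK_RPM = 1050     # spike stated in the scenario


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 32.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Full-jitter backoff: delay i is uniform in [0, min(cap, base * 2**i)]."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * (2 ** attempt)))
            for attempt in range(max_retries)]


average_utilisation = AVERAGE_RPM / QUOTA_RPM
peak_overflow_rpm = max(0, PEAK_RPM - QUOTA_RPM)

print(f"average utilisation: {average_utilisation:.0%}")
print(f"requests over quota at peak: {peak_overflow_rpm} per minute")
print("sample retry delays (s):", [round(d, 2) for d in backoff_delays(5)])
```

Retries spread rejected requests over time but do not add capacity: with average load already at 95% of quota, there is little headroom for the 50 requests per minute that exceed the limit during spikes.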

Question 20 · hard · multiple choice
Full question →

An MLOps engineer wants to implement continuous evaluation of a generative model in production. Which Vertex AI component should they use?

These Generative AI Leader practice questions are part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style Generative AI Leader questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.