Cloud Run Jobs

This chapter covers Cloud Run Jobs, a serverless batch computing service on Google Cloud. For the ACE exam, Cloud Run Jobs falls under Domain 'Deploy and Implement' (Objective 3.1: Deploy and implement Cloud Run services). While Cloud Run Services are more common, Jobs appear in 2–4% of exam questions, often testing your understanding of when to use Jobs vs Services, execution configurations, and integration with Cloud Scheduler or Eventarc. Mastering this topic ensures you can design cost-effective, scalable batch workloads without managing infrastructure.

Updated May 31, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security

Cloud Run Jobs as Assembly Line Batches

Imagine a factory assembly line where workers assemble custom furniture. Normally, each worker (container) picks up a piece (request), assembles it, and hands it off — this is Cloud Run Services, handling continuous requests. Now, the factory manager needs to run a one-time batch task: apply varnish to all 1000 chairs produced today. Instead of disrupting the assembly line, the manager sets up a separate, temporary station with a varnish sprayer. The station starts, processes all 1000 chairs (one after another, or in parallel), and then shuts down automatically. This is a Cloud Run Job. The job has a job configuration (the sprayer setup), runs to completion (all chairs varnished), and terminates. Unlike the assembly line (Service), it does not listen for new requests — it just runs its task and stops. If the varnish runs out (task fails), the manager can restart the job from the beginning or from the last failed chair (retries). The factory floor can run multiple such temporary stations simultaneously (parallelism) without affecting the main line. Cloud Run Jobs are perfect for batch processing, data migration, or periodic reports — tasks that have a clear start and end.

How It Actually Works

What are Cloud Run Jobs?

Cloud Run Jobs is a Google Cloud managed compute service for executing batch workloads that run to completion. Unlike Cloud Run Services, which listen for HTTP requests and can scale to zero when idle, Jobs are designed for one-off or scheduled tasks such as data processing, ETL pipelines, image or video transcoding, database migrations, and report generation. Each Job runs a container image to completion, then terminates. A single job execution can be split into multiple tasks that run in parallel, allowing you to process a large dataset faster.

How Cloud Run Jobs Work Internally

A Cloud Run Job consists of a job configuration (container image, CPU, memory, environment variables, parallelism, task count, retries, etc.) and executions. When you run a job, an execution is created, which spawns one or more tasks (instances of the container). Each task runs the container image with the specified resource limits. The job succeeds only if all tasks complete successfully. If a task fails, the job can be configured to retry that task up to a maximum number of attempts. Tasks run in parallel up to the parallelism setting; if parallelism is set to 1, tasks run sequentially.
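The relationship between task count and parallelism can be sketched in a few lines of Python. This is a simplified model, not how Cloud Run schedules tasks internally: it groups task indexes into fixed batches, whereas a real execution starts the next task as soon as a slot frees up. The function name `schedule_waves` is hypothetical.

```python
from typing import List


def schedule_waves(task_count: int, parallelism: int) -> List[List[int]]:
    """Group task indexes into batches of at most `parallelism` tasks.

    Simplification: a real execution starts a new task as soon as a slot
    frees up, rather than waiting for a whole batch to finish.
    """
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    if not 1 <= parallelism <= task_count:
        raise ValueError("parallelism must be between 1 and task_count")
    return [
        list(range(start, min(start + parallelism, task_count)))
        for start in range(0, task_count, parallelism)
    ]


waves = schedule_waves(task_count=5, parallelism=2)
print(waves)  # [[0, 1], [2, 3], [4]]
```

With parallelism set to 1, every batch holds a single task, which matches the sequential behavior described above.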

Internally, Cloud Run Jobs runs on the same Google-managed, sandboxed container infrastructure as Cloud Run Services and exposes a Knative-compatible API; it does not run on a GKE cluster in your project. However, Jobs do not expose an HTTP endpoint of their own; executions are started through the Google Cloud Console, gcloud CLI, Cloud Scheduler, Workflows, or the Cloud Run Admin API (Eventarc events are typically routed through Workflows). The execution environment is ephemeral: once all tasks complete (or fail), the execution state is stored for up to 30 days (default retention), after which it is deleted unless you set a custom retention policy.

Key Components, Values, Defaults, and Timers

Task count: Number of tasks to run. Default is 1. Maximum is 10,000 (soft limit; can be increased via quota request).

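Parallelism: Maximum number of tasks to run concurrently. If you leave it unset, Cloud Run starts tasks in parallel as quickly as capacity allows; set it to 1 to force sequential execution. Maximum is 100 (soft limit).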

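CPU: Default is 1 vCPU. Jobs accept 1, 2, 4, 6, or 8 vCPUs; the fractional vCPU sizes available to some Services do not apply to Jobs.

Memory: Default is 512 MiB, which is also the minimum for Jobs. Maximum is 32 GiB; larger memory sizes require more vCPUs.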

Max retries: How many times to retry a failed task. Default is 3. Set to 0 for no retries.

Task timeout: Maximum duration per task. Default is 10 minutes. Maximum is 24 hours.

Execution environment: Jobs always run in the second-generation execution environment (gen2). Unlike Services, Jobs cannot select the first generation.

Execution retention: Number of days to keep execution records. Default is 30 days. Minimum is 1 day, maximum is 365 days.

VPC connectivity: Can be configured to use VPC egress (direct VPC or Serverless VPC Access) for accessing private resources.

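Environment variables: Can be set at job level (shared across all tasks) or overridden for a single execution (via overrides). To give each task different work, use the task index rather than per-task variables.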

Cloud Storage volumes: Can mount Cloud Storage buckets as volumes (using gcsfuse).

Secrets: Can be injected from Secret Manager as environment variables or volumes.

Configuration and Verification Commands

To create a job from a container image:

gcloud run jobs create JOB_NAME --image IMAGE_URI --region REGION

To set parallelism and task count:

gcloud run jobs create JOB_NAME --image IMAGE_URI --parallelism 5 --tasks 100

To run a job:

gcloud run jobs execute JOB_NAME

To view executions:

gcloud run jobs executions list --job JOB_NAME

To view logs for a job (across all of its executions):

gcloud logging read "resource.type=cloud_run_job AND resource.labels.job_name=JOB_NAME"

To update a job configuration (e.g., memory):

gcloud run jobs update JOB_NAME --memory 2Gi
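If you script these commands, for example from a CI pipeline, one option is to assemble the argument list in Python and hand it to `subprocess.run`. The sketch below only builds the list and uses only the flags shown above; the helper name `build_create_command` and the job name and region values are hypothetical.

```python
import shlex
from typing import List, Optional


def build_create_command(
    job_name: str,
    image: str,
    region: str,
    tasks: Optional[int] = None,
    parallelism: Optional[int] = None,
) -> List[str]:
    """Assemble a `gcloud run jobs create` argument list (hypothetical helper)."""
    cmd = ["gcloud", "run", "jobs", "create", job_name,
           "--image", image, "--region", region]
    if tasks is not None:
        cmd += ["--tasks", str(tasks)]
    if parallelism is not None:
        cmd += ["--parallelism", str(parallelism)]
    return cmd


# Illustrative values only; IMAGE_URI is the same placeholder used above.
cmd = build_create_command("resize-images", "IMAGE_URI", "us-central1",
                           tasks=100, parallelism=5)
print(shlex.join(cmd))
```

Passing a list instead of a single shell string avoids quoting problems when a job name or image URI contains unusual characters.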

How Cloud Run Jobs Interact with Related Technologies

Cloud Scheduler: Can trigger a job on a schedule (cron). The scheduler sends a POST request to the Cloud Run Jobs API.

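Eventarc: Can start a job in response to events (e.g., Cloud Storage object finalize). Eventarc does not target a job directly; the usual pattern is to route the event to Workflows (or a small Cloud Run Service), which then calls the Cloud Run Admin API to run the job.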

Workflows: Can orchestrate multiple jobs, adding conditional logic and error handling.

Cloud Tasks: Can enqueue tasks that trigger jobs (via HTTP target).

Cloud Logging and Monitoring: Jobs automatically emit logs and metrics (task duration, success/failure count). You can create alerts on job failures.

Secret Manager: Securely inject secrets as environment variables or mounted volumes.

VPC: Jobs can access private resources via VPC egress or Serverless VPC Access.

Important Exam Details

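Cloud Run Jobs do not expose an HTTP endpoint of their own. An execution is started by calling the Cloud Run Admin API, either directly or through gcloud, the Console, Cloud Scheduler, or Workflows.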

Jobs are not always-on; they run and terminate. You do not pay for idle time (only for execution time).

Jobs support parallelism but not concurrency (each task runs one instance of the container).

Jobs can be canceled mid-execution; tasks that are running will be sent a SIGTERM and given a short grace period (10 seconds) to shut down before they are forcibly stopped.

Jobs can be deleted; this does not affect past executions (they are retained for the retention period).

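Execution environment: Jobs run in the second-generation execution environment, which offers full Linux compatibility and generally faster CPU and network performance than the first generation. Choosing between generations is a Service setting, not a Job setting.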

Task index is available as an environment variable (CLOUD_RUN_TASK_INDEX) so each task knows its index (0 to task_count-1). This is useful for sharding work.

Cloud Run Jobs are regional resources. You must specify a region when creating a job.

IAM roles: roles/run.jobsExecutor allows executing jobs (roles/run.jobsExecutorWithOverrides also allows per-execution overrides); roles/run.viewer allows read-only access; roles/run.admin allows full management.

Common Pitfalls

Confusing Services and Jobs: Services are for HTTP request handling; Jobs are for batch processing. The exam may present a scenario where a company runs a one-time data migration — the correct answer is to use a Job, not a Service.

Misjudging parallelism: Parallelism caps how many tasks run at once. If it is set to 1, 1000 tasks run one after another and take a long time; if it is left too high, the tasks can overwhelm a backing database or API. The exam may test your ability to tune parallelism for either case.

Overlooking execution retention: By default, execution records are kept for 30 days. If you need them longer, you must set --execution-retention.

Ignoring task timeout: If a task takes longer than the timeout, it is killed. Default is 10 minutes; you can increase up to 24 hours.

Assuming Jobs expose an HTTP endpoint: They do not. You start executions through the Cloud Run Admin API, gcloud, the Console, or a trigger such as Cloud Scheduler.

Example: Batch Image Processing Job

Suppose you have 10,000 images in Cloud Storage and need to resize them. You create a Job with:

Container image that reads a file path from an environment variable, resizes it, and writes back.

Task count = 10,000 (one per image).

Parallelism = 100 (max).

Each task gets its index via CLOUD_RUN_TASK_INDEX, which maps to a specific image.

The job runs, completes, and you pay only for the compute time used.
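The mapping from task index to image can be sketched as follows. This is one possible sharding scheme (strided), not the only one. It also reads `CLOUD_RUN_TASK_COUNT`, a second variable Cloud Run sets for job tasks, so the same code works when the task count is smaller than the number of images. The function name and the object names are hypothetical.

```python
import os
from typing import List, Mapping, Sequence


def items_for_task(items: Sequence[str],
                   env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the slice of `items` that this task should process.

    Task i takes items i, i + count, i + 2*count, ... (strided sharding).
    """
    index = int(env.get("CLOUD_RUN_TASK_INDEX", "0"))
    count = int(env.get("CLOUD_RUN_TASK_COUNT", "1"))
    return list(items[index::count])


images = [f"img-{n}.jpg" for n in range(7)]  # hypothetical object names
fake_env = {"CLOUD_RUN_TASK_INDEX": "1", "CLOUD_RUN_TASK_COUNT": "3"}
print(items_for_task(images, fake_env))  # ['img-1.jpg', 'img-4.jpg']
```

Passing the environment as a parameter keeps the function easy to test locally, where the Cloud Run variables are not set.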

Example: Scheduled Report Generation

Use Cloud Scheduler to trigger a Job every day at 2 AM. The Job runs a container that queries BigQuery, generates a PDF, and uploads it to Cloud Storage. The Job runs to completion, then terminates. You do not pay for idle time between executions.

Walk-Through

Define the Job Configuration

First, you define the job configuration including the container image, CPU, memory, environment variables, task count, parallelism, retries, timeout, and execution retention. This is done via `gcloud run jobs create` or the Console. The configuration is stored as a Cloud Run Job resource. You can also set VPC egress and mount Cloud Storage volumes. The job is regional, so choose a region close to your data or users.

Trigger the Job Execution

A job execution is triggered manually via `gcloud run jobs execute`, via Cloud Scheduler (cron), Eventarc (event), Workflows, or the API. When triggered, Cloud Run creates an execution resource. The execution immediately starts launching tasks up to the parallelism limit. Each task is a container instance. The execution state is 'RUNNING'.

Tasks Run to Completion

Each task runs the container image with the specified resource limits. Tasks read environment variables (including `CLOUD_RUN_TASK_INDEX`) to determine their work. They perform the batch operation (e.g., process a file, query a database). Tasks can write logs to Cloud Logging. If a task succeeds, it exits with code 0. If it fails (non-zero exit code), the task is retried up to the max retries count. After all retries are exhausted, the task is marked as failed.
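Because success and failure are signaled through the process exit code, a container entrypoint usually needs a thin wrapper around the real work. A minimal sketch, assuming a Python entrypoint; `run_task` and `process_batch` are hypothetical names:

```python
import sys
from typing import Callable


def run_task(work: Callable[[], None]) -> int:
    """Run `work` and translate the outcome into a process exit code."""
    try:
        work()
    except Exception as exc:  # any unhandled error should fail the task
        print(f"task failed: {exc}", file=sys.stderr)
        return 1  # non-zero: the task counts as failed and may be retried
    return 0  # zero: the task counts as succeeded


def process_batch() -> None:
    """Hypothetical placeholder for the real batch work."""
    print("processing batch")


exit_code = run_task(process_batch)
# In a real container entrypoint you would finish with: sys.exit(exit_code)
```

Swallowing an exception and exiting with 0 would hide the failure from Cloud Run, so the task would never be retried.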

Monitor Execution Progress

You can monitor the execution via the Console, gcloud, or Cloud Monitoring. The execution shows the number of tasks succeeded, failed, and running. You can also view logs per task. If the execution is taking too long, you can cancel it (sends SIGTERM to all running tasks). After cancellation, the execution is marked as 'CANCELLED'.
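A task that wants to stop cleanly on cancellation has to handle SIGTERM itself. The sketch below shows one way to do that in Python: a flag set by the signal handler and checked between units of work. The class and function names are hypothetical, and the per-item work is a stand-in.

```python
import signal
from types import FrameType
from typing import Iterable, List, Optional


class ShutdownFlag:
    """Records that a termination signal has arrived."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum: int, frame: Optional[FrameType]) -> None:
        self.requested = True


def process_until_stopped(items: Iterable[str], flag: ShutdownFlag) -> List[str]:
    """Process items one at a time, stopping early once shutdown is requested."""
    done = []
    for item in items:
        if flag.requested:
            break  # leave the remaining items for a later execution
        done.append(item)  # stand-in for the real per-item work
    return done


if __name__ == "__main__":
    flag = ShutdownFlag()
    signal.signal(signal.SIGTERM, flag.handle)  # register the handler
    print(process_until_stopped(["a", "b", "c"], flag))
```

Checking the flag between items keeps each unit of work atomic, which matters if the task writes results that a later run must not duplicate.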

Execution Completes and Cleanup

When all tasks have completed (success or failure), the execution ends. The overall execution status is 'SUCCEEDED' if all tasks succeeded, or 'FAILED' if any task failed after retries. The execution record is retained for the configured retention period (default 30 days). You can delete the job or execution records manually. After retention, records are automatically deleted.
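The status rule above can be expressed as a small model: a task succeeds if any of its allowed attempts exits with 0, and the execution succeeds only if every task does. This is an illustration of the rule, not Cloud Run's implementation; the function names are hypothetical.

```python
from typing import List


def task_succeeded(attempt_exit_codes: List[int], max_retries: int) -> bool:
    """True if any attempt within the allowed 1 + max_retries exits with 0."""
    allowed = attempt_exit_codes[: 1 + max_retries]
    return any(code == 0 for code in allowed)


def execution_status(tasks: List[List[int]], max_retries: int = 3) -> str:
    """Return 'SUCCEEDED' only if every task eventually succeeded."""
    if all(task_succeeded(codes, max_retries) for codes in tasks):
        return "SUCCEEDED"
    return "FAILED"


# Task 0 succeeds first time; task 1 fails twice, then succeeds on a retry.
print(execution_status([[0], [1, 1, 0]]))                 # SUCCEEDED
# With no retries allowed, task 1's first failure fails the execution.
print(execution_status([[0], [1, 1, 0]], max_retries=0))  # FAILED
```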

What This Looks Like on the Job

Enterprise Scenario 1: ETL Pipeline for Data Warehousing

A financial services company runs a nightly ETL job that extracts transaction data from on-premises databases (via VPN), transforms it (aggregation, cleansing), and loads it into BigQuery. They use Cloud Run Jobs with a container that runs a Python script using Apache Beam. The job is triggered by Cloud Scheduler at 2 AM. Task count is set to 50 (one per data source), parallelism to 10. CPU is set to 4 vCPUs, memory 8 GiB. The job typically runs for 20 minutes. If a task fails, it retries up to 3 times. The company benefits from no server management and pays only for execution time. A common misconfiguration is forgetting to set VPC egress for on-premises connectivity, causing the job to fail with timeout errors.

Enterprise Scenario 2: Video Transcoding Service

A media company receives user-uploaded videos (up to 1 GB each) and needs to transcode them to multiple formats (HLS, DASH). They use Eventarc, routed through Workflows, to start a Cloud Run Job when a new video is uploaded to Cloud Storage. The job container runs FFmpeg. Each job execution processes one video (task count = 1). Parallelism is not needed. The task timeout is set to 30 minutes (videos can be long). The job writes output to a different bucket. They also use Cloud Tasks for retries with exponential backoff. A common issue is extremely long videos exceeding the task timeout (24 hours at most); to avoid this, they split such videos into segments and process them in parallel using multiple tasks.

Enterprise Scenario 3: Database Migration

A SaaS company is migrating from a legacy MySQL database to Cloud Spanner. They use a Cloud Run Job to run a migration script that reads from MySQL, transforms schema, and writes to Spanner. The job has task count = 1 (single-threaded migration). They set a high timeout (24 hours) because the database is 500 GB. They mount a Cloud Storage bucket for logs. They also use Secret Manager for database credentials. A common pitfall is not setting enough memory for the container (the migration tool may cache large datasets). They initially set 512 MiB and the job failed with OOM; they increased to 4 GiB and it succeeded.

How ACE Actually Tests This

The ACE exam tests Cloud Run Jobs under Objective 3.1 (Deploy and implement Cloud Run services). While the objective title says 'services', it includes Jobs. Expect 2-3 questions that differentiate between Cloud Run Services and Jobs, or test job configuration parameters.

What the Exam Specifically Tests

When to use a Job vs a Service: If the workload is a batch process that runs to completion (e.g., data processing, migration), use a Job. If it needs to handle HTTP requests and scale to zero, use a Service.

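Execution parameters: Know defaults: tasks=1, max retries=3, timeout=10 minutes, CPU=1, memory=512 MiB, execution retention=30 days. Parallelism is unset by default, so tasks start in parallel as capacity allows; set it to 1 for sequential execution.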

Parallelism and task count: Understand that parallelism limits concurrent tasks, not total tasks. If tasks=1000 and parallelism=10, at most 10 run at once.

Trigger mechanisms: Jobs can be started by Cloud Scheduler, Workflows (including Eventarc-driven workflows), gcloud, the Console, or the Cloud Run Admin API. They do not expose an HTTP endpoint that clients can send requests to.

IAM roles: roles/run.jobsExecutor to execute; roles/run.viewer to view; roles/run.admin for full control.

Common Wrong Answers

'Use a Cloud Run Service for batch processing': Wrong because Services are designed for HTTP request handling. The correct answer is a Job.

'Set concurrency to 80 for parallel tasks': Concurrency is a Service parameter (max concurrent requests per container). For Jobs, use parallelism.

'Jobs can be invoked via HTTP': False. Jobs do not have an endpoint.

'Execution retention defaults to 7 days': Wrong; it's 30 days.

Specific Numbers and Terms That Appear

Default parallelism: unset (tasks start in parallel as capacity allows); 1 means sequential

Default max retries: 3

Default timeout: 10 minutes

Maximum timeout: 24 hours

Maximum tasks: 10,000 (soft limit)

Maximum parallelism: 100 (soft limit)

Execution retention: 30 days (default)

Environment variable CLOUD_RUN_TASK_INDEX: from 0 to task_count-1

Edge Cases

Zero retries: If you set max retries to 0, any task failure immediately fails the job.

Cancelling a job: Running tasks receive SIGTERM; they have a short grace period (10 seconds) to shut down gracefully before being forcibly stopped.

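Execution environment: Jobs always use the second-generation environment; choosing between first and second generation applies to Services. The exam may ask about the differences (gen2 offers full Linux compatibility and generally faster CPU and network performance).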

VPC egress: Jobs can use direct VPC or Serverless VPC Access. The exam may test which is appropriate for private IPs.

How to Eliminate Wrong Answers

If the scenario mentions 'HTTP endpoint' or 'serve web traffic', it's a Service, not a Job.

If the scenario mentions 'batch', 'process files', 'run to completion', it's a Job.

If the question asks about 'concurrency', it's likely a distractor for Services; Jobs use 'parallelism'.

If the question asks about 'max instances', that's a Service setting; Jobs don't have that concept.

Always read the scenario carefully: is the workload continuous (Service) or one-off/scheduled (Job)? That's the key differentiator.

Key Takeaways

Cloud Run Jobs are for batch workloads that run to completion, not for HTTP request handling.

Parallelism is unset by default (tasks start in parallel as capacity allows; 1 means sequential); default max retries is 3; default timeout is 10 minutes; default execution retention is 30 days.

Maximum tasks per job is 10,000 (soft limit); maximum parallelism is 100 (soft limit).

Jobs can be started by Cloud Scheduler, Workflows (including Eventarc-driven workflows), gcloud, or the Cloud Run Admin API. They have no HTTP endpoint of their own.

Each task receives the environment variable CLOUD_RUN_TASK_INDEX (0-based) for sharding work.

IAM roles: roles/run.jobsExecutor to execute, roles/run.viewer to view, roles/run.admin for full management.

Execution records are retained for a configurable period (1-365 days, default 30).

Jobs support VPC egress for accessing private resources, and can mount Cloud Storage volumes.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Cloud Run Service

Designed for HTTP request handling

Listens for traffic whenever instances are running; can scale to zero when idle

Supports concurrency (multiple requests per container)

Billed per request and per container time (including idle if min instances > 0)

Has a URL endpoint

Cloud Run Job

Designed for batch processing that runs to completion

Runs only when triggered, then terminates

Supports parallelism (multiple tasks per execution)

Billed only for execution time (no idle cost)

No HTTP endpoint; triggered via API, Scheduler, or Eventarc

Watch Out for These

Mistake

Cloud Run Jobs can handle HTTP requests like Cloud Run Services.

Correct

Jobs do not have an HTTP endpoint. They are designed for batch workloads that run to completion. To handle HTTP requests, you must use a Cloud Run Service.

Mistake

Parallelism and task count are the same thing.

Correct

Task count is the total number of tasks to run. Parallelism is the maximum number of tasks that run concurrently. For example, 100 tasks with parallelism 10 means 10 run at a time, but all 100 will eventually run.

Mistake

Jobs are always cheaper than Services.

Correct

Cost depends on usage. Jobs charge only for execution time, while Services charge for request processing and for idle time if min instances > 0. For steady request-driven traffic, a Service is usually the better fit, and with min instances=0 it incurs no idle cost either. For batch workloads, Jobs are typically more cost-effective.

Mistake

Execution records are kept forever.

Correct

By default, execution records are retained for 30 days. You can configure retention from 1 to 365 days using the `--execution-retention` flag.

Mistake

Jobs can only run one task at a time.

Correct

Jobs run one task at a time only when parallelism is set to 1. You can set parallelism as high as 100 (soft limit) to run multiple tasks concurrently.

Frequently Asked Questions

What is the difference between Cloud Run Services and Cloud Run Jobs?

Cloud Run Services are designed to handle HTTP requests continuously, scaling to zero when idle. They are ideal for web applications and APIs. Cloud Run Jobs are for batch workloads that run to completion, such as data processing or migrations. Jobs do not have an HTTP endpoint; they are triggered via API, Cloud Scheduler, or Eventarc. Use Services for request-driven workloads, Jobs for task-oriented batch processing.

How do I run a Cloud Run Job on a schedule?

Use Cloud Scheduler to call the Cloud Run Admin API over HTTP. Create a Cloud Scheduler job with target type 'HTTP', URL `https://{region}-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/{project-id}/jobs/{job-name}:run`, and set the HTTP method to POST. Because the target is a Google API (`*.googleapis.com`), authenticate with an OAuth token rather than an OIDC token, using a service account that is allowed to run the job. Alternatively, use Workflows with a scheduled trigger.
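Filling in the URL template by hand is error-prone, so a small helper can build it. The sketch below only formats the string from the answer above; the function name and the region, project, and job values are hypothetical.

```python
def job_run_url(region: str, project_id: str, job_name: str) -> str:
    """Build the Cloud Run Admin API v1 URL that starts a job execution."""
    return (
        f"https://{region}-run.googleapis.com/apis/run.googleapis.com/v1/"
        f"namespaces/{project_id}/jobs/{job_name}:run"
    )


# Hypothetical region, project, and job names for illustration only.
url = job_run_url("us-central1", "my-project", "nightly-report")
print(url)
```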

Can Cloud Run Jobs access resources in a VPC?

Yes. You can configure VPC egress for a job using either Direct VPC egress (the subnet must be in the same region as the job) or a Serverless VPC Access connector. This allows the job to reach resources with private IPs, such as Cloud SQL instances or on-premises databases via VPN.

What happens when a task in a Cloud Run Job fails?

By default, the task is retried up to 3 times. If all retries fail, the task is marked as failed. The overall execution fails if any task fails after retries. You can configure the number of retries (set to 0 for no retries). Retries are immediate; there is no built-in backoff. You can implement custom backoff in your application.
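Custom backoff inside the application might look like the sketch below, which retries a flaky call within a single task attempt and doubles the wait each time. The helper name `call_with_backoff` and its defaults are hypothetical; the `sleep` parameter exists so the delay can be replaced in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, waiting base_delay * 2**n seconds between failures."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the task fail
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


delays = []
attempts = {"n": 0}


def flaky() -> str:
    """Fails twice with a transient error, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

Once the in-process attempts are exhausted the exception propagates, the task exits non-zero, and the job-level retry setting takes over.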

How are Cloud Run Jobs billed?

You are billed for the vCPU and memory that each task's container instance uses while it runs, so an execution with many parallel tasks is charged for the sum of its task run times rather than for wall-clock time. There is no charge for idle time between executions. You also pay for any networking egress, Cloud Storage usage, and Secret Manager access. Check the Cloud Run pricing page for the current billing granularity and any minimum charge per task.
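A rough estimate of billable usage can be worked out in a few lines, assuming each task is metered for its own vCPU and memory time. The rates below are placeholders, not real prices, and the 120-second task duration is an invented example; the task count, vCPU, and memory values reuse the ETL scenario from earlier in the chapter.

```python
from typing import Tuple


def billable_usage(tasks: int, seconds_per_task: float,
                   vcpus: float, memory_gib: float) -> Tuple[float, float]:
    """Return (vCPU-seconds, GiB-seconds) summed across all tasks."""
    total_seconds = tasks * seconds_per_task
    return total_seconds * vcpus, total_seconds * memory_gib


def estimate_cost(vcpu_seconds: float, gib_seconds: float,
                  vcpu_rate: float, gib_rate: float) -> float:
    """Multiply usage by per-second rates supplied by the caller."""
    return vcpu_seconds * vcpu_rate + gib_seconds * gib_rate


# 50 tasks, 4 vCPUs and 8 GiB each, assumed to run 120 seconds per task.
usage = billable_usage(tasks=50, seconds_per_task=120, vcpus=4, memory_gib=8)
print(usage)  # (24000, 48000)

# Placeholder rates only; look up current prices for your region.
cost = estimate_cost(*usage, vcpu_rate=0.00002, gib_rate=0.000002)
```

Note that parallelism does not appear in the calculation: under this assumption it changes how long the execution takes, not how much compute it consumes.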

What is the maximum duration for a Cloud Run Job task?

The maximum timeout per task is 24 hours. The default is 10 minutes. You can set the timeout when creating or updating the job using the `--task-timeout` flag (e.g., `--task-timeout=3600s` for 1 hour).

Can I run multiple tasks in parallel in a Cloud Run Job?

Yes. Set the `--parallelism` flag to the desired number of concurrent tasks. The maximum is 100 (soft limit). Tasks are distributed across available resources. Ensure your container can handle multiple instances running simultaneously.
