ACE | Chapter 46 of 101 | Objective 1.2

Instance Templates and Managed Instance Groups

This chapter covers instance templates and managed instance groups (MIGs), two core building blocks for scalable, resilient applications on Google Compute Engine. Understanding these concepts is critical for the ACE exam, as questions on autoscaling, rolling updates, and instance group management appear frequently — typically 5-8% of the exam. You will learn how to define a reusable instance configuration, create groups that automatically heal and scale, and implement update strategies, all with specific commands and parameters that the exam tests directly.

25 min read
Intermediate
Updated May 31, 2026

Factory Assembly Line with Blueprint and Crew

Think of an instance template as a detailed blueprint for a product, and a managed instance group (MIG) as the automated assembly line that produces and maintains copies of that product. The blueprint specifies every component: the chassis (machine type), the engine (boot disk image), the wiring harness (network configuration), and the optional add-ons (metadata, startup scripts). Once the blueprint is finalized, the assembly line manager (the MIG controller) reads it and orders a robot arm (Compute Engine) to stamp out identical units. The manager also monitors each unit’s health using a simple test (health check) — if a unit fails, it is automatically removed from service and a new one is built from the same blueprint. If demand increases, the manager can be told to produce more units (autoscaling), up to a maximum capacity. If the blueprint is updated, the manager can perform a rolling update, replacing old units with new ones without stopping the entire line. Crucially, the assembly line does not care about the individual identity of each unit — they are interchangeable. If a unit is manually modified (e.g., a worker adds a custom sticker), the manager will eventually overwrite it during an update or recreate it, because the blueprint is the source of truth. This factory model ensures consistency, scalability, and self-healing, which is exactly how MIGs work in Google Cloud.

How It Actually Works

What Are Instance Templates?

An instance template is an immutable resource that defines the properties of a virtual machine instance. It acts as a blueprint for creating identical VMs. Key properties include:
- Machine type (e.g., n1-standard-4)
- Boot disk image or snapshot (e.g., debian-10-buster-v20250101)
- Disk type and size (e.g., pd-standard, 10 GB)
- Network and subnetwork (e.g., default with auto subnet mode)
- Service account and scopes
- Metadata (key-value pairs)
- Startup and shutdown scripts
- Tags and labels
- Reservation affinity (for consuming specific reservations)
- Scheduling options (preemptible, maintenance behavior)
- Shielded VM and Confidential VM settings
- Network interfaces, including alias IP ranges and network tiers
- Guest accelerators (GPUs)
- Minimum CPU platform

Instance templates are immutable: you cannot modify a template after creation. To change properties, you must create a new template and update the MIG to use it. By default a template is a global resource that MIGs in any region can use; regional instance templates are also available. The template resource name must be unique per project.

Creating a template via gcloud:

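# device-name below is an assumed name; use the device name of the disk you want to override
gcloud compute instance-templates create my-template \
    --source-instance=source-vm \
    --source-instance-zone=us-central1-a \
    --configure-disk=device-name=persistent-disk-0,auto-delete=false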

Alternatively, you can specify all properties explicitly. The --source-instance flag is a convenient way to create a template from an existing VM. The source instance does not need to be stopped: by default the template copies the VM's configuration, not the data on its disks. Use --configure-disk to control how each of the source instance's disks is defined in the template.
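As a hedged sketch of the explicit form, the function below spells out the main properties flag by flag. The template name, image family, tag, and startup-script path are illustrative assumptions rather than values from this chapter. The block only prints the command unless GCLOUD=gcloud is set in the environment.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GCLOUD=gcloud in the environment to run it for real.
GCLOUD="${GCLOUD:-echo gcloud}"

create_explicit_template() {
  # $1 = template name (hypothetical); all other values are illustrative.
  $GCLOUD compute instance-templates create "$1" \
    --machine-type=e2-medium \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --boot-disk-type=pd-standard \
    --boot-disk-size=10GB \
    --tags=http-server \
    --metadata-from-file=startup-script=startup.sh
}

create_explicit_template explicit-template
```

Spelling the properties out keeps the template independent of any existing VM, which is usually easier to review and to recreate when a property has to change.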

What Are Managed Instance Groups?

A managed instance group (MIG) is a collection of identical VM instances created from a single instance template. The MIG controller automates:
- Instance creation and deletion based on autoscaling policies
- Health checking and automatic replacement of unhealthy instances
- Rolling updates to transition to a new template
- Resizing (manual or automatic)
- Regional distribution (for regional MIGs)

MIGs come in two flavors: zonal (instances in one zone) and regional (instances spread across multiple zones in a region). Regional MIGs provide higher availability and are recommended for production workloads.
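To make the zonal versus regional distinction concrete, here is a minimal sketch that creates a regional MIG restricted to three zones. The group name and the zone list are assumptions chosen for illustration, and my-template is the template name used elsewhere in this chapter. As written, the block prints the command rather than running it.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GCLOUD=gcloud in the environment to run it for real.
GCLOUD="${GCLOUD:-echo gcloud}"

create_regional_mig() {
  # $1 = MIG name (hypothetical). --region makes the group regional;
  # --zones optionally limits which zones in that region are used.
  $GCLOUD compute instance-groups managed create "$1" \
    --template=my-template \
    --size=3 \
    --region=us-central1 \
    --zones=us-central1-a,us-central1-b,us-central1-f
}

create_regional_mig my-regional-mig
```

Replacing --region and --zones with a single --zone flag would create a zonal MIG instead.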

Autoscaling

Autoscaling adjusts the number of instances based on load. The scaling policy can use:
- CPU utilization (target utilization, e.g., 0.6)
- HTTP(S) load balancing utilization (based on load balancer metrics)
- Cloud Monitoring metrics (formerly Stackdriver), including custom metrics and built-in ones such as Pub/Sub queue depth

Key parameters:
- cool-down-period: seconds the autoscaler waits after an instance starts before it trusts that instance's metrics, so that initialization load does not skew scaling decisions (default 60s)
- max-num-replicas: maximum number of instances (required)
- min-num-replicas: minimum number (default 1)
- scale-in-control: limits how quickly the group scales in (e.g., max-scaled-in-replicas-percent over a time window)
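The parameters above can be combined in one set-autoscaling call. The sketch below is an illustration only: the 10 percent limit and the 600-second window are assumed values, and the block prints the command unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GCLOUD=gcloud in the environment to run it for real.
GCLOUD="${GCLOUD:-echo gcloud}"

set_autoscaling_with_scale_in_control() {
  # $1 = MIG name. Scale-in is capped at 10% of the group per
  # 600-second window (both values are illustrative assumptions).
  $GCLOUD compute instance-groups managed set-autoscaling "$1" \
    --min-num-replicas=2 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=90 \
    --scale-in-control=max-scaled-in-replicas-percent=10,time-window=600 \
    --zone=us-central1-a
}

set_autoscaling_with_scale_in_control my-mig
```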

Autoscaling uses a signal-based approach. The autoscaler collects metrics every 60 seconds and decides whether to add or remove instances. The decision is based on the desired number of instances calculated from the metric value and target. For example, if CPU utilization is 80% and target is 60%, the autoscaler will add instances to bring utilization down.
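The CPU example can be worked through with a simplified model: scale the current size by the ratio of observed to target utilization, round up, and clamp to the configured bounds. The function name and the model itself are assumptions for illustration; the real autoscaler also accounts for stabilization and scale-in controls.

```shell
# recommended_size CURRENT UTIL_PCT TARGET_PCT MIN MAX
# Simplified model: ceil(CURRENT * UTIL / TARGET), clamped to [MIN, MAX].
recommended_size() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  # Integer ceiling division: (a + b - 1) / b
  local size=$(( (current * util + target - 1) / target ))
  if [ "$size" -lt "$min" ]; then size=$min; fi
  if [ "$size" -gt "$max" ]; then size=$max; fi
  echo "$size"
}

# 10 instances at 80% CPU with a 60% target, bounds 2..20
recommended_size 10 80 60 2 20   # prints 14
```

With 10 instances at 80% against a 60% target, the model asks for ceil(10 × 80 / 60) = 14 instances, which spreads the same load thinly enough to land near the target.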

Health Checking and Autohealing

Autohealing relies on a health check, which is a separate resource attached to the MIG's autohealing policy rather than something defined in the instance template. The health check is a simple probe (HTTP, HTTPS, TCP, or SSL) that determines whether an instance is healthy. If an instance fails a configurable number of consecutive probes (the unhealthy threshold, default 2), the MIG considers it unhealthy and recreates it. Probes run every 5 seconds by default (configurable via checkIntervalSec in the API or --check-interval in gcloud).

Autohealing replaces unhealthy instances with new ones from the current template. This is distinct from autoscaling — autohealing maintains the current group size, while autoscaling changes the size.
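A minimal sketch of wiring this up: create an HTTP health check with the default interval and thresholds stated above, then attach it to a MIG with an initial delay. The health check name, port, request path, and 120-second delay are assumptions for illustration; the block prints both commands unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the commands instead of executing them.
# Set GCLOUD=gcloud in the environment to run them for real.
GCLOUD="${GCLOUD:-echo gcloud}"

setup_autohealing() {
  # $1 = MIG name, $2 = health check name (both hypothetical).
  $GCLOUD compute health-checks create http "$2" \
    --port=80 \
    --request-path=/health \
    --check-interval=5s \
    --timeout=5s \
    --unhealthy-threshold=2 \
    --healthy-threshold=2
  # Attach the check to the MIG's autohealing policy.
  $GCLOUD compute instance-groups managed update "$1" \
    --health-check="$2" \
    --initial-delay=120 \
    --zone=us-central1-a
}

setup_autohealing my-mig app-health-check
```

The probes only succeed if a firewall rule allows health check traffic to reach the instances on the chosen port.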

Rolling Updates

Rolling updates allow you to update instances in a MIG to a new template without downtime. There are two update types:
- Proactive: The MIG automatically replaces instances according to a rollout configuration (max surge, max unavailable).
- Opportunistic: The MIG does not replace instances on its own; the new template is applied only when an instance is created or recreated for another reason (e.g., autohealing, a resize, or a manual recreate).

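Key parameters:
- max-surge: number of extra instances created beyond the target size during the update (defaults to the number of zones the MIG spans, so 1 for a zonal MIG; 0 if the MIG has stateful configuration)
- max-unavailable: number of instances that can be unavailable at the same time (defaults to the number of zones the MIG spans, so 1 for a zonal MIG)
- min-ready-sec: minimum time an instance must run before it is considered ready (default 0)

max-surge and max-unavailable cannot both be 0, because the update would have no room to proceed.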
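The effect of the two limits is easy to check with arithmetic: during an update, serving capacity can drop to the target size minus max-unavailable, and the instance count can rise to the target size plus max-surge. The helper below is a hypothetical illustration of that bound, not part of gcloud.

```shell
# update_bounds TARGET_SIZE MAX_SURGE MAX_UNAVAILABLE
# Prints "<minimum available instances> <peak instance count>".
update_bounds() {
  local size=$1 surge=$2 unavailable=$3
  echo "$(( size - unavailable )) $(( size + surge ))"
}

# A 10-instance group updated with max-surge=3, max-unavailable=1
update_bounds 10 3 1   # prints "9 13"
```

The peak count matters for quota and cost planning, while the minimum matters for how much traffic the group can absorb mid-update.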

Example proactive update:

gcloud compute instance-groups managed rolling-action start-update my-mig \
    --version=template=new-template \
    --max-surge=3 \
    --max-unavailable=1 \
    --zone=us-central1-a
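For contrast, an opportunistic update uses the same command with --type=opportunistic. This is a minimal sketch that assumes the zonal my-mig group from this chapter; it prints the command unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GCLOUD=gcloud in the environment to run it for real.
GCLOUD="${GCLOUD:-echo gcloud}"

start_opportunistic_update() {
  # $1 = MIG name, $2 = new template name.
  # Existing instances keep running the old template until they are
  # created or recreated for some other reason.
  $GCLOUD compute instance-groups managed rolling-action start-update "$1" \
    --version=template="$2" \
    --type=opportunistic \
    --zone=us-central1-a
}

start_opportunistic_update my-mig new-template
```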

Stateful MIGs

By default, MIGs are stateless: all instances are identical and can be replaced. Stateful MIGs preserve state on specific disks or metadata per instance. You can define stateful configurations for:
- Stateful disks: Disks that persist even if the instance is recreated (e.g., for databases).
- Stateful metadata: Instance-specific metadata that survives updates.

Stateful MIGs are useful for workloads like Cassandra or Kafka where each node has unique data.
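As a hedged sketch, the two kinds of stateful configuration could be applied as follows: a group-wide stateful policy for a disk, and a per-instance configuration for metadata. The device name data-disk, the instance name, and the node-id key are all hypothetical. The block prints the commands unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the commands instead of executing them.
# Set GCLOUD=gcloud in the environment to run them for real.
GCLOUD="${GCLOUD:-echo gcloud}"

configure_stateful() {
  # $1 = MIG name, $2 = instance name (both hypothetical).
  # Group-wide policy: keep the disk with device name "data-disk"
  # (an assumed device name defined in the instance template).
  $GCLOUD compute instance-groups managed update "$1" \
    --stateful-disk=device-name=data-disk,auto-delete=never \
    --zone=us-central1-a
  # Per-instance configuration: metadata that survives recreation.
  $GCLOUD compute instance-groups managed instance-configs create "$1" \
    --instance="$2" \
    --stateful-metadata=node-id=1 \
    --zone=us-central1-a
}

configure_stateful my-mig my-mig-node-1
```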

Reserved Resources

You can have a MIG's VMs consume reservations to ensure capacity. Reservation affinity is set in the instance template: VMs can consume any matching reservation (first available) or only a specific named reservation. This is critical for workloads that require guaranteed capacity, such as GPU instances.
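A minimal sketch of a template that targets one named reservation is shown here. The template and reservation names are hypothetical, and the VM properties must match those of the reservation for it to be consumed. The block prints the command unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the command instead of executing it.
# Set GCLOUD=gcloud in the environment to run it for real.
GCLOUD="${GCLOUD:-echo gcloud}"

create_reserved_template() {
  # $1 = template name, $2 = reservation name (both hypothetical).
  $GCLOUD compute instance-templates create "$1" \
    --machine-type=n1-standard-4 \
    --reservation-affinity=specific \
    --reservation="$2"
}

create_reserved_template reserved-template my-reservation
```

Using --reservation-affinity=any instead lets VMs consume whichever matching reservation is available.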

Commands and Verification

List instance templates:

gcloud compute instance-templates list

Describe a template:

gcloud compute instance-templates describe my-template

Create a MIG:

gcloud compute instance-groups managed create my-mig \
    --template=my-template \
    --size=3 \
    --zone=us-central1-a

Set autoscaling:

gcloud compute instance-groups managed set-autoscaling my-mig \
    --max-num-replicas=10 \
    --min-num-replicas=2 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=90 \
    --zone=us-central1-a

Update template:

gcloud compute instance-groups managed set-instance-template my-mig \
    --template=new-template \
    --zone=us-central1-a

Rolling update:

gcloud compute instance-groups managed rolling-action start-update my-mig \
    --version=template=new-template \
    --max-surge=1 \
    --max-unavailable=0 \
    --min-ready=30s \
    --zone=us-central1-a
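To verify that an operation such as the rolling update above has finished, one option is to wait for the group to become stable and then list its instances. This is a sketch that assumes the zonal my-mig group; it prints the commands unless GCLOUD=gcloud is set.

```shell
# Dry run by default: prints the commands instead of executing them.
# Set GCLOUD=gcloud in the environment to run them for real.
GCLOUD="${GCLOUD:-echo gcloud}"

verify_mig() {
  # $1 = MIG name, $2 = zone.
  # Block until no instances are being created, deleted, or updated.
  $GCLOUD compute instance-groups managed wait-until "$1" --stable --zone="$2"
  # Show each instance with its status, current action, and template.
  $GCLOUD compute instance-groups managed list-instances "$1" --zone="$2"
}

verify_mig my-mig us-central1-a
```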

Interaction with Other Services

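Load Balancers: MIGs are often used as backends for HTTP(S) load balancers, network load balancers, or internal load balancers. Autohealing and load balancing can reference the same health check, but Google recommends separate ones, with a more conservative check for autohealing so that instances are not recreated because of brief slowdowns.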

Cloud DNS: Cloud DNS routing policies (e.g., geolocation routing) can direct clients to the load balancers or IP addresses in front of MIGs in different regions. MIG instances are not registered in Cloud DNS automatically.

Cloud CDN: If the MIG serves content, Cloud CDN can cache responses to reduce load.

Secret Manager: Startup scripts can fetch secrets from Secret Manager using the instance's service account.

Cloud Storage: Instances can mount Cloud Storage buckets using gcsfuse.

Performance and Limits

Maximum number of instances per MIG: 2000 for both zonal and regional MIGs (a regional MIG spreads them across zones).

Maximum number of MIGs per project: varies, typically 1000.

Autoscaler evaluation interval: 60 seconds.

Health check interval: minimum 1 second, default 5 seconds.

Rolling update timeouts: configurable, default 600 seconds per instance.

Instance templates: up to 1000 per project.

Best Practices

Use regional MIGs for production to survive zone failures.

Set meaningful health checks that test application readiness, not just TCP connectivity.

Use scale-in-control to prevent aggressive scaling down that causes thrashing.

For stateful workloads, use stateful MIGs or external persistent disks.

Always test rolling updates in a non-production environment first.

Use gcloud compute instance-groups managed list-instances to monitor instance status.

Walk-Through

1

Create an Instance Template

Define the VM blueprint using gcloud or the console. Specify all required properties: machine type, boot disk image or snapshot, network, service account, metadata, and startup script. The template is immutable and, by default, a global resource. For example, `gcloud compute instance-templates create web-template --image-family=debian-10 --image-project=debian-cloud --machine-type=e2-medium --subnet=default --tags=http-server`. The template is stored in the project and can be used by multiple MIGs. You can also create a template from an existing source instance using `--source-instance`.

2

Create a Managed Instance Group

Use the template to create a MIG. Specify the region or zone, initial size (number of instances), and optionally autoscaling, health check, and named ports. For example: `gcloud compute instance-groups managed create web-mig --template=web-template --size=3 --zone=us-central1-a`. The MIG controller immediately creates the instances. If you specify a regional MIG, instances are distributed across zones in the region. A MIG does not create a health check on its own; without one, it only recreates instances that are no longer running.

3

Configure Autoscaling

Attach an autoscaler to the MIG to automatically adjust the number of instances based on load. Use `gcloud compute instance-groups managed set-autoscaling` with parameters like `--max-num-replicas`, `--min-num-replicas`, and `--target-cpu-utilization`. The autoscaler evaluates metrics every 60 seconds and adds or removes instances to meet the target. For example, setting CPU target to 0.6 means the autoscaler tries to keep average CPU at 60%. During the cool-down period, the autoscaler ignores metrics from a newly started instance, so initialization load does not trigger a scaling decision.

4

Set Up Health Check and Autohealing

Define a health check for the MIG. This can be done at MIG creation or later with `gcloud compute instance-groups managed set-autohealing`. The health check probes instances at a configurable interval (default 5s). After a configurable number of consecutive failures (default 2), the instance is considered unhealthy and is recreated. For example: `gcloud compute instance-groups managed set-autohealing web-mig --health-check=http-health-check --initial-delay=120 --zone=us-central1-a`. During the initial delay, the MIG ignores failing probes on a new instance, which gives the application time to start.

5

Perform a Rolling Update

When you need to update the application or instance configuration, create a new template and initiate a rolling update. Use `gcloud compute instance-groups managed rolling-action start-update` with parameters like `--version=template=new-template`, `--max-surge`, and `--max-unavailable`. The MIG gradually replaces instances, ensuring the specified surge and unavailable limits are respected. For example, `--max-surge=1` creates one extra instance before taking down an old one, maintaining capacity. The update can be proactive or opportunistic.

What This Looks Like on the Job

Scenario 1: Web Application Scaling

A SaaS company runs a Node.js web app on Compute Engine. They use a regional MIG with 10 instances behind an HTTP(S) load balancer. The instance template includes a startup script that pulls the latest container image from Container Registry and runs it. Autoscaling is configured with CPU target 0.6, min replicas 3, max replicas 20. During a marketing campaign, traffic spikes and the autoscaler adds instances up to 20. The health check probes /health endpoint; if the app fails, the instance is replaced. Rolling updates are used to deploy new versions with max-surge=2 and max-unavailable=1. The company learned the hard way that setting max-surge=0 caused a brief outage when an instance took too long to start.

Scenario 2: Stateful Database Cluster

A financial services firm runs Cassandra on a stateful MIG. Each instance has a stateful persistent disk (SSD) that stores data. The MIG's stateful policy marks the disk as stateful via --stateful-disk; this is configured on the MIG, not in the instance template. The MIG has a fixed size of 5 (no autoscaling). Autohealing is configured with a health check that verifies Cassandra is accepting queries. If a node fails, the MIG recreates it with the same disk attached, preserving data. The team uses rolling updates with --max-unavailable=1 to upgrade Cassandra versions. They avoid max-surge because adding extra nodes would cause unnecessary data replication in the cluster.

Scenario 3: CI/CD Build Farm

A game development studio uses a zonal MIG of preemptible VMs for build jobs. The instance template uses a custom image with build tools. Autoscaling is based on a Cloud Pub/Sub queue depth metric: when jobs pile up, new instances are created. Preemptible instances are cheaper but can be terminated at any time. The MIG recreates preempted instances when capacity is available, to return to its target size. The team sets --min-num-replicas=0 to save cost when there are no jobs. They discovered that without a health check, instances whose build agent failed to start would remain in the group and waste resources. Now they use a health check that tests that the build agent is registered with the queue.
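The queue-based policy in Scenario 3 can be sketched in two parts: the arithmetic behind a per-instance assignment, and a set-autoscaling call that uses the Pub/Sub backlog metric. The subscription name, the assignment of 15 messages per instance, and the maximum of 20 are assumptions for illustration. The gcloud command is printed rather than executed unless GCLOUD=gcloud is set.

```shell
# instances_for_backlog BACKLOG PER_INSTANCE MAX
# Simplified model: ceil(BACKLOG / PER_INSTANCE), capped at MAX.
instances_for_backlog() {
  local backlog=$1 per_instance=$2 max=$3
  local size=$(( (backlog + per_instance - 1) / per_instance ))
  if [ "$size" -gt "$max" ]; then size=$max; fi
  echo "$size"
}

instances_for_backlog 100 15 20   # prints 7

# Dry run by default: prints the command instead of executing it.
GCLOUD="${GCLOUD:-echo gcloud}"

autoscale_on_pubsub() {
  # $1 = MIG name, $2 = Pub/Sub subscription ID (both hypothetical).
  $GCLOUD compute instance-groups managed set-autoscaling "$1" \
    --min-num-replicas=0 \
    --max-num-replicas=20 \
    --update-stackdriver-metric=pubsub.googleapis.com/subscription/num_undelivered_messages \
    --stackdriver-metric-filter="resource.type = pubsub_subscription AND resource.labels.subscription_id = $2" \
    --stackdriver-metric-single-instance-assignment=15 \
    --zone=us-central1-a
}

autoscale_on_pubsub build-mig build-jobs-sub
```

With a backlog of 100 messages and 15 messages assigned per instance, the model asks for 7 instances; an empty queue yields 0, which matches the team's min-num-replicas=0 setting.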

How ACE Actually Tests This

ACE Objective Coverage: This topic maps to Objective 1.2 (Configuring compute resources) and is tested in several question types. Expect 3-5 questions on instance templates and MIGs.

Common Wrong Answers and Why:
1. "You can modify an instance template after creation." This is false. Templates are immutable. Candidates confuse templates with instance configurations that can be updated. The exam tests this directly: if a question asks how to change the machine type of instances in a MIG, the correct answer is to create a new template and update the MIG.
2. "Autoscaling and autohealing are the same thing." They are not. Autoscaling changes the number of instances; autohealing replaces unhealthy instances while keeping the count constant. The exam may present a scenario where an instance fails and ask what happens: the answer is autohealing, not autoscaling.
3. "Stateful MIGs preserve all instance state by default." Only disks and metadata explicitly marked as stateful are preserved. The exam loves this nuance: if you need to preserve a specific disk, you must declare it as stateful in the MIG's stateful policy or in a per-instance configuration.
4. "Regional MIGs always distribute instances evenly across all zones in a region." The MIG aims for an even spread by default, but the result is not guaranteed to be perfectly even if zones have capacity constraints.

Numbers and Values to Memorize:
- Default health check interval: 5 seconds.
- Default health check unhealthy threshold: 2 consecutive failures.
- Autoscaler evaluation interval: 60 seconds.
- Default cool-down period: 60 seconds.
- Max instances per MIG: 2000.
- Rolling update default timeout per instance: 600 seconds.
- max-surge and max-unavailable default: the number of zones the MIG spans (1 for a zonal MIG). They cannot both be 0.

Edge Cases:
- If a MIG has min-num-replicas=0 and there is no load, the autoscaler can reduce the group to 0 instances. Autohealing does not bring instances back in that case, because the target size is 0.
- When using a regional MIG with autoscaling, the autoscaler considers the aggregate load across all zones.
- You cannot delete an instance template while a MIG still references it; the delete request fails. Update the MIG to a new template before deleting the old one.

Elimination Strategy: For multiple-choice questions, look for keywords: "immutable" for templates, "autohealing" for health check replacement, "stateful" for preserved disks, "rolling update" for gradual changes. If an answer says "modify the template," it is wrong. If an answer says "autoscaling will replace the instance," it is wrong if the question is about a single instance failure.

Key Takeaways

Instance templates are immutable blueprints for VM instances; you must create a new template to change properties.

MIGs automate instance management: autoscaling, autohealing, rolling updates, and regional distribution.

Autoscaling uses metrics (CPU, LB utilization, custom) and evaluates every 60 seconds; cool-down period defaults to 60 seconds.

Autohealing replaces unhealthy instances based on health checks; default check interval is 5 seconds, unhealthy threshold is 2.

Rolling updates use max-surge and max-unavailable to control update pace; each defaults to the number of zones the MIG spans (1 for a zonal MIG), and they cannot both be 0.

Regional MIGs distribute instances across zones for high availability; zonal MIGs are simpler but less resilient.

Stateful MIGs preserve specific disks and metadata; all other state is ephemeral.

Maximum instances per MIG is 2000; maximum instance templates per project is 1000.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Zonal MIG

All instances in a single zone

Lower availability – a zone failure takes down the entire group

Simpler to manage and debug

Lower latency for intra-zone traffic

Can use zonal resources like zonal persistent disks

Regional MIG

Instances spread across multiple zones in a region

Higher availability – survives zone failures

More complex – requires cross-zone load balancing

Slightly higher inter-zone latency

Cannot use zonal disks that are tied to a single zone

Watch Out for These

Mistake

Instance templates can be updated after creation.

Correct

Instance templates are immutable. To change properties, you must create a new template and update the MIG to use it via `set-instance-template` or a rolling update.

Mistake

Autoscaling and autohealing are the same thing.

Correct

Autoscaling adjusts the number of instances based on load. Autohealing replaces unhealthy instances while keeping the group size constant. They are separate mechanisms.

Mistake

Regional MIGs automatically distribute instances evenly across all zones.

Correct

A regional MIG's target distribution shape defaults to EVEN, so the MIG tries to keep per-zone counts balanced, but the spread may still be uneven when a zone lacks capacity. You can set the shape explicitly with `--target-distribution-shape` (for example, `EVEN`, `BALANCED`, or `ANY`).

Mistake

Stateful MIGs preserve all instance disks by default.

Correct

Only disks explicitly declared as stateful in the MIG's stateful policy or in a per-instance configuration are preserved. Other disks are recreated from the instance template when the instance is recreated.

Mistake

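Instance templates are regional, so each region needs its own copy.

Correct

Instance templates are global resources by default, so a single template can back MIGs in multiple regions. A template that references a regional or zonal resource (such as a specific subnet) is limited to where that resource exists, and regional instance templates are available if you want to scope a template to one region.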

Frequently Asked Questions

Can I change the machine type of instances in a managed instance group?

Yes, but you must create a new instance template with the desired machine type and then update the MIG to use that template. `gcloud compute instance-groups managed set-instance-template` changes the template used for instances created from then on, while a rolling update (`rolling-action start-update`) also recreates the existing instances with the new machine type. You cannot change the template's properties directly because templates are immutable.

What is the difference between a managed instance group and an unmanaged instance group?

A managed instance group (MIG) is created from an instance template and automatically manages instances: it creates, deletes, heals, and updates them. An unmanaged instance group is a collection of existing instances that you manage manually; it does not support autoscaling, autohealing, or rolling updates. Unmanaged groups are rarely used in modern GCP deployments.

How do I preserve data on an instance when it is recreated by autohealing?

Use a stateful MIG and mark the disk as stateful in the MIG's stateful policy or in a per-instance configuration, not in the instance template. For example, `gcloud compute instance-groups managed create my-mig --template=my-template --size=3 --zone=us-central1-a --stateful-disk=device-name=my-disk,auto-delete=never`. When an instance is recreated, the same disk is reattached to it.

What happens if I delete an instance template that is in use by a MIG?

You cannot: Compute Engine rejects the delete request while any MIG still references the template. Update the MIG to a new template first, then delete the old one.

Can I use a MIG with preemptible VMs?

Yes. You can set the instance template to use preemptible VMs. The MIG will create preemptible instances, and if they are preempted, the MIG will recreate them (if the group size is maintained). Autoscaling can also use preemptible VMs. Note that preemptible instances may not always be available, so consider using regular VMs for critical workloads.

How do I perform a rolling update without downtime?

Set `--max-surge` to a value greater than 0 (e.g., 1) to create extra instances before taking down old ones, and set `--max-unavailable` to 0 to ensure that the desired number of instances are always serving. This way, new instances start accepting traffic before old ones are terminated.

What are the limits for instance templates and MIGs per project?

By default, you can have up to 1000 instance templates per project. The maximum number of MIGs per project is typically 1000, but this can be increased via quota request. Each MIG can have up to 2000 instances. Regional MIGs have the same instance limit but spread across zones.
