This chapter covers rightsizing and resource optimization in Google Cloud, a critical skill for the Digital Leader exam. Rightsizing is the practice of continuously adjusting cloud resources (especially VM machine types) to match actual workload demand, eliminating waste and reducing costs. Approximately 10–15% of exam questions touch on this topic, often asking you to identify the best tool (Recommender, Committed Use Discounts, etc.) or interpret recommendations. You will learn the mechanics of Google Cloud's rightsizing recommendations, how to implement them, and common pitfalls that lead to exam mistakes. The focus is on practical, exam-relevant knowledge: lookback periods (8 days by default, up to 30 days), confidence levels, and the difference between rightsizing and autoscaling.
Rightsizing a VM in Google Cloud is like having a tailor adjust a suit instead of buying a new one every time you gain or lose weight. In a traditional on-premises data center, you might buy a large server (the suit) in anticipation of growth and overpay for capacity you never use. With Google Cloud, you can start with a standard-size VM (a suit off the rack) and let the tailor (Google's Recommender) measure how much CPU and memory you actually use. The tailor doesn't guess: it observes your usage patterns, including peak hours and idle times, and suggests a precise adjustment. The 'seam' is the ability to change machine types without rebuilding the whole suit: you stop the VM, change the machine type, and restart it, with no re-architecture and no data migration. This is far more efficient than 'closet shopping' (creating new VMs and migrating workloads), which wastes time and risks data inconsistency.
The Recommender analyzes metrics from Cloud Monitoring and provides actionable recommendations with confidence levels and estimated cost savings. Just as a tailor's adjustments are reversible (you can always let the seams back out), rightsizing is non-destructive: you can resize a VM up or down, or even move to a different family (e.g., from N2 to E2), as long as the new machine type is available in the VM's zone and compatible with its configuration. The key insight is that rightsizing is a continuous process, not a project. You cannot 'set and forget': workloads change, and the tailor should revisit the fit regularly, for example every quarter.
What is Rightsizing and Why Does It Exist?
Rightsizing is the process of resizing cloud resources (primarily Compute Engine VMs) to better match actual usage patterns. In on-premises environments, hardware is typically provisioned for peak load plus a safety margin, leading to significant underutilization. Google Cloud's pay-per-use model makes overprovisioning an immediate cost drain. Rightsizing aims to eliminate that waste by moving from 'sizing for peak' to 'sizing for typical' and then handling peaks with autoscaling or other elasticity mechanisms.
The Google Cloud Digital Leader exam expects you to understand that rightsizing is a continuous optimization activity, distinct from autoscaling (which adjusts capacity dynamically) and from purchasing commitments (like Committed Use Discounts). Rightsizing changes the *size* of individual resources; autoscaling changes the *number* of resources.
How Rightsizing Works Internally
Google Cloud's Rightsizing Recommendations are generated by the Recommender service, which analyzes VM utilization metrics collected by Cloud Monitoring. The process works as follows:
Data Collection: Cloud Monitoring agents (or guest-agent metrics) collect CPU utilization, memory utilization (if enabled), and network throughput every 60 seconds. For memory, you must install the Cloud Monitoring agent or use a custom metric; otherwise, memory recommendations are not available.
Aggregation: The Recommender aggregates these metrics over a lookback period (default 8 days, can be set to 30 days). It calculates percentiles: typically the 50th, 95th, and 99th percentiles for CPU and memory.
Recommendation Generation: The Recommender compares current VM size to a set of candidate machine types (within the same family or across families) and identifies the cheapest type that can handle the workload's peak demand (usually based on the 95th percentile). It also considers whether the workload is CPU-bound, memory-bound, or balanced.
Confidence Score: Each recommendation includes a confidence level: High, Medium, or Low. High confidence means the workload has stable patterns and the recommendation is very likely to succeed. Low confidence indicates erratic usage or insufficient data.
Cost Impact: The recommender calculates the monthly cost savings if the recommendation is applied.
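To see why a high percentile, rather than the average or the maximum, is a sensible sizing target, here is a minimal Python sketch. The utilization samples are made up, and the nearest-rank percentile method is an assumption chosen for illustration; it is not a description of how the Recommender aggregates data internally.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical CPU utilization samples (percent): mostly idle, with short bursts.
cpu_samples = [10] * 90 + [60] * 8 + [95] * 2

summary = {
    "mean": mean(cpu_samples),
    "p50": percentile(cpu_samples, 50),
    "p95": percentile(cpu_samples, 95),
    "max": max(cpu_samples),
}
print(summary)  # {'mean': 15.7, 'p50': 10, 'p95': 60, 'max': 95}
```

In this toy series, sizing to the mean (15.7%) would starve the bursts, sizing to the maximum (95%) would pay for capacity needed 2% of the time, and the 95th percentile (60%) sits between the two.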
Key Components, Values, Defaults, and Timers
Lookback period: Default 8 days. Can be changed to 30 days via the Cloud Console or API. The exam may ask: 'How many days of data does the Recommender use by default?' Answer: 8 days.
Percentile used: 95th percentile for CPU and memory. The exam might not ask the exact percentile, but know that it's neither the average nor the maximum (100th percentile).
Confidence levels: High, Medium, Low. High means >80% chance that the recommendation will work without performance degradation.
Machine types: Recommendations may suggest moving to a different series (e.g., from n2-standard-4 to e2-standard-4) if the workload is not sensitive to CPU platform differences.
Memory recommendations: Only generated if the guest agent reports memory usage. Without the agent, only CPU-based recommendations are provided.
Configuration and Verification Commands
To list rightsizing recommendations via gcloud:
gcloud recommender recommendations list \
--project=PROJECT_ID \
--location=LOCATION \
--recommender=google.compute.instance.MachineTypeRecommender \
--format=json
To apply a recommendation (resize VM):
gcloud compute instances set-machine-type INSTANCE_NAME \
--zone=ZONE \
--machine-type=MACHINE_TYPE
Note: The VM must be in a stopped state to change its machine type. The exam may test that you must stop the VM first.
To view recommendations in the console: Compute Engine > Recommendations > Rightsizing recommendations.
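Because the VM has to be stopped before its machine type can change, applying a recommendation is really a three-step cycle: stop, resize, start. The following Python sketch builds those gcloud invocations and, by default, only prints them. The instance name, zone, and target machine type are hypothetical placeholders, and the helper functions are illustrative rather than part of any Google tooling.

```python
import subprocess


def resize_commands(instance, zone, machine_type):
    """Return the gcloud invocations for a stop / resize / start cycle, in order."""
    base = ["gcloud", "compute", "instances"]
    return [
        base + ["stop", instance, f"--zone={zone}"],
        base + ["set-machine-type", instance, f"--zone={zone}",
                f"--machine-type={machine_type}"],
        base + ["start", instance, f"--zone={zone}"],
    ]


def resize(instance, zone, machine_type, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in resize_commands(instance, zone, machine_type):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing step


# Hypothetical instance, zone, and target type; the dry run only prints the commands.
resize("my-vm", "us-central1-a", "e2-standard-4")
```

Keeping the dry run as the default makes it easy to review the exact commands before anything touches a running workload.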
Interaction with Related Technologies
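Committed Use Discounts (CUDs): Rightsizing can affect CUDs. Resource-based commitments apply to a specific machine series in a region, so if you move a CUD-covered VM to a different series (for example, from N2 to E2), the new usage may no longer be covered while you keep paying for the commitment. Check commitment coverage before applying a recommendation.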
Autoscaling: Rightsizing is complementary. Autoscaling handles variable load by adding/removing instances; rightsizing ensures each instance is appropriately sized. The exam may ask: 'Should you rightsize or autoscale first?' The answer is usually rightsize first, then autoscale.
Sole-tenant nodes: Rightsizing recommendations are available but may be limited by node constraints.
Preemptible VMs: Rightsizing is less relevant for preemptible (Spot) VMs because they can be terminated at any time and rarely accumulate enough utilization history; instead, choose a machine type up front, for example by benchmarking.
Common Exam Trap: Rightsizing vs. Autoscaling
A frequent exam question describes a scenario with variable load and asks which tool to use. If the question mentions 'adjusting instance count dynamically', the answer is autoscaling. If it mentions 'resizing an individual VM to match its typical usage', the answer is rightsizing. Another trap: assuming rightsizing is a one-time activity. The exam emphasizes it should be performed periodically (quarterly).
Edge Cases
Memory-intensive workloads without agent: The Recommender cannot generate memory-based recommendations. The exam might present a scenario where a VM is memory-overloaded but CPU is low, and ask why no recommendation exists. Answer: memory metrics are missing.
Very short-lived VMs (under 8 days): The Recommender needs at least 8 days of data. For short-lived VMs, use a generic sizing guideline or migrate to a managed service.
GPU-attached VMs: Rightsizing recommendations for GPU are limited; you may need to manually resize.
Best Practices for GCDL Exam
Remember the default lookback: 8 days.
Remember that memory metrics require the Cloud Monitoring agent.
Understand that rightsizing is about *changing machine type*, not number of instances.
Know that the recommender uses 95th percentile, not average or max.
Be able to distinguish rightsizing from autoscaling and CUDs.
Enable Cloud Monitoring Agent
To get memory-based rightsizing recommendations, you must install the Cloud Monitoring agent on each VM (in current Google Cloud documentation this role is filled by the Ops Agent, which supersedes the legacy Monitoring agent). Without it, only CPU metrics are available. The agent sends memory usage every 60 seconds to Cloud Monitoring. This step is often overlooked; the exam assumes you know that memory recommendations require the agent. If a VM shows no memory recommendation, the most likely cause is a missing agent.
Collect Utilization Data
Once the agent is installed, Cloud Monitoring collects CPU and memory utilization metrics. The Recommender starts analyzing once at least 8 days of data are available, so the VM should be running for most of that period. If the VM is stopped or preempted frequently, it takes longer to accumulate enough data. The exam may test that the default lookback is 8 days, but you can configure up to 30 days for more stable recommendations.
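The data requirement can be expressed as a simple check. The sketch below is a simplified model, not the Recommender's actual eligibility logic: it only compares the age of the first metric sample to the lookback window, using the 8-day default and 30-day maximum stated in this chapter. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_LOOKBACK_DAYS = 8   # default stated in this chapter
MAX_LOOKBACK_DAYS = 30      # maximum stated in this chapter


def has_enough_history(first_sample, now, lookback_days=DEFAULT_LOOKBACK_DAYS):
    """True when the metric history spans at least the lookback window."""
    if not 0 < lookback_days <= MAX_LOOKBACK_DAYS:
        raise ValueError("lookback_days must be between 1 and 30")
    return now - first_sample >= timedelta(days=lookback_days)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
ci_runner = now - timedelta(hours=4)    # short-lived VM, like a CI job
web_server = now - timedelta(days=12)   # long-running VM

print(has_enough_history(ci_runner, now))       # False
print(has_enough_history(web_server, now))      # True
print(has_enough_history(web_server, now, 30))  # False
```

The last call shows the trade-off of a longer window: a 30-day lookback gives steadier recommendations but delays the first one.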
Generate Recommendations
After sufficient data, the Recommender computes the 95th percentile of CPU and memory usage. It then compares the current machine type to all possible machine types (within the same family and across families) and selects the cheapest that can handle the 95th percentile load. The recommendation includes the suggested machine type, estimated monthly savings, and a confidence level. The exam may ask: 'What percentile does the Recommender use?' Answer: 95th.
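The selection step described above can be modeled in a few lines of Python. This is a toy model under stated assumptions, not Google's algorithm: the monthly prices are invented for illustration, vCPUs are treated as interchangeable across series (real series differ in per-core performance), and no safety headroom is added on top of the 95th-percentile demand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineType:
    name: str
    vcpus: int
    memory_gb: int
    monthly_usd: float  # hypothetical price, for illustration only


CATALOG = [
    MachineType("e2-standard-2", 2, 8, 50.0),
    MachineType("e2-standard-4", 4, 16, 100.0),
    MachineType("n2-standard-2", 2, 8, 70.0),
    MachineType("n2-standard-4", 4, 16, 140.0),
    MachineType("n2-standard-8", 8, 32, 280.0),
]


def recommend(current, p95_cpu, p95_mem, catalog=CATALOG):
    """Cheapest type whose vCPUs and memory cover the p95 demand of `current`.

    p95_cpu and p95_mem are fractions (0.0 to 1.0) of the current type's capacity.
    Returns the chosen type and the estimated monthly savings.
    """
    need_vcpus = current.vcpus * p95_cpu
    need_mem = current.memory_gb * p95_mem
    fits = [m for m in catalog if m.vcpus >= need_vcpus and m.memory_gb >= need_mem]
    best = min(fits, key=lambda m: m.monthly_usd)
    return best, current.monthly_usd - best.monthly_usd


current = CATALOG[4]  # n2-standard-8
best, savings = recommend(current, p95_cpu=0.40, p95_mem=0.45)
print(best.name, savings)  # e2-standard-4 180.0
```

Here the p95 demand is 3.2 vCPUs and 14.4 GB, so every 4-vCPU, 16 GB type fits and the cheapest one in the made-up catalog wins.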
Review and Apply Recommendation
In the Cloud Console, navigate to Compute Engine > Recommendations. Each recommendation shows current vs. recommended machine type, cost difference, and confidence. To apply one, stop the VM, change the machine type (with the set-machine-type command or in the console), and start it again. The VM keeps its internal IP across the stop and start, but an ephemeral external IP may change unless you reserve a static one. The exam tests that you must stop the VM before resizing. Also, if the VM is part of a managed instance group, you must update the instance template instead.
Monitor Post-Change Performance
After resizing, monitor the VM's performance for at least a few days to ensure it meets workload demands. If performance degrades (e.g., CPU consistently above 80%), consider rolling back or choosing a larger machine type. The Recommender may generate a new recommendation after the change. The exam emphasizes that rightsizing is iterative; you should review recommendations quarterly.
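A post-change check along these lines can be automated. The Python sketch below uses the 80% CPU figure from the text as the threshold; the definition of "consistently" (90% of samples above the threshold) is a hypothetical choice, and the sample series are invented.

```python
def needs_rollback(cpu_samples, threshold=80.0, sustained_fraction=0.9):
    """Flag a resize for review when most samples exceed the threshold.

    threshold follows the 80% example in the text; sustained_fraction is a
    hypothetical definition of "consistently".
    """
    if not cpu_samples:
        return False  # no data yet, nothing to conclude
    over = sum(1 for s in cpu_samples if s > threshold)
    return over / len(cpu_samples) >= sustained_fraction


# Hypothetical CPU utilization (percent) observed after two different resizes.
healthy = [35, 42, 55, 61, 48, 85, 40, 38, 52, 47]
saturated = [88, 91, 86, 93, 97, 84, 90, 79, 95, 92]

print(needs_rollback(healthy))    # False
print(needs_rollback(saturated))  # True
```

A single spike above 80%, as in the first series, is not a reason to roll back; sustained saturation, as in the second, is.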
Enterprise Scenario 1: E-commerce Platform with Seasonal Spikes
A large online retailer runs its catalog service on 50 n2-standard-8 VMs. During Black Friday, load spikes 5x, but for 11 months, average CPU is 15% and memory 20%. The cloud engineer uses rightsizing recommendations to downsize to n2-standard-4 for the base fleet. For seasonal peaks, they configure a managed instance group with autoscaling based on CPU utilization, adding up to 50 extra n2-standard-4 VMs. This reduces baseline costs by 50% while maintaining performance. The common mistake: rightsizing the entire fleet to handle peak load, negating the benefit of cloud elasticity. The correct approach is to rightsize for typical load and autoscale for peaks.
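The arithmetic behind this scenario can be checked with a short Python calculation. The hourly price is hypothetical, and the sketch assumes that an n2-standard-8 costs twice as much as an n2-standard-4 (price scaling with vCPUs and memory) and that, in the worst case, all 50 extra VMs run for one full month.

```python
HOURS_PER_MONTH = 730
PRICE_N2_STD_4 = 0.20                 # hypothetical USD per hour
PRICE_N2_STD_8 = 2 * PRICE_N2_STD_4   # assumes price scales with vCPUs and memory


def monthly_cost(vm_count, hourly_price):
    """On-demand cost of running vm_count VMs for a full month."""
    return vm_count * hourly_price * HOURS_PER_MONTH


before_month = monthly_cost(50, PRICE_N2_STD_8)
after_month = monthly_cost(50, PRICE_N2_STD_4)
baseline_saving_pct = 100 * (before_month - after_month) / before_month

# Worst case for the peak: all 50 extra VMs run for one full month.
before_year = 12 * before_month
after_year = 12 * after_month + monthly_cost(50, PRICE_N2_STD_4)
annual_saving_pct = 100 * (before_year - after_year) / before_year

print(f"baseline saving: {baseline_saving_pct:.1f}%")                # 50.0%
print(f"annual saving incl. peak month: {annual_saving_pct:.1f}%")   # 45.8%
```

Even when the peak month is charged at full autoscaled capacity, most of the baseline saving survives over the year under these assumptions.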
Scenario 2: Financial Services with Strict Compliance
A bank runs a risk-analysis application on 100 n2-highmem-32 VMs. The application is memory-bound but CPU-light. The engineer installs Cloud Monitoring agents and after 30 days receives recommendations to switch to n2-highmem-16, cutting memory in half. However, the bank's compliance team requires that all VMs be on committed use discounts (CUDs) for cost predictability. The engineer must check if the new machine type is covered by existing CUDs. If not, they may need to purchase new CUDs or accept on-demand pricing. The exam might ask: 'What is a risk of applying rightsizing recommendations to CUD-covered VMs?' Answer: losing the discount if the new machine type is not in the same CUD pool.
Scenario 3: DevOps Test Environments
A startup runs 20 preemptible VMs for CI/CD. These VMs are short-lived (average 4 hours). Rightsizing recommendations are not generated because the lookback period requires 8 days of continuous data. Instead, the team uses a fixed machine type (e2-standard-2) chosen based on benchmarking. The exam may present a scenario with short-lived VMs and ask why no recommendations appear. The answer: insufficient data (less than 8 days).
Common Misconfigurations
Applying rightsizing recommendations without checking if the VM is part of a managed instance group; you must update the instance template, not just the individual VM.
Resizing a VM that relies on an ephemeral external IP, which can change when the VM is stopped and restarted and break client connections.
Ignoring confidence levels: applying a low-confidence recommendation without manual validation.
Not verifying that the Cloud Monitoring agent is still running and reporting memory metrics after a machine type change.
GCDL Exam Focus on Rightsizing
The Google Cloud Digital Leader exam tests rightsizing under Domain 1 (Digital Transformation) Objective 1.2: 'Identify key Google Cloud tools for optimizing cloud costs.' Specific objectives include:
Recognizing the Google Cloud Recommender as the tool for rightsizing.
Understanding the difference between rightsizing (changing machine type) and autoscaling (changing instance count).
Knowing default parameters: 8-day lookback, 95th percentile, confidence levels.
Identifying when memory recommendations are unavailable (missing agent).
Most Common Wrong Answers and Why
'Use autoscaling instead of rightsizing' – This is wrong when the question asks about resizing a single VM. Autoscaling adjusts count, not size. Candidates confuse the two because both optimize cost.
'Rightsizing is a one-time activity' – The exam emphasizes continuous optimization. Candidates think it's a project, but it's a process.
'The Recommender uses average utilization' – Many assume the average is used, but the Recommender uses the 95th percentile to avoid underprovisioning.
'You can resize a running VM' – The exam tests that you must stop the VM. Candidates may think live migration allows it, but for machine type changes, stop is required.
Specific Numbers and Terms
8 days: Default lookback period.
30 days: Maximum configurable lookback.
95th percentile: Utilization metric used.
Confidence levels: High, Medium, Low.
MachineTypeRecommender: The recommender ID.
Cloud Monitoring agent: Required for memory recommendations.
Edge Cases and Exceptions
GPU VMs: Rightsizing recommendations may not be available; manual sizing required.
VMs with local SSDs: Machine type changes may not be supported; check documentation.
Preemptible VMs: Typically receive no recommendations because they rarely run long enough to accumulate 8 days of data.
VMs in managed instance groups: Apply changes to the instance template, not individual VMs.
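For the managed instance group case, the change goes through a new instance template followed by a rolling update. The Python sketch below only builds and prints the two gcloud invocations; the group, zone, and template names are hypothetical, and a real template would also need to carry over the image, disk, and network settings of the old one, which are omitted here.

```python
def mig_resize_commands(mig, zone, new_template, machine_type):
    """gcloud invocations to roll a MIG onto a template with a new machine type."""
    return [
        ["gcloud", "compute", "instance-templates", "create", new_template,
         f"--machine-type={machine_type}"],
        ["gcloud", "compute", "instance-groups", "managed", "rolling-action",
         "start-update", mig, f"--version=template={new_template}",
         f"--zone={zone}"],
    ]


# Hypothetical group, zone, and template names; nothing is executed here.
for cmd in mig_resize_commands("catalog-mig", "us-central1-a",
                               "catalog-e2-v2", "e2-standard-4"):
    print(" ".join(cmd))
```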
How to Eliminate Wrong Answers
If the question mentions 'dynamic scaling based on load' and 'adding/removing instances', the answer is autoscaling, not rightsizing.
If the question says 'adjusting the size of an existing VM', it's rightsizing.
If the question says 'cost optimization tool that analyzes historical utilization', it's the Recommender.
If the question says 'requires 8 days of data', it's rightsizing recommendations.
If the question says 'memory recommendations not available', suspect missing agent.
Rightsizing adjusts VM machine types to match actual usage; it is not the same as autoscaling.
The Google Cloud Recommender generates rightsizing recommendations using a default 8-day lookback period.
The Recommender uses the 95th percentile of CPU and memory utilization (memory only if agent installed).
You must stop a VM before changing its machine type.
Memory recommendations require the Cloud Monitoring agent.
Rightsizing should be performed regularly (quarterly) as a continuous process.
For managed instance groups, apply rightsizing changes to the instance template, not individual VMs.
High confidence recommendations have >80% likelihood of success; low confidence needs manual validation.
These come up on the exam all the time. Here's how to tell them apart.
Rightsizing (Machine Type Change)
Changes the size (machine type) of individual VMs.
Requires stopping the VM to apply.
Based on historical utilization (95th percentile over 8-30 days).
Reduces cost by matching resource capacity to typical demand.
Best for stable, predictable workloads.
Autoscaling (Instance Count Change)
Changes the number of VM instances in a group.
Works on running instances without stopping (adds/removes).
Based on real-time metrics (CPU, load balancing capacity, etc.).
Reduces cost by scaling down during low demand.
Best for variable workloads with unpredictable spikes.
Mistake
Rightsizing and autoscaling are the same thing.
Correct
Rightsizing changes the machine type (size) of a VM to match typical usage. Autoscaling changes the number of VM instances dynamically based on load. They are complementary but distinct. The exam asks you to differentiate them.
Mistake
The Recommender always includes memory recommendations.
Correct
Memory recommendations require the Cloud Monitoring agent to be installed on the VM. Without it, only CPU-based recommendations are provided. The exam may present a scenario with no memory recommendation and ask why.
Mistake
You can resize a VM while it is running.
Correct
To change the machine type of a VM, you must first stop the VM. Live resize is not supported for machine type changes (though some other changes, such as increasing the size of a persistent disk, can be made while the VM is running). The exam tests this prerequisite.
Mistake
Rightsizing is a one-time project done during migration.
Correct
Rightsizing is a continuous process. Workloads change over time, so you should review recommendations quarterly. The exam emphasizes ongoing optimization.
Mistake
The Recommender uses average utilization to make recommendations.
Correct
It uses the 95th percentile of utilization to ensure adequate capacity for peak loads within the observation period. Using average would lead to underprovisioning.
Google recommends reviewing rightsizing recommendations at least quarterly. Workloads change over time, and a machine type that was appropriate six months ago may now be overprovisioned or underprovisioned. The exam emphasizes that rightsizing is an ongoing process, not a one-time event.
Rightsizing changes the machine type (size) of a single VM instance to match its typical usage. Autoscaling changes the number of instances in a group based on real-time load. Rightsizing is about fitting the instance to the workload; autoscaling is about fitting the fleet to the demand. Both optimize cost but operate at different levels.
Memory recommendations require the Cloud Monitoring agent to be installed on the VM. Without the agent, Google Cloud cannot collect memory utilization data. Install the agent (or use a custom metric) and wait at least 8 days for recommendations to appear. The exam tests this prerequisite.
No. To change the machine type, you must stop the VM. Some other changes, such as increasing the size of a persistent disk, can be made while the VM is running, but adding or removing GPUs also requires stopping it. The exam often asks: 'What must you do before changing a VM's machine type?' Answer: Stop the VM.
The Recommender uses the 95th percentile of CPU and memory utilization. This ensures the recommended machine type can handle peak loads within the observation period, avoiding underprovisioning. The average is not used because sizing to it would lead to frequent performance issues.
You cannot resize individual VMs in a managed instance group directly. Instead, you must update the instance template with the new machine type and then perform a rolling update or recreate the instances. The exam may test that rightsizing for MIGs requires template changes.
The default lookback period is 8 days. You can configure it up to 30 days for more stable recommendations. The exam often asks for the default value (8 days).