This chapter covers cost optimization strategies on Google Cloud, a critical topic for the GCDL exam (Objective 1.2: Digital Transformation). Approximately 15-20% of exam questions touch on cost management, pricing models, and optimization techniques. You'll learn how to reduce waste, choose the right pricing plans, and leverage Google Cloud's native tools to control spending without sacrificing performance or reliability.
Imagine a buffet dinner where you pay per item you take. You have a large tray, and you can take as much as you want, but you're charged for every ounce. Now, you could grab a huge pile of expensive lobster, but if you only eat half, you still pay for the full pile. Alternatively, you could take smaller portions of cheaper items and only what you'll eat. Google Cloud's cost optimization is like being a savvy buffet diner: you choose the right-sized plate (instance type), take only what you need (rightsizing), and use a loyalty card (committed use discounts) to get a discount on your favorite dishes. You can also eat for far less if you accept a table the restaurant may reclaim at short notice (preemptible VMs for batch jobs). The key is to avoid waste: don't pile on expensive items you won't consume, and always check the menu (pricing model) before you fill your plate.
Understanding Cloud Cost Fundamentals
Cost optimization on Google Cloud begins with understanding the pay-per-use model. Unlike on-premises, where you buy hardware upfront, cloud costs are operational expenses (OpEx) that scale with usage. The key levers are compute, storage, networking, and managed services. For the GCDL exam, you need to know the four pillars of cost optimization:
- Rightsizing: matching resources to actual demand
- Discounts: using committed use, sustained use, and preemptible VMs
- Storage optimization: choosing the right storage class and lifecycle policies
- Architecture optimization: using serverless, autoscaling, and managed services
Compute Pricing Models
Google Cloud offers four primary compute pricing models:
On-Demand: Pay per second (minimum 1 minute) for compute instances. No commitment, highest cost per hour. Ideal for short-term, unpredictable workloads.
Committed Use Discounts (CUD): In exchange for a 1- or 3-year commitment, either to a minimum amount of resources (such as vCPUs and memory in a region) or to a minimum spend (e.g., $1,000/month on vCPUs and memory), you get a discount of up to 57% for most machine types. Once purchased, CUDs are applied automatically to matching usage; you do not assign them to individual VMs. You can purchase CUDs for:
- vCPU and memory (general-purpose, memory-optimized, compute-optimized)
- GPUs (separate commitment)
- Local SSDs (separate commitment)
Sustained Use Discounts (SUD): Automatic discounts for running eligible instances for a significant portion of the month. Once usage passes 25% of the month, each additional block of usage is billed at a lower rate, so the net discount grows with usage and reaches up to 30% for an instance that runs the entire month. SUDs are calculated monthly, per region, and apply only to certain machine series (for example N1, but not E2).
Preemptible VMs: Compute instances that Google Cloud can stop at any time, and that always stop after at most 24 hours of runtime, but that cost 60-91% less than on-demand. Ideal for batch jobs, fault-tolerant workloads, and stateless applications. No SLA.
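To see how the "up to" percentages above arise, here is a minimal Python sketch. The on-demand rate is a made-up number, and the tier table is the incremental schedule Google has documented for N1 machine types (other series use different tiers), so treat the output as an illustration rather than a price quote.

```python
# Illustrative only: the on-demand rate is hypothetical, not a real price.
ON_DEMAND_HOURLY = 0.10  # assumed USD per hour

# Incremental sustained-use tiers documented for N1 machine types:
# (upper bound as fraction of the month, fraction of base rate charged).
SUD_TIERS = [(0.25, 1.00), (0.50, 0.80), (0.75, 0.60), (1.00, 0.40)]


def sud_multiplier(fraction_of_month: float) -> float:
    """Return the effective price multiplier after sustained use discounts."""
    if not 0 < fraction_of_month <= 1:
        raise ValueError("fraction_of_month must be in (0, 1]")
    billed = 0.0
    lower = 0.0
    for upper, rate in SUD_TIERS:
        # Portion of this tier that the usage actually reaches.
        used = max(0.0, min(fraction_of_month, upper) - lower)
        billed += used * rate
        lower = upper
    return billed / fraction_of_month


def discounted_rate(on_demand: float, discount: float) -> float:
    """Apply a flat percentage discount (0.57 means 57% off)."""
    return on_demand * (1 - discount)


full_month_sud = 1 - sud_multiplier(1.0)  # net discount for a full month
half_month_sud = 1 - sud_multiplier(0.5)  # net discount for half a month
cud_rate = discounted_rate(ON_DEMAND_HOURLY, 0.57)

print(f"SUD, full month: {full_month_sud:.0%}")
print(f"SUD, half month: {half_month_sud:.0%}")
print(f"CUD at 57% off:  ${cud_rate:.3f}/hour")
```

A VM that runs only half the month earns a much smaller net discount than one that runs all month, which is why SUDs reward steady usage without requiring a commitment.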
Storage Pricing and Optimization
Google Cloud storage options have different cost profiles:
Cloud Storage: Object storage with four classes:
- Standard: High durability, low latency, higher cost per GB. For frequently accessed data.
- Nearline: For data accessed less than once a month. Lower storage cost, higher retrieval cost.
- Coldline: For data accessed less than once a quarter. Even lower storage cost.
- Archive: For data accessed less than once a year. Lowest storage cost, highest retrieval cost.
Lifecycle policies can automatically move objects between classes to optimize cost. For example, move objects to Nearline after 30 days, Coldline after 90 days, and delete after 365 days.
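The 30/90/365-day example can be written as a lifecycle configuration. The sketch below builds that configuration as a Python dictionary and prints it as JSON; the `rule` / `action` / `condition` layout follows the Cloud Storage lifecycle configuration format as I understand it, and the function name and defaults are illustrative, so check the output against the current documentation before applying it to a bucket.

```python
import json


def lifecycle_config(nearline_days=30, coldline_days=90, delete_days=365):
    """Build a lifecycle configuration matching the example in the text."""
    if not nearline_days < coldline_days < delete_days:
        raise ValueError("ages must increase: nearline < coldline < delete")
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_days},
            },
            {
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_days},
            },
            {
                "action": {"type": "Delete"},
                "condition": {"age": delete_days},
            },
        ]
    }


config = lifecycle_config()
print(json.dumps(config, indent=2))
```

The `age` condition counts days since the object was created, so each rule fires once an object is old enough, with no manual step.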
Persistent Disks: Block storage for VMs. Costs vary by type:
- pd-standard: HDD, lowest cost, suitable for bulk storage
- pd-balanced: Balanced performance and cost
- pd-ssd: High performance, higher cost
- pd-extreme: Highest performance, highest cost
Snapshots and images also incur storage costs. Delete unused disks to save.
Networking Costs
Data transfer costs can be an unwelcome surprise. Key points:
- Ingress: Free (data into Google Cloud)
- Egress: Charged for data leaving Google Cloud to the internet or to other regions
- Within a zone: Free when VMs communicate over internal IP addresses; traffic between zones in the same region carries a small per-GB charge
- Between regions: Charged per GB
- Premium Tier vs Standard Tier: Premium Tier carries egress over Google's global network, which costs more but gives lower latency. Standard Tier hands traffic to the public internet sooner, which is cheaper but gives more variable performance.
Use Cloud CDN to cache content and reduce egress costs. Use VPC peering to keep traffic within Google's network.
Managed Services and Serverless
Serverless services like Cloud Functions, Cloud Run, and App Engine charge only for resources consumed during execution (e.g., per 100ms of compute time, per request). This eliminates idle costs. However, watch out for:
- Always-on instances: App Engine flexible environment has at least one instance always running
- Memory allocation: Cloud Run charges for allocated memory during request processing
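Billing in 100ms increments means short requests are rounded. As a rough sketch, assuming durations are rounded up to the next increment (the function names and the traffic figures are illustrative, not taken from any pricing page):

```python
import math


def billable_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round a request duration up to the next billing increment."""
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    return math.ceil(duration_ms / increment_ms) * increment_ms


def monthly_billable_seconds(requests: int, avg_duration_ms: float) -> float:
    """Total billable compute seconds for a month of similar requests."""
    return requests * billable_ms(avg_duration_ms) / 1000


print(billable_ms(230))                          # a 230 ms request
print(monthly_billable_seconds(1_000_000, 230))  # one million such requests
```

With no requests there is nothing to bill, which is the "no idle cost" property the paragraph describes.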
Autoscaling and Rightsizing
Autoscaling adjusts the number of VM instances based on load. Use managed instance groups (MIGs) with autoscaling to avoid over-provisioning. Rightsizing involves analyzing historical usage and downscaling or changing machine types. Google Cloud's Recommender provides rightsizing recommendations based on CPU, memory, and network utilization.
Budgets and Alerts
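Set budgets at the project or billing account level. Create alerts at 50%, 90%, and 100% of budget. A budget only notifies; it does not cap spending. To act on it, connect the budget to a Pub/Sub topic and use those notifications to drive automated responses (e.g., shut down non-critical resources).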
Cost Breakdown by Service
The Google Cloud Pricing Calculator estimates costs. The Billing Reports and Cost Table show spend by service, project, and labels. Use labels to tag resources (e.g., environment: production, cost-center: marketing) for cost allocation.
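Cost allocation by label is essentially a group-and-sum over billing rows. The sketch below uses a hypothetical, simplified row shape (`service`, `cost`, `labels`); a real billing export has many more fields, so this only illustrates the idea.

```python
from collections import defaultdict


def cost_by_label(rows, label_key):
    """Sum cost per value of one label; rows without it go to 'unlabeled'."""
    totals = defaultdict(float)
    for row in rows:
        value = row["labels"].get(label_key, "unlabeled")
        totals[value] += row["cost"]
    return dict(totals)


# Hypothetical billing rows using the label keys from the text.
rows = [
    {"service": "Compute Engine", "cost": 120.0,
     "labels": {"environment": "production", "cost-center": "marketing"}},
    {"service": "Cloud Storage", "cost": 30.0,
     "labels": {"environment": "production"}},
    {"service": "Compute Engine", "cost": 45.5,
     "labels": {"environment": "dev", "cost-center": "marketing"}},
]

print(cost_by_label(rows, "environment"))
print(cost_by_label(rows, "cost-center"))
```

The "unlabeled" bucket is worth watching: spend that lands there cannot be charged back to any team.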
Quotas and Limits
Quotas prevent runaway costs. They are soft limits that can be increased by request. For example, you might have a quota of 100 CPUs per region. If you exceed it, new instances fail to create. Monitor quota usage to avoid unexpected denials.
Best Practices Summary
Use committed use discounts for predictable workloads
Use preemptible VMs for batch and fault-tolerant jobs
Implement lifecycle policies on Cloud Storage
Delete unused resources (disks, IPs, snapshots)
Use autoscaling and rightsizing
Set budgets and alerts
Use labels for cost allocation
Choose appropriate machine types (e.g., custom machine types for fine-grained resources)
Consider using Spot VMs (like preemptible VMs, but with no maximum runtime) for interruptible jobs that need to run longer than 24 hours
Use Cloud CDN and optimize egress
Exam Relevance
The GCDL exam expects you to identify which cost optimization strategy applies to a given scenario. You won't need to calculate exact savings, but you must know the relative discount percentages (e.g., CUD up to 57%, preemptible 60-91%, SUD up to 30%). Also know that sustained use discounts are automatic, while committed use discounts require a purchase.
Identify Current Usage Patterns
Use Billing Reports to analyze your current spend, and keep the Google Cloud Pricing Calculator for estimating what a change would cost. Focus on compute usage: average CPU/memory utilization, instance types, and runtime hours. Also review storage usage: data access frequency, size, and class. This baseline helps identify waste, such as idle VMs or over-provisioned resources. For example, if a VM runs at 10% CPU, it's a candidate for rightsizing.
Apply Rightsizing Recommendations
Use the Recommender to get machine type recommendations. For each underutilized VM, change to a smaller machine type (e.g., from n1-standard-8 to n1-standard-2). This reduces per-hour cost. For workloads with variable load, consider autoscaling with a managed instance group. The Recommender also suggests switching to custom machine types for fine-tuned resource allocation.
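The rightsizing decision can be sketched as a simple rule. This is a hypothetical heuristic for illustration, not the Recommender's actual algorithm: it looks only at peak CPU, adds headroom, and picks the smallest n1-standard size that still fits. A real recommendation also weighs memory usage.

```python
# vCPU counts available in the n1-standard family.
N1_STANDARD_VCPUS = [1, 2, 4, 8, 16, 32, 64, 96]


def suggest_n1_standard(current_vcpus: int,
                        peak_cpu_utilization: float,
                        headroom: float = 0.25) -> str:
    """Suggest the smallest n1-standard type that covers peak CPU plus headroom.

    Never suggests a larger machine than the current one.
    """
    if current_vcpus not in N1_STANDARD_VCPUS:
        raise ValueError("current_vcpus is not an n1-standard size")
    if not 0 <= peak_cpu_utilization <= 1:
        raise ValueError("peak_cpu_utilization must be between 0 and 1")
    needed = current_vcpus * peak_cpu_utilization * (1 + headroom)
    for vcpus in N1_STANDARD_VCPUS:
        if vcpus > current_vcpus:
            break
        if vcpus >= needed:
            return f"n1-standard-{vcpus}"
    return f"n1-standard-{current_vcpus}"


print(suggest_n1_standard(8, 0.15))  # lightly used VM
print(suggest_n1_standard(8, 0.90))  # busy VM, keep the current size
```

The headroom parameter is the main design choice: too little and a traffic bump saturates the smaller VM, too much and the savings disappear.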
Choose Appropriate Discounts
For predictable workloads (e.g., production servers running 24/7), purchase committed use discounts (CUDs) for vCPUs and memory. For batch jobs that can tolerate interruptions, use preemptible VMs. For workloads running more than 25% of a month, sustained use discounts automatically apply. Combine these: e.g., run a baseline with CUDs and scale with preemptible VMs.
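A back-of-the-envelope comparison shows why the combined approach pays off. In the sketch below the hourly rate, VM counts, and the 730-hour month are assumptions, and the discounts are the upper bound for CUDs (57%) and the lower bound for preemptible VMs (60%) quoted in this chapter, so actual savings will differ.

```python
def monthly_costs(baseline_vms: int,
                  burst_vm_hours: float,
                  on_demand_hourly: float,
                  cud_discount: float = 0.57,
                  preemptible_discount: float = 0.60,
                  hours_in_month: int = 730) -> dict:
    """Compare all-on-demand cost with a CUD baseline plus preemptible burst."""
    baseline_hours = baseline_vms * hours_in_month
    on_demand = (baseline_hours + burst_vm_hours) * on_demand_hourly
    blended = (baseline_hours * on_demand_hourly * (1 - cud_discount)
               + burst_vm_hours * on_demand_hourly * (1 - preemptible_discount))
    return {"on_demand": on_demand, "blended": blended}


# Hypothetical workload: 10 always-on VMs plus 2,000 VM-hours of batch work.
result = monthly_costs(baseline_vms=10, burst_vm_hours=2000, on_demand_hourly=0.10)
print(f"All on-demand:     ${result['on_demand']:.2f}")
print(f"CUD + preemptible: ${result['blended']:.2f}")
```

Note that the committed baseline is billed for the whole month whether or not the VMs are busy, so this only works when the baseline really is always on.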
Optimize Storage Lifecycle
Implement Cloud Storage lifecycle policies to move older data to cheaper storage classes. For example, move logs to Nearline after 30 days, then to Coldline after 90 days, and delete after 365 days. Also, delete unused persistent disks and snapshots. Choose the bucket location separately from the storage class: a single region is usually the cheapest option, while dual-region and multi-region buckets cost more and are worth it when data needs geo-redundancy (for example, for disaster recovery) or is served to users across a wide area.
Set Budgets and Alerts
Create budgets at the project level (e.g., $10,000/month) and set alerts at 50%, 90%, and 100% of the budget. Export budget data to Pub/Sub and trigger automated actions, such as shutting down non-critical VMs or sending notifications to the team. Use labels to track costs by department or environment.
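The automation step could look like the sketch below: a function that decodes a budget notification and decides what to do. The field names `costAmount` and `budgetAmount` follow the budget notification format as I understand it, while the action names and thresholds are hypothetical. The function only returns a decision; actually stopping VMs would need separate, carefully permissioned code.

```python
import base64
import json


def decide_action(pubsub_data_b64: str) -> str:
    """Decode a base64 Pub/Sub payload and pick a response to the budget state."""
    payload = json.loads(base64.b64decode(pubsub_data_b64).decode("utf-8"))
    cost = float(payload["costAmount"])
    budget = float(payload["budgetAmount"])
    if budget <= 0:
        raise ValueError("budgetAmount must be positive")
    ratio = cost / budget
    if ratio >= 1.0:
        return "stop-non-critical"  # hypothetical action name
    if ratio >= 0.9:
        return "notify-team"        # hypothetical action name
    return "none"


# Simulated notification: $9,500 spent against a $10,000 budget.
sample = {
    "budgetDisplayName": "prod-budget",
    "costAmount": 9500.0,
    "budgetAmount": 10000.0,
    "currencyCode": "USD",
}
encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
print(decide_action(encoded))
```

Keeping the decision separate from the side effect makes the logic easy to test before it is allowed to touch production resources.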
Enterprise Scenario 1: E-commerce Platform with Seasonal Spikes
A large e-commerce company runs its website on Google Cloud. During Black Friday, traffic spikes 10x. To avoid over-provisioning year-round, they use a baseline of committed use discounts (CUDs) for 50% of their peak capacity, and autoscaling with preemptible VMs for the remaining traffic. They also use Cloud CDN to cache product images, reducing egress costs. They set budgets and alerts to track spend as billing data arrives, which can lag actual usage. Misconfiguration: initially, they didn't set up autoscaling properly, causing some instances to be terminated during traffic spikes. They fixed it by setting a higher cooldown period and using managed instance groups with health checks.
Enterprise Scenario 2: Data Analytics Pipeline
A financial services company processes terabytes of data daily using Dataproc (Hadoop/Spark). They use preemptible VMs for worker nodes, saving 70% on compute costs. They store raw data in Cloud Storage with lifecycle policies: after 30 days, data moves to Nearline; after 1 year, to Archive. They also use Cloud Composer (Airflow) to orchestrate jobs and automatically shut down clusters when idle. Common mistake: not setting a maximum cluster size, leading to uncontrolled scaling. They now set a max node limit and use quotas to cap spend.
Enterprise Scenario 3: SaaS Application with Multi-Tenancy
A SaaS provider hosts customer environments on Google Cloud. Each customer gets a separate project with budgets and alerts. They use labels to tag resources by customer ID and environment (dev, test, prod). They use committed use discounts for production workloads and preemptible VMs for dev/test. They also use Cloud SQL with automatic storage increase and set a max storage limit to avoid surprise bills. Misconfiguration: forgetting to delete old snapshots caused storage costs to balloon. They implemented a snapshot schedule with a retention policy that automatically deletes snapshots older than 30 days.
What the GCDL Exam Tests
Objective 1.2 (Digital Transformation) expects you to identify cost optimization strategies for different scenarios. You must know:
The difference between on-demand, committed use, sustained use, and preemptible pricing
Typical discount ranges: CUD up to 57%, SUD up to 30%, preemptible 60-91%
That sustained use discounts are automatic, while committed use discounts require a 1- or 3-year commitment
That preemptible VMs have no SLA, can be terminated at any time, and run for at most 24 hours
That Cloud Storage lifecycle policies can move data to cheaper classes automatically
That budgets and alerts are set at the project or billing account level
Common Wrong Answers
Choosing sustained use discounts for predictable workloads: Candidates often pick SUD because it's automatic, but CUDs offer higher discounts (up to 57% vs 30%) and are better for predictable, always-on workloads.
Using preemptible VMs for stateful applications: Preemptible VMs are designed for fault-tolerant, stateless workloads. Using them for databases or stateful apps risks data loss and downtime whenever a VM is preempted.
Thinking all data transfer is free: Ingress is free, but egress (data leaving Google Cloud) is charged. Candidates forget that data transfer between regions also costs.
Assuming Cloud Storage Nearline is cheaper than Standard for frequently accessed data: Nearline has lower storage cost but higher retrieval cost. It's only cheaper for data accessed less than once a month.
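The Nearline trade-off is easy to check with arithmetic. The sketch below uses the Nearline figures quoted in this chapter ($0.01/GB/month storage, $0.01/GB retrieval) and an assumed Standard price of $0.02/GB/month; it ignores operation charges and minimum storage durations, so it illustrates the break-even point and nothing more.

```python
STANDARD_STORAGE = 0.020    # assumed USD per GB per month
NEARLINE_STORAGE = 0.010    # USD per GB per month, as quoted in this chapter
NEARLINE_RETRIEVAL = 0.010  # USD per GB read, as quoted in this chapter


def monthly_cost(gb_stored: float, gb_read: float,
                 storage_price: float, retrieval_price: float = 0.0) -> float:
    """Storage cost plus retrieval cost for one month."""
    return gb_stored * storage_price + gb_read * retrieval_price


def cheaper_class(gb_stored: float, gb_read: float) -> str:
    """Return whichever class costs less for this access pattern."""
    standard = monthly_cost(gb_stored, gb_read, STANDARD_STORAGE)
    nearline = monthly_cost(gb_stored, gb_read, NEARLINE_STORAGE, NEARLINE_RETRIEVAL)
    return "Nearline" if nearline < standard else "Standard"


print(cheaper_class(1000, 500))   # half of the data read each month
print(cheaper_class(1000, 2000))  # all of the data read twice a month
```

Under these assumed prices the break-even is reading the full dataset about once a month, which lines up with the "accessed less than once a month" guidance.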
Specific Numbers and Terms
CUD discount: up to 57% for vCPU and memory
SUD discount: up to 30% for running more than 25% of a month
Preemptible VM discount: 60-91% off on-demand
Cloud Run charges per 100ms of compute time
Cloud Storage retrieval fees: Nearline $0.01/GB, Coldline $0.02/GB, Archive $0.05/GB
Budget alerts can be set at 50%, 90%, 100% of budget
Edge Cases
Custom machine types: Can be more cost-effective than predefined types if you need specific resources. A vCPU and memory commitment covers custom types as well; GPUs and local SSDs are not included and need their own commitments.
Spot VMs: Similar to preemptible but no maximum lifetime. Use for longer-running batch jobs.
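Reservations: A reservation guarantees capacity in a specific zone. It is separate from a CUD, which gives a discount but does not guarantee capacity, although the two can be combined. You pay for reserved capacity even if you don't use it.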
How to Eliminate Wrong Answers
Focus on the scenario's key words:
- 'Predictable, always-on' → committed use discounts
- 'Batch processing, fault-tolerant' → preemptible VMs
- 'Infrequent access' → Nearline, Coldline, or Archive storage
- 'Automatic discount' → sustained use discounts
- 'Cost allocation' → labels
- 'Limit spending' → budgets and alerts
Committed use discounts (CUD) offer up to 57% off for 1- or 3-year commitments, while sustained use discounts (SUD) offer up to 30% off automatically.
Preemptible VMs cost 60-91% less than on-demand but can be terminated at any time; use for batch and fault-tolerant workloads.
Cloud Storage lifecycle policies can automatically move data to cheaper storage classes (e.g., Standard to Nearline to Coldline to Archive).
Set budgets and alerts at 50%, 90%, and 100% of budget to avoid surprise bills.
Use labels to tag resources for cost allocation by department, environment, or project.
Rightsizing using the Recommender can reduce costs by matching machine types to actual utilization.
Data egress (outbound) is charged; ingress is free. Use Cloud CDN and VPC peering to reduce egress costs.
Serverless services (Cloud Functions, Cloud Run) eliminate idle costs by charging only for execution time.
Custom machine types allow fine-grained resource allocation and can be more cost-effective than predefined types.
Budgets do not automatically stop resources; use Pub/Sub and automation for that.
These come up on the exam all the time. Here's how to tell them apart.
Committed Use Discounts (CUD)
Requires a 1- or 3-year commitment to spend a minimum amount per month.
Discount up to 57% for vCPU and memory (depending on machine type and commitment length).
Must be purchased separately for each region and resource type.
Best for predictable, always-on workloads.
You pay even if you don't use the resources (if you commit to a minimum spend).
Sustained Use Discounts (SUD)
Automatic, no commitment required.
Discount up to 30% for running instances more than 25% of a month.
Calculated monthly, per region, on usage from eligible machine series.
Best for workloads that run for a significant portion of the month but are not predictable enough for CUD.
No penalty if you stop using resources; discount adjusts automatically.
Mistake
Sustained use discounts require you to sign a contract or opt in.
Correct
Sustained use discounts are automatic and applied monthly based on usage. No contract or opt-in is required. They are calculated per region and only for eligible machine series.
Mistake
Preemptible VMs are always cheaper than on-demand, even for short workloads.
Correct
Preemptible VMs are cheaper per hour, but they can be terminated at any time. For short workloads that cannot tolerate interruption, on-demand may be more reliable and cost-effective overall.
Mistake
Committed use discounts apply to all resources in a project automatically.
Correct
CUDs apply only to the specific resource types (vCPU, memory, GPU, local SSD) that you committed to, and only to resources in the same region as the commitment. You must purchase CUDs separately for each region and resource type.
Mistake
Cloud Storage Nearline is the cheapest storage class.
Correct
Archive is the cheapest storage class to store data in (roughly $0.0012/GB/month vs Nearline at $0.01/GB/month; exact prices vary by location). However, Archive has higher retrieval costs and a 365-day minimum storage duration.
Mistake
Budgets can automatically stop resources when exceeded.
Correct
Budgets only send alerts; they do not automatically stop resources. To automate actions, you need to use Pub/Sub triggers and Cloud Functions or other automation tools.
Committed use discounts (CUD) require a 1- or 3-year commitment to a minimum spend (e.g., $1,000/month on vCPUs and memory) and offer up to 57% discount. Sustained use discounts (SUD) are automatic and apply when you run an instance for more than 25% of a month, offering up to 30% discount. CUD is better for predictable workloads; SUD is for variable usage.
No, preemptible VMs are not suitable for stateful applications because they can be terminated at any time (with 30-second notice). They are designed for fault-tolerant, stateless workloads like batch processing, data analytics, and rendering.
Go to the Billing console, create a budget at the project or billing account level, set a target amount (e.g., $10,000), and add alert thresholds (e.g., 50%, 90%, 100%). You can also set up Pub/Sub notifications to trigger automated actions like shutting down resources.
Archive storage is the cheapest at $0.0012/GB/month, but it has a 365-day minimum storage duration and higher retrieval costs ($0.05/GB). Use it for data accessed less than once a year.
Use Cloud CDN to cache content at edge locations, use VPC peering to keep traffic within Google's network, and choose Standard Tier (internet) for cheaper egress if latency is not critical. Also, compress data before transferring.
Labels are key-value pairs (e.g., environment: production, cost-center: marketing) that you attach to resources. They allow you to filter and group costs in Billing Reports, enabling cost allocation and chargeback to departments.
No, sustained use discounts apply only to eligible on-demand usage. Preemptible VMs have their own discounted pricing and are not eligible for SUD, and usage already covered by a committed use discount does not earn SUD either.