Your GKE cluster is running a critical web application that experiences predictable traffic spikes during business hours. You want to minimize latency and avoid pod startup delays during scaling. The application uses CPU-intensive image processing. Which scaling strategy should you use?
- A
Set a high number of static pods equal to peak traffic; use cluster autoscaler to add nodes.
Why wrong: Running peak-level replicas around the clock wastes resources and money, and the cluster autoscaler provisions nodes, not pods, so it does not prevent cold starts.
- B
Use VPA with updateMode: Auto to automatically adjust pod resources; enable cluster autoscaler to add nodes as required.
Why wrong: VPA adjusts per-pod resource requests rather than the replica count, so it does not prevent cold starts; in Auto mode it can evict and restart pods to apply new requests, which increases latency.
- C
Deploy a CronJob to scale up replicas before business hours; rely on HPA to handle the rest.
Why wrong: A CronJob scales on a fixed schedule rather than on actual traffic, so it can be too coarse; HPA with a minimum replica count is simpler and more responsive.
- D
Configure HPA with a minimum of 2 replicas and scale on CPU utilization; enable cluster autoscaler for node provisioning.
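Why correct: The minimum replica count keeps baseline capacity running to absorb the start of a spike without cold starts, HPA adds replicas as CPU utilization rises, and the cluster autoscaler adds nodes when the new pods need them.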
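
To make option D's behavior concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: the desired count is the current count scaled by the ratio of observed to target CPU utilization, rounded up, then clamped between the minimum and maximum. The function name, the 60% target, and the maximum of 10 replicas are assumptions chosen for illustration; only the minimum of 2 comes from the question. The 0.1 tolerance mirrors the documented default, and the sketch ignores stabilization windows and pod readiness handling.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct=60,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a CPU utilization target.

    target_cpu_pct and max_replicas are illustrative assumptions;
    min_replicas=2 matches option D.
    """
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # minReplicas is the floor that keeps warm capacity available off-peak.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(2, 90))   # business-hours spike begins: scale out
print(desired_replicas(4, 10))   # overnight lull: held at the minimum
print(desired_replicas(8, 120))  # heavy load: capped at the maximum
```

The second call shows why the minimum matters for this scenario: without the floor the calculation would drop to a single replica overnight, leaving less warm capacity when morning traffic arrives.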