This chapter covers Terraform best practices specifically for Google Cloud Platform (GCP) deployments. On the ACE exam, approximately 10–15% of questions touch Infrastructure as Code (IaC) concepts, with Terraform being the primary tool tested. You will need to understand state management, module structure, remote backends, and how to integrate with GCP services. This chapter provides the exact practices and configurations you must know to pass the exam and to build production-grade GCP environments.
Imagine you are a construction manager building a house. Instead of manually hammering each nail, you create a detailed blueprint that specifies every material, dimension, and connection. You hand this blueprint to a general contractor (Terraform) who reads it and orders materials, hires subcontractors, and schedules work. The blueprint is stored in version control, so you can track changes, roll back to a previous design, or reuse it for another house. If you need to add a room, you update the blueprint and the contractor automatically figures out what to add without demolishing the existing structure. The contractor keeps a state file (a real-time ledger) of what has been built, so it knows the current foundation is concrete, not wood. If someone manually changes a wall, the next time the contractor runs, it detects the drift and restores the wall to match the blueprint. This prevents unauthorized modifications and ensures every house is identical. The contractor also supports modules: you can reuse a pre-approved kitchen design across multiple houses, saving time and enforcing consistency. In this analogy, Terraform is the contractor, the blueprint is your HCL configuration, and the state file is the contractor's ledger. Google Cloud is the lumber yard and tool supplier.
Overview of Terraform on GCP
Terraform is an open-source Infrastructure as Code (IaC) tool by HashiCorp that allows you to define and provision GCP resources using declarative configuration files. The core workflow is: write configuration files (.tf files), run terraform init to initialize the working directory, terraform plan to preview changes, and terraform apply to execute them. On GCP, Terraform uses the Google Cloud Provider (hashicorp/google) to interact with GCP APIs.
Key Components
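Provider: The Google Cloud Provider (hashicorp/google) needs credentials (for example, Application Default Credentials or an impersonated service account) and normally a default project ID. The provider block can also set default region and zone values that resources inherit.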
Resources: Each GCP service (e.g., google_compute_instance, google_storage_bucket) is a resource block with arguments.
Data Sources: Read-only queries to fetch existing GCP resources (e.g., data.google_compute_image.ubuntu).
Variables and Outputs: Variables make configurations reusable; outputs expose resource attributes.
State: Terraform stores the mapping between your config and real-world resources in a state file (default: terraform.tfstate).
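The components above can be seen together in one small configuration. This is a minimal sketch, not a production layout: the region, zone, machine type, image family, and every resource name are illustrative assumptions, and it assumes the project has a `default` VPC network.

```hcl
# Variable: makes the configuration reusable across projects.
variable "project_id" {
  type        = string
  description = "ID of the GCP project to deploy into (supplied by the caller)."
}

# Provider: default project, region, and zone (values are assumptions).
provider "google" {
  project = var.project_id
  region  = "us-central1"
  zone    = "us-central1-a"
}

# Data source: read-only lookup of an existing public image.
data "google_compute_image" "ubuntu" {
  family  = "ubuntu-2204-lts"
  project = "ubuntu-os-cloud"
}

# Resource: a VM that Terraform creates and then tracks in state.
resource "google_compute_instance" "example" {
  name         = "example-vm"
  machine_type = "e2-micro"

  boot_disk {
    initialize_params {
      image = data.google_compute_image.ubuntu.self_link
    }
  }

  network_interface {
    network = "default" # assumes the project still has its default network
  }
}

# Output: exposes an attribute of the created resource.
output "instance_internal_ip" {
  value = google_compute_instance.example.network_interface[0].network_ip
}
```

After `terraform apply`, the mapping between `google_compute_instance.example` and the real VM is what gets recorded in the state file.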
State Management Best Practices
The state file is critical. It can contain sensitive data in plain text (e.g., instance IPs, generated service account keys, database passwords) and must be stored securely. Do not commit `terraform.tfstate` to version control: even if it holds no secrets today, Git history retains every old version. Instead, use a remote backend such as Google Cloud Storage (GCS) with versioning and encryption.
Example backend configuration:
    terraform {
      backend "gcs" {
        bucket = "my-tf-state-bucket"
        prefix = "terraform/state/production"
      }
    }
Use a dedicated GCS bucket per project or environment.
Enable Object Versioning on the bucket to recover from corruption.
Use Customer-Managed Encryption Keys (CMEK) or Customer-Supplied Encryption Keys (CSEK) for state encryption.
Set appropriate IAM permissions: only CI/CD service accounts and senior engineers should have write access.
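State locking is built into the GCS backend: while an operation that can write state is running, Terraform holds a lock object in the bucket next to the state object. No separate command or extra configuration is needed.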
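The bucket recommendations above could be captured in a small bootstrap configuration like the following. It is a sketch under assumptions: the bucket name reuses the example above, the location and the `ci_service_account_email` variable are hypothetical, and `roles/storage.objectAdmin` is one reasonable choice of role rather than the only one.

```hcl
variable "ci_service_account_email" {
  type        = string
  description = "Email of the CI/CD service account that runs Terraform (hypothetical)."
}

# Dedicated bucket that holds Terraform state.
resource "google_storage_bucket" "tf_state" {
  name     = "my-tf-state-bucket" # bucket names are globally unique; adjust
  location = "US"                 # assumed location

  uniform_bucket_level_access = true
  public_access_prevention    = "enforced"

  # Object Versioning keeps earlier state objects for recovery.
  versioning {
    enabled = true
  }
}

# Grant write access on the state bucket only to the CI/CD identity.
resource "google_storage_bucket_iam_member" "ci_state_writer" {
  bucket = google_storage_bucket.tf_state.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${var.ci_service_account_email}"
}
```

Because a backend bucket must exist before `terraform init` can use it, a configuration like this is usually applied once from a separate bootstrap directory.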
Module Structure and Reusability
Modules are self-contained packages of Terraform configurations. Best practices:
- Use the Terraform Registry: Many GCP modules are publicly available (e.g., terraform-google-modules/network/google).
- Organize by environment: Have separate directories for dev, staging, prod that call shared modules.
- Pin module versions: Specify version constraints (e.g., source = "terraform-google-modules/network/google" version = "~> 4.0").
- Minimal root module: The root module should only contain provider configuration, backend, and module calls. No resource definitions in root.
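As an illustration of a pinned registry module called from an environment's root module, here is a minimal sketch. The network name, subnet name, CIDR range, and region are made-up values; the inputs shown (`project_id`, `network_name`, `subnets`) are the ones this module is generally documented to accept, but check the module's README for the version you pin.

```hcl
variable "project_id" {
  type        = string
  description = "Project that will own the VPC."
}

module "network" {
  source  = "terraform-google-modules/network/google"
  version = "~> 4.0" # pinned: allows 4.x updates, blocks 5.0 and later

  project_id   = var.project_id
  network_name = "dev-vpc" # illustrative name

  subnets = [
    {
      subnet_name   = "dev-subnet-01"
      subnet_ip     = "10.10.0.0/24"
      subnet_region = "us-central1"
    }
  ]
}
```

Run `terraform init` after adding or changing a module's `source` or `version` so Terraform can download it.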
Example directory structure:
terraform/
├── modules/
│ ├── networking/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ └── compute/
├── dev/
│ ├── main.tf
│ ├── terraform.tfvars
│ └── backend.tf
└── prod/
Resource Naming and Tagging
Consistent naming is crucial for manageability. Use a standard format such as {environment}-{workload}-{resource-type}-{sequence}, for example prd-web-instance-001. Apply labels (GCP's equivalent of tags for cost tracking and automation) to all resources. Most Google provider resources accept a `labels` argument.
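One way to keep names and labels consistent is to build them from shared local values. The sketch below assumes illustrative label keys (`environment`, `team`, `managed_by`) and a made-up bucket name; none of these come from a GCP requirement.

```hcl
locals {
  environment = "prd"

  # Labels applied to every resource; keys and values are lowercase.
  common_labels = {
    environment = local.environment
    team        = "web"
    managed_by  = "terraform"
  }
}

resource "google_storage_bucket" "assets" {
  # Follows the naming pattern; bucket names must be globally unique,
  # so a real name would need an organization-specific component.
  name     = "${local.environment}-web-bucket-001"
  location = "US"

  labels = local.common_labels
}
```

Defining the labels once in `locals` means a new label (for example, a cost center) is added in a single place.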
Version Control and CI/CD
Store all Terraform code in a Git repository.
Use feature branches and pull requests for changes.
Run terraform fmt and terraform validate in CI pipelines.
Use terraform plan as a gate before merging.
Automate terraform apply via Cloud Build, Jenkins, or GitLab CI.
Security and Secrets Management
Never hardcode secrets in .tf files. Use variables with sensitive = true and pass them via environment variables or a secrets manager like Google Secret Manager.
Use the google_secret_manager_secret_version data source to fetch secrets at runtime.
Avoid long-lived service account keys; use short-lived credentials (service account impersonation) or workload identity federation instead.
Add IAM Conditions to the role bindings granted to Terraform's service accounts to restrict when and on which resources those roles apply.
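The two ways of supplying secrets described above might look like this. The sketch assumes a secret named `api-key` already exists in Secret Manager and that a Cloud SQL instance name is passed in; both names are hypothetical.

```hcl
# Option 1: sensitive input variable, supplied at run time
# (for example via the TF_VAR_db_password environment variable).
variable "db_password" {
  type      = string
  sensitive = true
}

variable "sql_instance_name" {
  type        = string
  description = "Name of an existing Cloud SQL instance (hypothetical)."
}

resource "google_sql_user" "app" {
  name     = "app"
  instance = var.sql_instance_name
  password = var.db_password
}

# Option 2: read the latest version of an existing secret.
data "google_secret_manager_secret_version" "api_key" {
  secret = "api-key" # hypothetical secret ID
}

# The value is then available as:
#   data.google_secret_manager_secret_version.api_key.secret_data
```

In both cases the secret value is still written to the state file in plain text, which is another reason to lock down the state bucket.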
Testing Terraform Configurations
Use terraform validate to check syntax.
Use terraform plan to review changes.
For automated tests, use Terratest (a Go-based framework); for static security scanning, use a tool such as tfsec.
For integration testing, create isolated GCP projects and run terraform apply then terraform destroy.
Managing Multiple Environments
Use separate state files for each environment. A common approach is to use a single GCS bucket with a different prefix per environment. The backend block always sits inside the terraform block.
For prod:
    terraform {
      backend "gcs" {
        bucket = "my-tf-state-bucket"
        prefix = "env/prod"
      }
    }
For dev:
    terraform {
      backend "gcs" {
        bucket = "my-tf-state-bucket"
        prefix = "env/dev"
      }
    }
Alternatively, use Terraform Workspaces: terraform workspace new prod creates a separate state file within the same backend.
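When workspaces are used, a single configuration can vary its settings by reading the current workspace name through `terraform.workspace`. The following is a sketch only; the workspace names and machine types are assumptions.

```hcl
locals {
  # Per-workspace sizing; unknown workspaces fall back to the smallest size.
  machine_type_by_workspace = {
    dev  = "e2-micro"
    prod = "e2-standard-4"
  }

  machine_type = lookup(
    local.machine_type_by_workspace,
    terraform.workspace,
    "e2-micro"
  )
}

output "selected_machine_type" {
  value = local.machine_type
}
```

Separate directories make the differences between environments explicit in code, while workspaces keep one code path and hide the differences in lookups like this one; the directory approach is often easier to review.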
Using Terraform with GCP Services
Compute Engine: Use google_compute_instance with boot_disk and network_interface. Prefer google_compute_instance_template for managed instance groups.
GKE: Use google_container_cluster and google_container_node_pool. Enable private_cluster_config for security.
Cloud Storage: Use google_storage_bucket with uniform_bucket_level_access and versioning.
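IAM: Use google_project_iam_member or google_project_iam_binding. google_project_iam_binding is authoritative for a role and replaces that role's existing members, so prefer google_project_iam_member (additive) to avoid overwriting bindings created elsewhere.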
VPC: Use google_compute_network, google_compute_subnetwork, and google_compute_firewall. Consider using the terraform-google-modules/network/google module.
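To make the IAM recommendation concrete, here is a minimal sketch of an additive grant. The role and the two variables are illustrative assumptions.

```hcl
variable "project_id" {
  type = string
}

variable "ci_service_account_email" {
  type        = string
  description = "Service account that needs read access (hypothetical)."
}

# Additive: adds one member to one role and leaves other members alone.
resource "google_project_iam_member" "ci_compute_viewer" {
  project = var.project_id
  role    = "roles/compute.viewer"
  member  = "serviceAccount:${var.ci_service_account_email}"
}

# By contrast, a google_project_iam_binding for the same role would
# declare the complete member list and remove anyone not listed.
```

Mixing both resource types for the same role in one project tends to make them fight over the policy, so pick one style per role.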
Drift Detection and Remediation
Terraform does not continuously monitor for changes made outside of Terraform (drift). Drift is detected only when you run a command such as terraform plan, which refreshes the recorded state against the real resources and shows any differences from your configuration (terraform plan -refresh-only shows just the drift). For remediation, either update the configuration to match the change or run terraform apply to revert the resource to the configuration. You can use Google Cloud Asset Inventory to detect resource changes and trigger a Terraform plan.
Performance and Scalability
For large infrastructures, use -parallelism=N flag to control concurrent resource operations (default is 10).
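Use refresh-only runs (terraform plan -refresh-only, or the older terraform refresh command) sparingly; they query every resource in state and can be slow on large state files.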
Consider splitting state into multiple files using terraform_remote_state data sources to reference outputs from other state files.
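A split-state setup could read another state's outputs as sketched below. This assumes a separate network configuration stores its state under the prefix `env/prod/network` and declares an output named `subnet_self_link`; both are hypothetical.

```hcl
# Read-only view of the outputs in another configuration's state.
data "terraform_remote_state" "network" {
  backend = "gcs"

  config = {
    bucket = "my-tf-state-bucket"
    prefix = "env/prod/network" # hypothetical prefix of the network state
  }
}

resource "google_compute_instance" "app" {
  name         = "prd-app-instance-001"
  machine_type = "e2-medium"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    # Consumes an output that the network configuration must declare.
    subnetwork = data.terraform_remote_state.network.outputs.subnet_self_link
  }
}
```

Only root-module outputs are visible this way, and the reader needs read access to the other state object, so expose only what other teams actually need.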
Common Pitfalls on the ACE Exam
State file location: The exam may present a scenario where state is stored locally and the user cannot apply from another machine. The correct answer is to use a remote backend.
Provider version: Always specify a provider version constraint in the required_providers block to avoid unexpected upgrades.
Variable precedence: Values set with -var or -var-file override *.auto.tfvars files, which override terraform.tfvars, which overrides TF_VAR_ environment variables, which override default values.
Sensitive variables: Marking a variable as sensitive = true prevents it from being displayed in plan output but does not encrypt it in state.
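The version-pinning and precedence pitfalls can be illustrated with a short configuration. The specific version numbers below are assumptions chosen for the example, not recommendations.

```hcl
terraform {
  # Minimum Terraform CLI version this configuration expects (assumed).
  required_version = ">= 1.5.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # any 5.x release, never 6.0 or later
    }
  }
}

# The default is the lowest-precedence value: terraform.tfvars overrides
# it, and a -var flag on the command line overrides both.
variable "region" {
  type    = string
  default = "us-central1"
}

provider "google" {
  region = var.region
}
```

`terraform init` also records the exact provider version it selected in `.terraform.lock.hcl`; committing that file keeps every machine on the same build.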
Initialize Terraform Backend
Run `terraform init` in the directory containing your configuration. This downloads the Google provider plugin and configures the backend. If using a GCS backend, Terraform will attempt to access the bucket; ensure the identity running Terraform can list, read, create, and delete objects in it (`storage.objects.*` permissions). The command creates a `.terraform` directory containing the provider binaries and writes the dependency lock file `.terraform.lock.hcl` to the working directory. If the backend configuration changes, you must re-run `init` (with `-migrate-state` to move existing state, or `-reconfigure` to start fresh).
Write HCL Configuration Files
Define resources in `.tf` files using HashiCorp Configuration Language (HCL). Each resource block specifies a type (e.g., `google_compute_instance`), a local name, and configuration arguments. Use variables for dynamic values and outputs to expose resource attributes. Follow a consistent naming convention and use modules for reusable components. Validate syntax with `terraform validate`.
Create Execution Plan
Run `terraform plan` to generate an execution plan. Terraform compares the current state (stored in the backend) with your configuration and determines what resources to create, modify, or delete. The plan output shows actions with `+` (create), `~` (update in-place), `-` (destroy). Review the plan carefully to ensure no unintended changes. Use `-out=tfplan` to save the plan for later apply.
Apply Configuration to GCP
Run `terraform apply` to execute the plan. Terraform makes API calls to GCP to create, update, or delete resources, and it updates the state file after each successful operation. If an error occurs (e.g., quota exceeded), Terraform stops but does not roll back: resources that were already created remain in GCP and in state, so fix the cause and run `terraform apply` again to finish the remaining changes. Use `-auto-approve` only in CI/CD pipelines. After apply, verify resources in GCP Console.
Manage State and Drift
Periodically run `terraform plan` to detect drift, meaning changes made outside Terraform. If a managed resource has drifted, either update the configuration to match the actual settings or apply to revert the change. If a resource was created entirely outside Terraform, write a matching resource block and bring it into state with `terraform import`. Use `terraform state list` to view resources in state and `terraform state show` for details. Never edit the state file manually; use `terraform state` commands.
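Besides the `terraform import` command, Terraform 1.5 and later also accept a declarative `import` block, which lets the adoption go through the normal plan and apply review. A minimal sketch, assuming a bucket named `manually-created-bucket` was created by hand:

```hcl
# Tells Terraform to adopt the existing bucket on the next apply.
import {
  to = google_storage_bucket.manual
  id = "manually-created-bucket" # hypothetical existing bucket name
}

# The resource block must describe the bucket as it really is;
# otherwise the plan will propose changes after the import.
resource "google_storage_bucket" "manual" {
  name     = "manually-created-bucket"
  location = "US" # must match the real bucket's location
}
```

Once the import has been applied, the `import` block can be deleted; the resource stays in state.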
Enterprise Scenario 1: Multi-Environment Deployment
A large e-commerce company uses Terraform to manage GCP infrastructure across dev, staging, and production environments. They have a dedicated GCS bucket for state with versioning and a folder structure: gs://company-tf-state/env/{dev,staging,prod}/. Each environment has its own root module that calls shared networking and compute modules. The CI/CD pipeline (Cloud Build) runs terraform plan on pull requests and terraform apply only on merges to the main branch. They use service accounts with minimal permissions for each environment. A common issue is that developers sometimes manually create resources in the dev project, causing drift. They mitigate this by running a scheduled Cloud Function that triggers terraform plan daily and alerts on drift.
Enterprise Scenario 2: Centralized State with Team Collaboration
A financial services firm has a team of 10 engineers managing GCP resources. They use a shared GCS backend with state locking to prevent concurrent modifications. The bucket is in a separate 'admin' project with strict IAM policies: only the CI/CD service account has write access; engineers have read-only access to state. They use terraform_remote_state data sources to share outputs between teams (e.g., network team outputs VPC IDs, app team consumes them). Performance is a concern: with over 1000 resources in state, terraform plan takes several minutes. They use the -parallelism=20 flag and split state into multiple files per microservice.
Enterprise Scenario 3: Disaster Recovery with Terraform
A healthcare company uses Terraform to replicate their GCP infrastructure to a secondary region for disaster recovery. They maintain two separate state files in the same bucket: state/primary and state/dr. The DR configuration is identical except for the region and smaller instance sizes. In a disaster, they run terraform apply on the DR configuration to spin up resources. They test this quarterly by destroying and recreating the DR environment. A common mistake is forgetting to update the backend configuration in the DR folder, causing it to write to the primary state. They guard against this with a pre-commit hook that validates the backend prefix.
What the ACE Tests
Objective 3.1 – Deploy and Implement: 'Implement Infrastructure as Code (IaC) using Terraform to manage GCP resources.' The exam tests your ability to:
Configure a remote backend (GCS) with proper IAM.
Write and organize Terraform configurations for GCP services.
Use modules and variables.
Manage state and detect drift.
Apply security best practices (secrets, IAM).
Common Wrong Answers and Traps
Storing state locally: A question may describe a team that cannot apply from multiple machines. The obvious wrong answer is 'use a shared network drive' – the correct answer is 'use a GCS backend with state locking.'
Hardcoding secrets: A scenario shows a variable with a default value containing a password. The trap answer is 'mark it as sensitive' – but sensitive only hides output; the secret is still in the config file. The correct answer is 'use a secrets manager like Google Secret Manager or environment variables.'
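Using `terraform import` for drift: When a resource is created manually, writing a matching resource block and running terraform apply does not adopt it; Terraform tries to create a second resource and usually fails with an "already exists" error. The resource must be brought into state with terraform import (or an import block in Terraform 1.5+), or deleted and recreated through Terraform. By contrast, when a resource Terraform already manages is changed manually, you can either update the config to match or run terraform apply to revert the change. The exam tests both situations.
Provider version: A question may show a provider block without a version constraint. The trap is assuming this is harmless: with no constraint in required_providers (and no committed .terraform.lock.hcl), terraform init selects the newest available provider release, which can introduce breaking changes. Always pin versions.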
Specific Values and Terms
Default parallelism: 10.
The GCS backend bucket should have versioning enabled for state recovery (Terraform does not enforce this).
terraform validate checks syntax but not credentials.
terraform fmt rewrites HCL to canonical format.
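Workspaces create separate state files. With the GCS backend, each workspace's state is stored as <prefix>/<workspace_name>.tfstate (the default workspace uses default.tfstate).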
Edge Cases
State locking: If a user runs terraform apply and their machine crashes, the state lock remains. To unlock, use terraform force-unlock <lock_id>.
Sensitive outputs: Marking an output as sensitive = true redacts it in plan and apply output, but the value is still stored in state. To read it deliberately, request it by name with terraform output <name>, or use terraform output -json, which prints sensitive values in plain text.
Destroying resources: terraform destroy removes all resources in state. To destroy only specific resources, use terraform destroy -target=resource_type.name.
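The sensitive-output edge case above looks like this in configuration. The variable and output names are illustrative.

```hcl
variable "db_password" {
  type      = string
  sensitive = true
}

# Terraform rejects an output that exposes a sensitive value unless the
# output itself is marked sensitive.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```

After apply, the summary shows `db_password = <sensitive>`, while the state file still holds the real value.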
How to Eliminate Wrong Answers
If the question involves state, look for 'remote backend' or 'GCS' in answers.
If secrets are mentioned, eliminate any answer that suggests putting them in config files.
If multiple environments, look for separate state files or workspaces.
If drift, look for 'terraform plan' or 'import'.
Always use a remote backend (GCS) for state storage in production.
Enable versioning on the GCS state bucket to recover from corruption.
Never hardcode secrets in Terraform configuration; use Google Secret Manager or environment variables.
Pin provider and module versions using version constraints.
Use `terraform plan` as a mandatory review step before `apply`.
Organize configurations using modules and separate state files per environment.
Mark sensitive variables and outputs with `sensitive = true` to prevent CLI exposure, but remember state is still plaintext.
These come up on the exam all the time. Here's how to tell them apart.
Local State
Stored in `terraform.tfstate` in the working directory
No locking across machines; only a local file lock, so copies of the state on different machines can diverge or be overwritten
Not accessible by other team members
No versioning or backup
Suitable only for personal testing
Remote State (GCS)
Stored in a GCS bucket; supports versioning and encryption
Supports state locking natively (Terraform holds a lock object in the bucket during writes)
Shared across team; enables collaboration
Can be integrated with CI/CD pipelines
Required for production deployments
Mistake
Terraform state can be stored in a Git repository safely.
Correct
State files often contain sensitive data like instance IPs, service account keys, or database passwords. Even if encrypted, Git history retains old versions. Always use a remote backend like GCS with encryption and access controls.
Mistake
Terraform automatically detects and reverts manual changes to resources.
Correct
Terraform only detects drift when you run `terraform plan`. It does not automatically revert changes. You must manually run `terraform apply` to restore the desired state.
Mistake
Using `sensitive = true` on a variable encrypts the value in state.
Correct
Marking a variable as `sensitive = true` only prevents its value from being displayed in CLI output. The value is still stored in plain text in the state file. Use a secrets manager for actual encryption.
Mistake
You can have multiple Terraform configurations writing to the same state file without issues.
Correct
Concurrent writes to the same state file cause corruption. Always use a remote backend with locking (the GCS backend locks state automatically). Each configuration should have its own state file.
Mistake
The `terraform validate` command checks that your credentials are valid.
Correct
`terraform validate` only checks syntax and internal consistency. It does not make any API calls. To verify credentials, run `terraform plan` which requires authentication.
Create a GCS bucket with versioning enabled. In your Terraform configuration, add a `backend "gcs"` block inside `terraform {}` specifying the bucket name and optional prefix. Run `terraform init` to migrate state. Ensure your credentials have `storage.objects.*` permissions on the bucket.
`terraform plan` creates an execution plan showing what changes will be made without actually applying them. `terraform apply` executes the plan. In CI/CD, you typically run `plan` first, review, then `apply`.
Yes, use `terraform import` to bring existing resources under Terraform management. You must write a configuration block that matches the resource, then run `terraform import <resource_type>.<name> <resource_id>`. The resource is then part of the state.
Use environment variables prefixed with `TF_VAR_` (e.g., `TF_VAR_api_key`) or use the `google_secret_manager_secret_version` data source to fetch secrets at runtime. Never hardcode secrets in `.tf` files.
The next `terraform plan` will show that the resource needs to be recreated. Running `terraform apply` will recreate it. If you want to accept the deletion, remove the resource block from config and run `terraform apply` to update state.
Use separate directories with their own backend configuration pointing to different prefixes in the same GCS bucket (e.g., `prefix = "env/dev"`). Alternatively, use Terraform workspaces: `terraform workspace new dev` creates a separate state file.
It checks that the configuration is syntactically valid and internally consistent (e.g., referencing variables that exist). It does not make API calls, so it cannot verify credentials or resource availability.