Common Mistakes & Gotchas
Most Terraform incidents are not caused by exotic edge cases. They come from a small set of recurring mistakes that bite teams again and again. State that lives on one laptop, secrets baked into a plan, resources that get reshuffled because of count, and providers that silently upgrade overnight are all avoidable with a little discipline. This page catalogs those traps in a problem → fix format so you can recognize them before they cost you a production outage. Everything here applies equally to Terraform 1.5+ and OpenTofu, which share the same HCL2 language and state model; features that need a newer release are called out where they appear.
Local or committed state
Problem: The default backend writes terraform.tfstate to your working directory. On a team, that means state lives on whoever ran apply last — and worse, it sometimes gets committed to Git, exposing every output value (including secrets) and guaranteeing conflicts when two people apply at once.
Fix: Use a remote backend with locking from day one. For AWS, S3 with native lockfile-based locking (Terraform 1.10+) or a DynamoDB lock table is the standard.
terraform {
  backend "s3" {
    bucket       = "acme-tf-state"
    key          = "prod/network/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # native S3 locking, no DynamoDB needed
  }
}
Then make sure local state can never be committed:
echo "*.tfstate*" >> .gitignore
echo ".terraform/" >> .gitignore
Treat state as sensitive data. It contains every resource attribute in plaintext, including database passwords and generated keys. Always enable encrypt = true and restrict bucket access by IAM policy.
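One way to back up that advice is to harden the state bucket itself. The sketch below assumes the bucket is created in a separate bootstrap configuration; the resource label state is illustrative, the bucket name is reused from the backend example, and the choice of KMS encryption and versioning is an assumption rather than something the backend requires.

```hcl
# Assumption: the state bucket is managed in its own bootstrap configuration.
resource "aws_s3_bucket" "state" {
  bucket = "acme-tf-state"
}

# Keep old state versions so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt objects at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Block every form of public access to the bucket.
resource "aws_s3_bucket_public_access_block" "state" {
  bucket = aws_s3_bucket.state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

IAM policies that limit who can read and write the state objects would sit on top of this and depend on how your accounts and roles are organized.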
Hand-editing the state file
Problem: When a deployment drifts, it is tempting to open terraform.tfstate in an editor and “fix” a resource ID by hand. State is a precise JSON structure that carries a serial number and a lineage ID alongside every resource attribute; one wrong edit corrupts it, and the next plan either errors or proposes destroying live infrastructure.
Fix: Never edit state JSON directly. Use the purpose-built CLI subcommands, which validate and version the changes for you.
# Move a resource to a new address after refactoring
terraform state mv aws_instance.web aws_instance.app
# Import an existing resource Terraform doesn't track yet
terraform import aws_s3_bucket.logs acme-app-logs
# Remove a resource from state without destroying it
terraform state rm aws_instance.legacy
Better still, prefer declarative moved and import blocks so the change is reviewed in a PR and survives across machines:
moved {
  from = aws_instance.web
  to   = aws_instance.app
}
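The import and state rm commands have declarative counterparts as well. The sketch below reuses the addresses from the CLI examples; the aws_s3_bucket body is a minimal assumption and would need to match the real bucket. import blocks need Terraform 1.5 or later and removed blocks need Terraform 1.7 or later; OpenTofu support and exact syntax may differ by release, so check its documentation.

```hcl
# Declarative equivalent of: terraform import aws_s3_bucket.logs acme-app-logs
import {
  to = aws_s3_bucket.logs
  id = "acme-app-logs"
}

# The import target must exist in configuration; this body is a minimal guess.
resource "aws_s3_bucket" "logs" {
  bucket = "acme-app-logs"
}

# Declarative equivalent of: terraform state rm aws_instance.legacy
# The resource block for aws_instance.legacy must be deleted from the configuration first.
removed {
  from = aws_instance.legacy

  lifecycle {
    destroy = false # forget the resource in state, leave the instance running
  }
}
```

Once the change has been applied everywhere, these blocks can be deleted in a follow-up PR.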
Secrets in code or state
Problem: Hardcoding password = "hunter2" puts the secret in version control forever. Even when you pass it as a variable, the value still lands in the state file in plaintext.
Fix: Source secrets from a manager at apply time and never set defaults for sensitive inputs. Mark variables sensitive so they are redacted from plan output.
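# Read the secret from Secrets Manager at plan/apply time
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db/password"
}

# If a secret must arrive as an input, declare it with no default and mark it sensitive
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "main" {
  identifier                = "prod-db"
  engine                    = "postgres"
  instance_class            = "db.t3.medium"
  allocated_storage         = 20 # example size in GiB; required for a new instance
  username                  = "appuser"
  password                  = data.aws_secretsmanager_secret_version.db.secret_string
  skip_final_snapshot       = false
  final_snapshot_identifier = "prod-db-final" # example name; required when skip_final_snapshot is false
}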
The state still stores the resolved value, so protect the backend with encryption and tight IAM — that is the real boundary.
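Sensitivity also has to be carried through to outputs. As a small sketch, assuming a hypothetical output named db_password that re-exports the variable from the example above: Terraform rejects an output that references a sensitive value unless the output is itself marked sensitive.

```hcl
# Hypothetical output; var.db_password is the sensitive variable declared earlier.
output "db_password" {
  value     = var.db_password
  sensitive = true # without this, Terraform raises an error about exposing a sensitive value
}
```

Marking the output only affects display. The value is still stored in state, and terraform output -raw or -json prints it in plaintext, so exporting a secret this way is best avoided unless a consumer really needs it.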
Using count where for_each belongs
Problem: count indexes resources by integer position. Remove the middle item from a list and every later resource shifts down by one — Terraform plans to destroy and recreate them all.
Fix: Use for_each over a map or set so each instance has a stable, name-based key.
# Fragile: removing "staging" recreates "prod"
resource "aws_ssm_parameter" "env_count" {
  count = length(var.envs)
  name  = "/config/${var.envs[count.index]}"
  type  = "String"
  value = "active"
}

# Stable: each key is independent
resource "aws_ssm_parameter" "env" {
  for_each = toset(var.envs)
  name     = "/config/${each.value}"
  type     = "String"
  value    = "active"
}
| Aspect | count | for_each |
|---|---|---|
| Addressing | [0], [1] … | ["prod"], ["staging"] |
| Stable on removal | No — indices shift | Yes — keys are independent |
| Best for | Identical, ordered copies | Distinct named instances |
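Switching an existing resource from count to for_each changes every instance address, which on its own would trigger a destroy and recreate. One possible way around that is to map each old index to its new key with moved blocks. The sketch assumes var.envs is currently ["staging", "prod"] in that order, which the examples above do not actually state, and that the count-based resource block has been deleted from the configuration.

```hcl
# Assumes var.envs = ["staging", "prod"], so index 0 was staging and index 1 was prod.
moved {
  from = aws_ssm_parameter.env_count[0]
  to   = aws_ssm_parameter.env["staging"]
}

moved {
  from = aws_ssm_parameter.env_count[1]
  to   = aws_ssm_parameter.env["prod"]
}
```

With the moves in place, the next plan should report the address changes rather than any destroy or create actions; if it does not, the index-to-key mapping is probably wrong.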
Unpinned providers and modules
Problem: Without version constraints, terraform init pulls the newest provider or module on every fresh checkout. A breaking change ships upstream and your CI starts proposing destructive plans — for code you never touched.
Fix: Pin providers with a required_providers block, commit .terraform.lock.hcl, and pin module sources to a tag or commit.
terraform {
  required_version = ">= 1.5"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60" # allow 5.60 and later 5.x releases, block 6.0
    }
  }
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.13.0" # exact pin for shared modules
}
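Registry modules accept a version argument, but modules pulled straight from Git do not; there the pin lives in the source URL as a ref. A sketch, assuming the upstream tag is named v5.13.0 and using a placeholder repository and commit SHA for an internal module (module inputs are omitted for brevity):

```hcl
# Git source pinned to a tag (tag name assumed to follow the upstream vX.Y.Z convention).
module "vpc_from_git" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git?ref=v5.13.0"
}

# Hypothetical internal module pinned to a commit; replace the URL and SHA with real ones.
module "network" {
  source = "git::https://example.com/acme/terraform-modules.git//network?ref=0123456789abcdef0123456789abcdef01234567"
}
```

A commit SHA is the stricter of the two pins, since a tag can be moved after the fact.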
Applying without reviewing the plan
Problem: Running terraform apply -auto-approve blindly is how teams accidentally drop a database. The plan is your only chance to catch a destroy before it happens.
Fix: Always generate a saved plan, review it, and apply that exact artifact. In CI, post the plan for human approval before apply.
terraform plan -out=tfplan
terraform apply tfplan
Output:
  # aws_db_instance.main must be replaced
-/+ resource "aws_db_instance" "main" {
      ~ engine_version = "15.4" -> "16.2" # forces replacement
    }

Plan: 1 to add, 0 to change, 1 to destroy.
That 1 to destroy is exactly the line a review catches, and exactly the change -auto-approve would have executed without asking.
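A review can still miss a line, so a guardrail in code is a reasonable complement. As a sketch, adding a lifecycle block to the database resource from the earlier example makes Terraform refuse any plan that would destroy it, including a replacement:

```hcl
resource "aws_db_instance" "main" {
  # ... arguments as in the earlier example ...

  lifecycle {
    prevent_destroy = true # plan fails with an error instead of proposing -/+ or destroy
  }
}
```

This only protects the resource while its block remains in the configuration; deleting the block removes the guard along with it, so it does not replace plan review.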
Click-ops drift
Problem: Someone tweaks a security group rule in the AWS console “just this once.” Terraform no longer matches reality, and the next apply silently reverts the change — or fails confusingly.
Fix: Detect drift early and decide deliberately. Run terraform plan on a schedule (with -detailed-exitcode in CI so the job can act on the result) and reconcile by importing legitimate changes or reverting unauthorized ones.
# Exit 0 = no changes, 1 = error, 2 = changes pending; treat 2 as drift and fail the scheduled job
terraform plan -detailed-exitcode
Lock down console write access for the resources Terraform owns so the only path to change them is through code review.
Best Practices
- Use a locking remote backend with encryption before you create your first real resource.
- Never edit state by hand; reach for state mv, import, and moved/import blocks instead.
- Keep secrets out of code; resolve them from a secrets manager and mark variables sensitive.
- Prefer for_each over count for any collection that can change membership.
- Pin provider and module versions and commit .terraform.lock.hcl.
- Always review a saved plan; reserve -auto-approve for ephemeral, throwaway environments.
- Run scheduled drift detection and restrict console access to Terraform-managed resources.