Terraform Best Practices

Terraform makes it easy to provision infrastructure, but it does not stop you from building something fragile, insecure, or impossible to scale across a team. The practices below are the ones that separate a hobby project from production-grade infrastructure as code: they keep state safe, make changes reviewable, and let many engineers work on the same estate without stepping on each other. Most of this applies equally to Terraform 1.5+ and OpenTofu, which share the same HCL2 language and workflow; the S3-native locking example requires Terraform 1.10 or later.

Manage state remotely, locked, and encrypted

The state file is the single most important, and most dangerous, artifact Terraform produces. It maps your configuration to real-world resources and frequently contains secrets in plaintext. Never commit it to Git and never keep it only on a laptop.

Store state in a remote backend that supports locking so two concurrent apply runs cannot corrupt it, and ensure the backend encrypts data at rest. On AWS, the S3 backend with native lock files (the use_lockfile argument, available from Terraform 1.10) is a common choice; older versions rely on a DynamoDB table for locking.

terraform {
  required_version = ">= 1.10"

  backend "s3" {
    bucket       = "acme-tf-state-prod"
    key          = "network/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # S3-native state locking, no DynamoDB needed
  }
}

Treat state as sensitive data. Restrict bucket access with IAM, enable versioning so you can recover from a bad apply, and audit who can read it.
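The bucket-level protections described above could look roughly like the sketch below. It assumes the state bucket is itself managed by Terraform, usually from a small, separate bootstrap configuration; the resource labels and the choice of KMS encryption are illustrative, not requirements.

```hcl
# Sketch: hardening the bucket that holds Terraform state.
# The label "tf_state" and the aws:kms choice are assumptions for illustration.
resource "aws_s3_bucket" "tf_state" {
  bucket = "acme-tf-state-prod"
}

# Versioning keeps earlier state objects so a bad apply can be rolled back.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Default server-side encryption for every object written to the bucket.
resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State should never be publicly reachable.
resource "aws_s3_bucket_public_access_block" "tf_state" {
  bucket                  = aws_s3_bucket.tf_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

IAM policies that limit who can read and write the bucket would sit alongside these resources and depend on how your accounts and roles are organized.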

Structure code into modules and environments

Keep root configurations thin and push reusable logic into modules. A module should have a single responsibility and a clean input/output surface. Separate each environment (dev, staging, prod) so a change to dev can never accidentally target prod.

.
├── modules/
│   ├── vpc/
│   └── ecs-service/
└── environments/
    ├── dev/
    │   ├── main.tf
    │   └── terraform.tfvars
    └── prod/
        ├── main.tf
        └── terraform.tfvars

Reference shared modules with versioned sources rather than copy-pasting, and give every environment its own state key so blast radius is contained.
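As a rough sketch of how an environment root might combine a versioned module source with its own state key: the repository URL, the tag, the state key, and the module inputs below are all hypothetical placeholders.

```hcl
# environments/prod/main.tf (sketch; URL, tag, key, and inputs are hypothetical)
terraform {
  backend "s3" {
    bucket       = "acme-tf-state-prod"
    key          = "prod/network/terraform.tfstate" # dev uses a different key
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true
  }
}

module "vpc" {
  # Pinned to a Git tag so each environment opts in to module upgrades explicitly.
  source = "git::https://example.com/acme/terraform-modules.git//modules/vpc?ref=v3.2.1"

  # Hypothetical inputs; the real module defines its own variables.
  name       = "prod-checkout"
  cidr_block = "10.20.0.0/16"
}
```

The dev root would look the same apart from its state key, its inputs, and possibly a newer module tag, which lets you trial a module upgrade in dev before promoting it.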

Handle variables and secrets correctly

Declare every input variable with a type, a description, and validation where it helps. Use .tfvars files per environment for non-secret configuration, and keep secrets out of variables entirely by pulling them from a secret manager at plan time. Values read this way are still written to state, which is one more reason to lock the state down.

variable "instance_count" {
  description = "Number of app instances to run."
  type        = number
  default     = 2

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 10
    error_message = "instance_count must be between 1 and 10."
  }
}

data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/app/db-password"
}

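resource "aws_db_instance" "app" {
  identifier        = "app-prod"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 20 # GiB; required for a new instance, size is illustrative
  username          = "appuser"
  # Read from Secrets Manager at plan time instead of passing through a variable.
  password          = data.aws_secretsmanager_secret_version.db.secret_string
}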

Mark outputs that expose secrets as sensitive = true so Terraform redacts them in plan and apply output. The values are still stored in state, so this protects logs and terminals, not the state file.
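A minimal sketch of a sensitive output, assuming the aws_db_instance.app resource shown earlier; the output name is hypothetical.

```hcl
# Sketch: exposing a secret-bearing value as an output.
# The output name "db_password" is an assumption for illustration.
output "db_password" {
  description = "Master password for the app database."
  value       = aws_db_instance.app.password
  sensitive   = true # shown as (sensitive value) in plan and apply output
}
```

Because the provider already marks this attribute as sensitive, Terraform reports an error if the output is declared without sensitive = true, which makes accidental exposure harder.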

Make the workflow safe: plan, review, apply

Never run apply blind. Generate a plan, review it, and apply that exact plan file so what you reviewed is what gets executed. In automation, run plan on every pull request and gate apply behind a merge to the main branch.

terraform plan -out=tfplan
terraform apply tfplan

Abridged output of the two commands:

Terraform will perform the following actions:

  # aws_db_instance.app will be created
  + resource "aws_db_instance" "app" {
      + engine         = "postgres"
      + identifier     = "app-prod"
      + instance_class = "db.t3.medium"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

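Always run terraform fmt -check -recursive and terraform validate in CI to catch style drift and configuration errors before review. Without -check, terraform fmt rewrites files in place and exits successfully, so the pipeline never fails on unformatted code.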

Naming and tagging conventions

Adopt one naming convention and enforce it everywhere: lowercase, hyphen-separated, and prefixed with environment and application. Apply a consistent set of tags through provider default_tags so cost allocation and ownership are automatic.

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Environment = "prod"
      Application = "checkout"
      ManagedBy   = "terraform"
      Owner       = "payments-team"
    }
  }
}
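One way to enforce the naming convention, rather than just documenting it, is to build names from validated variables. The sketch below assumes variables named environment and application and a hypothetical assets bucket.

```hcl
# Sketch: deriving resource names from validated inputs.
# Variable names, allowed environments, and the bucket are assumptions.
variable "environment" {
  description = "Deployment environment."
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of dev, staging, or prod."
  }
}

variable "application" {
  description = "Application name: lowercase letters and digits, hyphen-separated."
  type        = string

  validation {
    condition     = can(regex("^[a-z0-9]+(-[a-z0-9]+)*$", var.application))
    error_message = "application must be lowercase and hyphen-separated."
  }
}

locals {
  # Every resource name starts with <environment>-<application>.
  name_prefix = "${var.environment}-${var.application}"
}

resource "aws_s3_bucket" "assets" {
  bucket = "${local.name_prefix}-assets"
}
```

With environment = "prod" and application = "checkout", the bucket is named prod-checkout-assets, and it picks up the default_tags from the provider block without any per-resource tag arguments.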

Pin versions for reproducibility

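Unpinned providers and modules turn a routine apply into a surprise upgrade. Constrain Terraform, providers, and module sources so the same code produces the same plan months later. Commit the .terraform.lock.hcl lock file, which records the exact provider versions and checksums that were selected. It does not cover modules, so pin those explicitly.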

Pin target	Where	Example
Terraform/OpenTofu	`required_version`	`">= 1.10, < 2.0"`
Provider	`required_providers`	`version = "~> 5.0"`
Module	`source` + `version`	`version = "3.2.1"`

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
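Putting the three pin targets from the table together might look like the following sketch. The registry address for the module is a hypothetical private-registry path, and the module's inputs are omitted.

```hcl
# Sketch: all three pin targets in one root configuration.
terraform {
  required_version = ">= 1.10, < 2.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # any 5.x release, never 6.0
    }
  }
}

module "vpc" {
  source  = "app.terraform.io/acme/vpc/aws" # hypothetical registry address
  version = "3.2.1"                         # exact pin; upgrades are deliberate
}
```

The version argument only works for modules installed from a registry. For Git sources, pin with a ref query parameter on the source URL instead, pointing at a tag or commit.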

Build security in from the start

Grant only least-privilege IAM permissions to the credentials Terraform uses, and scan configurations for misconfigurations before they reach production. Tools like tfsec (now being folded into Trivy), Checkov, and policy engines such as OPA or Sentinel catch open security groups, unencrypted volumes, and public buckets in CI.

tfsec .
checkov -d .

Avoid hardcoded credentials, prefer short-lived OIDC-based authentication in pipelines, and enable deletion protection on stateful resources.
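For stateful resources, deletion protection can be layered at both the provider and the Terraform level. The sketch below is a hardened variant of the earlier database example; the storage size is an assumed value.

```hcl
# Sketch: guarding a stateful resource against accidental destruction.
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/app/db-password"
}

resource "aws_db_instance" "app" {
  identifier        = "app-prod"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 20 # assumed size in GiB
  username          = "appuser"
  password          = data.aws_secretsmanager_secret_version.db.secret_string

  storage_encrypted   = true  # encrypt data at rest
  publicly_accessible = false # reachable only from inside the VPC
  deletion_protection = true  # the RDS API rejects delete calls while this is set

  lifecycle {
    # Terraform refuses to produce a plan that would destroy this resource.
    prevent_destroy = true
  }
}
```

The two settings cover different failure modes: deletion_protection blocks deletes from any client, including the console, while prevent_destroy stops a Terraform change from planning the destroy in the first place.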

Best Practices

Keep state remote, locked, encrypted, and versioned, and never in Git.
Split code into single-responsibility modules and isolate every environment’s state.
Pin Terraform, provider, and module versions and commit the lock file.
Pull secrets from a manager at runtime and mark sensitive outputs.
Require a reviewed plan before any apply, automated in CI/CD.
Enforce naming and tagging via default_tags and scan with tfsec/Checkov on every change.

Common Mistakes