Immutable Infrastructure
Immutable infrastructure is the practice of never modifying a running server after it is provisioned. Instead of SSHing in to patch a package or edit a config, you build a fresh image, launch new instances from it, and discard the old ones. This eliminates configuration drift, makes rollbacks trivial, and means the infrastructure you test is byte-for-byte the infrastructure you ship. Terraform pairs naturally with this model: it declares the desired end state and replaces resources whose inputs have changed.
Mutable versus immutable
In a mutable model, a long-lived server accumulates changes over its lifetime — manual hotfixes, ad-hoc package upgrades, drifting kernel versions. Two machines that started identical slowly diverge, and reproducing a bug becomes guesswork. The immutable model treats servers as disposable: any change produces a new artifact, and the old one is destroyed.
| Concern | Mutable (in-place) | Immutable (replace) |
|---|---|---|
| Apply a change | Patch the live host | Build new image, swap instances |
| Configuration drift | Accumulates over time | Avoided by design: hosts are never edited |
| Rollback | Reverse the change manually | Re-deploy the previous image |
| Reproducibility | Hard — state is implicit | Exact — image is the source of truth |
| Debugging | Inspect the broken host | Inspect the build that produced it |
Golden images with Packer
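A golden image is a pre-baked machine image containing the OS, runtime, dependencies, and application code: everything needed to boot a ready-to-serve instance. HashiCorp Packer builds these images from a declarative template, so the same definition yields an equivalent AMI on every run. Builds are only as repeatable as their inputs, though: the template below takes the newest base AMI and the newest nginx package, so pin both if you need identical results.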
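    # app.pkr.hcl
    packer {
      required_plugins {
        amazon = {
          version = ">= 1.3.0"
          source  = "github.com/hashicorp/amazon"
        }
      }
    }

    source "amazon-ebs" "app" {
      ami_name      = "devcraftly-app-{{timestamp}}"
      instance_type = "t3.micro"
      region        = "us-east-1"

      # Start from the newest Amazon Linux 2023 x86_64 image published by Amazon.
      source_ami_filter {
        filters = {
          name                = "al2023-ami-*-x86_64"
          virtualization-type = "hvm"
          root-device-type    = "ebs"
        }
        owners      = ["amazon"]
        most_recent = true
      }

      ssh_username = "ec2-user"
    }

    build {
      sources = ["source.amazon-ebs.app"]

      # Install and enable nginx, then create the staging directory:
      # the file provisioner expects the destination directory to exist.
      provisioner "shell" {
        inline = [
          "sudo dnf install -y nginx",
          "sudo systemctl enable nginx",
          "mkdir -p /tmp/app",
        ]
      }

      # Upload the contents of ./dist/ into the staging directory.
      provisioner "file" {
        source      = "./dist/"
        destination = "/tmp/app"
      }

      # Move the staged files out of /tmp so they persist in the image.
      # Assumption: ./dist/ is a static site served from nginx's default web root.
      provisioner "shell" {
        inline = ["sudo cp -r /tmp/app/. /usr/share/nginx/html/"]
      }
    }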
Install the required plugin, then build the image once and capture the resulting AMI ID:
    packer init app.pkr.hcl
    packer build app.pkr.hcl
Output:
    ==> amazon-ebs.app: Creating AMI devcraftly-app-1718323200 from instance i-0a1b2c3d4e5f
    ==> amazon-ebs.app: AMI: ami-0fe1c2d3b4a5e6f70
    Build 'amazon-ebs.app' finished after 4 minutes 12 seconds.

    ==> Builds finished. The artifacts of successful builds are:
    --> amazon-ebs.app: AMIs were created:
    us-east-1: ami-0fe1c2d3b4a5e6f70
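If a pipeline needs the AMI ID in machine-readable form rather than scraped from the log, one option is Packer's manifest post-processor. The sketch below assumes it is added to the build block of the template above; the file name packer-manifest.json is an arbitrary choice for illustration, not something the original workflow specifies.

```hcl
build {
  sources = ["source.amazon-ebs.app"]

  # Provisioners from the template above go here, unchanged.

  # Write a JSON record of every artifact this build produces.
  # The output file name is an assumption chosen for this example.
  post-processor "manifest" {
    output     = "packer-manifest.json"
    strip_path = true
  }
}
```

Each entry in the manifest's builds list carries an artifact_id of the form region:ami-id, which a later pipeline step can parse.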
Terraform then consumes that image. Looking it up by name pattern keeps the workflow decoupled: Packer publishes, Terraform discovers.
    data "aws_ami" "app" {
      most_recent = true
      owners      = ["self"]

      filter {
        name   = "name"
        values = ["devcraftly-app-*"]
      }
    }

    resource "aws_launch_template" "app" {
      name_prefix   = "app-"
      image_id      = data.aws_ami.app.id
      instance_type = "t3.micro"
    }
Replacing instead of mutating
When a change forces Terraform to replace a resource, its default behaviour is to destroy the old one and then create the new one, which leaves a brief window where nothing is serving traffic. For anything fronting users, invert that with create_before_destroy so the replacement is up before the old resource is torn down.
    resource "aws_autoscaling_group" "app" {
      name_prefix         = "app-"
      min_size            = 3
      max_size            = 9
      desired_capacity    = 3
      vpc_zone_identifier = var.private_subnet_ids

      launch_template {
        id      = aws_launch_template.app.id
        version = aws_launch_template.app.latest_version
      }

      instance_refresh {
        strategy = "Rolling"
        preferences {
          min_healthy_percentage = 90
        }
      }

      lifecycle {
        create_before_destroy = true
      }
    }
Because the ASG uses name_prefix rather than a fixed name, Terraform can stand up a new group alongside the old one; with a fixed name, the create step would collide with the group that still exists. The instance_refresh block rolls instances onto the new launch template version gradually, keeping at least 90% of capacity healthy throughout.
Tip: create_before_destroy propagates to dependencies. Terraform implicitly applies it to every resource that a create_before_destroy resource depends on, so those resources must also tolerate two copies existing at once (for example, by using name_prefix). Declaring the lifecycle block explicitly down the dependency chain keeps that behaviour visible.
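The launch template that the group depends on is one place to state the setting explicitly. A minimal sketch, reusing the resource defined earlier and adding only the lifecycle block:

```hcl
resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = data.aws_ami.app.id
  instance_type = "t3.micro"

  # Declared explicitly so the replacement order is visible in the code
  # rather than inherited implicitly from the autoscaling group.
  lifecycle {
    create_before_destroy = true
  }
}
```

Because the template already uses name_prefix, a replacement template can coexist with the old one until the group has switched over.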
A plan after a new image build shows the change clearly:
    terraform plan
Output:
      # aws_launch_template.app will be updated in-place
      ~ resource "aws_launch_template" "app" {
          ~ image_id       = "ami-0aa11bb22cc33dd44" -> "ami-0fe1c2d3b4a5e6f70"
          ~ latest_version = 7 -> 8
        }

    Plan: 0 to add, 1 to change, 0 to destroy.
The launch template version bumps, and the ASG’s instance_refresh carries the change to running instances on the next apply.
Note: This workflow runs unchanged on OpenTofu. The aws_ami data source and the aws_launch_template resource come from the AWS provider, which OpenTofu supports, and lifecycle is a core meta-argument rather than a Terraform-specific extension.
Why it improves reliability
Because every deploy boots a known image, the gap between staging and production collapses. Failed deploys are recovered by pointing the launch template back at the previous AMI and applying — no forensic surgery on a wedged host. Capacity scales horizontally from a single trusted artifact, and security patches ship as new images rather than fleet-wide live edits, so an interrupted patch can never leave half-configured machines.
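Rolling back by pointing the launch template at the previous AMI needs some way to override the most_recent lookup. One possible sketch is an optional variable, here called app_ami_id (a hypothetical name, not part of the original configuration), that pins an exact image and falls back to the data source when unset:

```hcl
# Hypothetical variable: leave null to track the newest image,
# or set it to a known-good AMI ID to pin or roll back.
variable "app_ami_id" {
  type        = string
  default     = null
  description = "Exact AMI ID to deploy; null selects the newest matching image."
}

locals {
  # coalesce returns its first argument that is neither null nor an empty string.
  app_image_id = coalesce(var.app_ami_id, data.aws_ami.app.id)
}

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = local.app_image_id
  instance_type = "t3.micro"
}
```

Under these assumptions, running terraform apply -var 'app_ami_id=ami-0aa11bb22cc33dd44' would move the launch template back to the previous image from the plan output shown earlier, and unsetting the variable would resume tracking the newest build.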
Best Practices
- Bake everything into the image at build time; reserve runtime user-data for small, environment-specific values like the region or the location of a secret, not the secret itself, since user-data is not encrypted.
- Tag and version every image (timestamp or git SHA) so Terraform can pin or roll back to an exact artifact.
- Use name_prefix over fixed names on launch templates and ASGs to enable create_before_destroy.
- Drive replacements through instance_refresh or rolling deployments so capacity stays healthy during a swap.
- Never terraform apply -replace a production host as a fix; rebuild the image and let the normal pipeline roll it out.
- Keep Packer templates in version control alongside Terraform so the image definition and its consumers evolve together.
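The tagging practice above can be sketched in the Packer source block. The git_sha variable and the tag keys Name and GitSha are assumptions chosen for illustration; the rest mirrors the template shown earlier.

```hcl
# Hypothetical input: the commit the image was built from, passed as
#   packer build -var "git_sha=<commit>" app.pkr.hcl
variable "git_sha" {
  type = string
}

source "amazon-ebs" "app" {
  ami_name      = "devcraftly-app-{{timestamp}}"
  instance_type = "t3.micro"
  region        = "us-east-1"
  ssh_username  = "ec2-user"

  source_ami_filter {
    filters = {
      name                = "al2023-ami-*-x86_64"
      virtualization-type = "hvm"
      root-device-type    = "ebs"
    }
    owners      = ["amazon"]
    most_recent = true
  }

  # Tags applied to the resulting AMI; the keys are example names.
  tags = {
    Name   = "devcraftly-app"
    GitSha = var.git_sha
  }
}
```

On the Terraform side, an aws_ami filter named tag:GitSha could then select the image built from a specific commit instead of the newest one.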