Configuration Drift
Terraform assumes it is the single source of truth for the infrastructure it manages. Configuration drift happens when that assumption breaks — when someone changes a real resource outside of Terraform, so the live infrastructure no longer matches what state and configuration describe. A teammate edits a security group in the AWS console, an auto-scaling event resizes a group, or an incident responder hotfixes a setting at 3 a.m. The next terraform plan will reveal a difference that nobody wrote into the .tf files, and learning to detect and reconcile that difference is essential to keeping infrastructure predictable.
How drift happens
Drift is almost always a side effect of out-of-band changes — modifications made through a channel other than terraform apply. The most common sources are:
- Manual console edits: someone toggles a setting in the AWS, Azure, or GCP web console.
- CLI or SDK scripts: an operator runs aws ec2 modify-instance-attribute directly.
- Other automation: a separate tool, a Lambda, or a provider-side default mutates the resource.
- Provider-managed changes: autoscaling, certificate rotation, or a cloud service adjusting a value on its own.
Whatever the cause, the result is the same: the cloud object’s real attributes diverge from the values recorded in state and declared in configuration.
Detecting drift with plan and refresh
Before computing changes, terraform plan performs a refresh: it queries the provider API for the current state of every managed resource and updates its in-memory copy of state. It then compares three things — your configuration, the refreshed real-world values, and the prior state — and reports any divergence.
Suppose you declared an instance with monitoring disabled:
resource "aws_instance" "api" {
ami = "ami-0c7217cdde317cfec"
instance_type = "t3.micro"
monitoring = false
vpc_security_group_ids = [aws_security_group.api.id]
tags = {
Name = "devcraftly-api"
}
}
If someone enables detailed monitoring in the console, the next plan surfaces it:
terraform plan
Output:
aws_instance.api: Refreshing state... [id=i-0abc123def4567890]
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the
last "terraform apply" which may have affected this plan:
# aws_instance.api has changed
~ resource "aws_instance" "api" {
id = "i-0abc123def4567890"
~ monitoring = false -> true
# (rest unchanged)
}
Plan: 0 to add, 1 to change, 0 to destroy.
The “Objects have changed outside of Terraform” banner is the signal that drift was detected. Terraform plans to set monitoring back to false to match your configuration.
For a read-only audit, run terraform plan -detailed-exitcode. It returns exit code 0 for no changes, 2 when a diff (including drift) exists, and 1 on error, which makes it well suited to a scheduled CI job that alerts when reality has drifted.
You can inspect drift without producing a change plan using -refresh-only. terraform plan -refresh-only shows what moved without proposing to alter any resource, and terraform apply -refresh-only writes those refreshed values into state once you approve them:
terraform plan -refresh-only
terraform apply -refresh-only
OpenTofu behaves identically here — tofu plan, -detailed-exitcode, and -refresh-only all work the same way.
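When drift should be reviewed before it is recorded, the two refresh-only commands can be chained through a saved plan file. The following is a minimal sketch rather than a definitive workflow: the function names preview_drift and record_drift, the TF_BIN variable, and the default plan file name are all hypothetical, and it assumes the saved refresh-only plan is applied unchanged.

```shell
#!/usr/bin/env bash
# Sketch: review drift first, then record it in state without changing any resource.
# preview_drift, record_drift and TF_BIN are illustrative names, not part of Terraform.

# Compute a refresh-only plan, save it, and print it for review.
preview_drift() {
  local tf="${TF_BIN:-terraform}" plan_file="${1:-refresh.tfplan}"
  "$tf" plan -refresh-only -input=false -out="$plan_file" || return 1
  "$tf" show "$plan_file"
}

# Apply the saved plan. Because it was created with -refresh-only,
# this is expected to update state only, not the real resources.
record_drift() {
  local tf="${TF_BIN:-terraform}" plan_file="${1:-refresh.tfplan}"
  "$tf" apply -input=false "$plan_file"
}
```

Running preview_drift, reading the output, and only then calling record_drift keeps a human decision between detecting drift and accepting it. Setting TF_BIN=tofu should let the same sketch run against OpenTofu.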
Reconciling drift
Once you know infrastructure has drifted, you have three valid responses, and choosing the right one depends on whether the out-of-band change was a mistake, an improvement, or a value that is legitimately managed elsewhere.
| Strategy | When to use it | How |
|---|---|---|
| Revert to configuration | The manual change was unwanted; your .tf files are still correct. | Run terraform apply to push the resource back to the declared values. |
| Adopt the change | The manual change is desirable and should be permanent. | Update your .tf files to match reality, then apply so config and state agree. |
| Accept new external values | A field is now provider-managed (e.g. autoscaling capacity). | Use terraform apply -refresh-only to record reality, and ignore_changes to stop fighting it. |
For the third case, lifecycle.ignore_changes tells Terraform to leave a given attribute alone even when it drifts:
resource "aws_autoscaling_group" "workers" {
name = "devcraftly-workers"
min_size = 2
max_size = 10
desired_capacity = 2
lifecycle {
ignore_changes = [desired_capacity]
}
}
Here an autoscaler legitimately changes desired_capacity, so Terraform stops treating that as drift while still managing the rest of the group.
Preventing drift
The most reliable cure is to make out-of-band changes impossible, or at least painful enough that nobody reaches for them. The core idea is to lock down direct access so that terraform apply is the only path to change.
- Restrict console and write API access with IAM. Grant humans read-only access in production and reserve mutating permissions for the Terraform execution role used by CI.
- Apply only through CI/CD so every change is reviewed, planned, and recorded in version control.
- Schedule drift detection: a nightly terraform plan -detailed-exitcode that alerts on exit code 2.
terraform plan -detailed-exitcode -out=drift.tfplan
echo "exit code: $?" # 2 means drift was found
Output:
No changes. Your infrastructure matches the configuration.
exit code: 0
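The nightly check above could be wrapped in a small script that turns the exit code into a status a scheduler can act on. This is a minimal sketch under stated assumptions: classify_plan_exit, run_drift_check, and the TF_BIN variable are hypothetical names, and the alerting itself is left to whatever the CI system does with a failed job.

```shell
#!/usr/bin/env bash
# Sketch of a scheduled drift check. TF_BIN, classify_plan_exit and
# run_drift_check are illustrative names, not part of Terraform.

# Map a `terraform plan -detailed-exitcode` status to a label.
classify_plan_exit() {
  case "$1" in
    0) echo "clean" ;;  # no changes
    2) echo "drift" ;;  # a diff exists (drift or unapplied config changes)
    *) echo "error" ;;  # 1, or anything unexpected
  esac
}

# Run the plan, send its output to stderr, and print only the label on stdout.
# Returns 0 only when the plan reports no changes.
run_drift_check() {
  local tf="${TF_BIN:-terraform}" code=0 status
  "$tf" plan -detailed-exitcode -input=false >&2 || code=$?
  status="$(classify_plan_exit "$code")"
  echo "$status"
  [ "$status" = "clean" ]
}
```

A CI job could call run_drift_check as its last step, so a non-zero return fails the job and surfaces as an alert. Note that exit code 2 cannot distinguish drift from configuration changes that were merged but not yet applied, so the check is most meaningful on a branch whose configuration has already been applied.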
Best practices
- Run terraform plan before every apply and read the “changes made outside of Terraform” section carefully.
- Automate drift detection in CI with -detailed-exitcode so divergence is caught within hours, not weeks.
- Decide deliberately whether to revert drift or codify it; never let an unexplained diff sit unresolved across runs.
- Lock down provider write access with IAM so terraform apply is the only sanctioned way to change infrastructure.
- Use lifecycle { ignore_changes = [...] } for fields that are legitimately managed outside Terraform (autoscaling capacity, rotated secrets).
- Keep state in a remote backend with locking so concurrent applies can’t introduce their own drift.