Terraform feels simple at first. One file, a few resources, terraform apply and you're done. Then you add a second environment, a third AWS account, a fourth team member — and suddenly everything is broken and nobody knows why.
Module structure that scales
We use a three-layer module architecture. Foundation modules (VPC, IAM roles, DNS zones) are versioned separately and consumed by application modules. Application modules (ECS services, RDS clusters, load balancers) are composed into environment modules (dev, staging, production). Each layer has its own state file.
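As a rough sketch of how one layer might consume another when each has its own state file, an application-layer configuration can read the foundation layer's outputs through the terraform_remote_state data source. The bucket name, state key, repository URL, version tag, and output names below are all hypothetical; substitute whatever your foundation modules actually publish.

```hcl
# environments/staging/app/main.tf (hypothetical path)

# Read the outputs that the foundation layer wrote to its own state file.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "example-terraform-state"           # hypothetical bucket
    key    = "staging/network/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

# Application module pinned to a released version of its own repository.
module "api_service" {
  source = "git::https://example.com/infra/terraform-ecs-service.git?ref=v1.4.0" # hypothetical repo and tag

  name = "api"

  # Assumes the foundation layer declares outputs with these names.
  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

Pinning the module source to a tag is what lets the foundation and application layers be released on separate schedules.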
State management
Never use local state in a team. Use remote state from day one — S3 with DynamoDB locking for AWS. One state file per environment per component. The more you put in one state file, the larger your blast radius when something goes wrong. A terraform apply that touches 200 resources is terrifying. One that touches 12 is fine.
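A minimal backend block for this setup might look like the following. The bucket, key, and table names are placeholders, and the key layout (environment, then component) is just one way to get one state file per environment per component.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"           # hypothetical bucket
    key            = "staging/network/terraform.tfstate" # <environment>/<component>
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"                   # hypothetical lock table
    encrypt        = true                                # server-side encryption for the state object
  }
}
```

The DynamoDB table needs a string partition key named LockID. Recent Terraform releases also offer S3-native locking, so check the S3 backend documentation for the version you run before settling on a locking mechanism.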
Workspaces are not environments
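Terraform workspaces are widely misused for environment separation. Every workspace shares the same code and the same backend configuration, and the only thing separating dev from production is whichever workspace happens to be selected in your shell. That makes it easy to accidentally apply dev changes to production. Use separate directories with separate state files for separate environments. The duplication is worth the isolation.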
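One way the directory-per-environment approach could look for production is sketched below. The paths, names, and account ID are hypothetical. The allowed_account_ids argument on the AWS provider adds a second guard: the run fails if the credentials in use belong to a different account.

```hcl
# environments/production/main.tf (hypothetical path)
# environments/dev/main.tf would be a sibling file with its own key,
# account ID, and input values.

terraform {
  backend "s3" {
    bucket         = "example-terraform-state"          # hypothetical bucket
    key            = "production/app/terraform.tfstate" # state is specific to this directory
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"                  # hypothetical lock table
  }
}

provider "aws" {
  region              = "us-east-1"
  allowed_account_ids = ["111111111111"] # hypothetical production account ID
}

module "app" {
  source = "../../modules/app" # hypothetical shared module

  environment   = "production"
  instance_type = "m5.large"
}
```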
Variable validation
Use validation blocks in variable declarations. Enforce CIDR ranges, instance type allowlists, and naming conventions at the Terraform layer, where they are checked during plan, not discovered after a failed apply. Fail fast with a clear error message rather than a cryptic AWS API error.
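The three kinds of check mentioned above could be sketched like this. The variable names, the allowlist, and the naming rule are illustrative assumptions, not a recommended policy. The startswith function requires Terraform 1.3 or later.

```hcl
variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the VPC."

  validation {
    # cidrhost fails on malformed CIDRs; can() turns that failure into false.
    condition     = can(cidrhost(var.vpc_cidr, 0)) && startswith(var.vpc_cidr, "10.")
    error_message = "The vpc_cidr value must be a valid IPv4 CIDR inside 10.0.0.0/8."
  }
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type for the service."

  validation {
    condition     = contains(["t3.medium", "m5.large", "m5.xlarge"], var.instance_type)
    error_message = "The instance_type value must be one of: t3.medium, m5.large, m5.xlarge."
  }
}

variable "service_name" {
  type        = string
  description = "Short name used as a prefix for resource names."

  validation {
    # One leading lowercase letter plus 2 to 30 more characters: 3 to 31 in total.
    condition     = can(regex("^[a-z][a-z0-9-]{2,30}$", var.service_name))
    error_message = "The service_name value must be 3 to 31 lowercase letters, digits, or hyphens, starting with a letter."
  }
}
```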
Plan in CI, apply with approval
Every pull request generates a terraform plan output posted as a PR comment. Engineers review the plan before merging. After merge, Atlantis or Terraform Cloud applies the plan automatically. The human approval step is the PR review — not a manual CLI command.
Drift detection
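Schedule terraform plan runs daily against every environment. Use the -detailed-exitcode flag so the job can tell the outcomes apart: exit code 0 means no changes, 1 means the plan failed, and 2 means the plan is non-empty. If everything on the main branch has been applied and the plan is still non-empty, someone changed infrastructure outside of Terraform. Alert on this. Drift is how security incidents start: a manually created IAM role that nobody tracks, a security group rule added in the console.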
Secrets
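Keep secrets out of Terraform code, variable files, and, wherever you can, state. Store them in AWS Secrets Manager or Parameter Store. Data sources can read them at apply time, which keeps them out of your repository, but any value a data source returns is still written to state in plain text. Where possible, pass only the secret's name or ARN through Terraform and let the consuming service fetch the value at runtime. Terraform state is readable by anyone with state bucket access, so encrypt it, restrict access to it, and treat it as a potential attack surface.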
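As one possible pattern for keeping the secret value out of state, the sketch below looks up only the secret's metadata and hands its ARN to an ECS task definition, so ECS fetches the value when the container starts. Note that this uses the aws_secretsmanager_secret data source, not aws_secretsmanager_secret_version, because the latter copies the secret string into state. The secret name, image, and variable are hypothetical.

```hcl
variable "execution_role_arn" {
  type        = string
  description = "Task execution role; it must be allowed to read the secret." # hypothetical input
}

# Metadata only: exposes the ARN, never the secret value.
data "aws_secretsmanager_secret" "db_password" {
  name = "production/api/db-password" # hypothetical secret name
}

resource "aws_ecs_task_definition" "api" {
  family                   = "api"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([
    {
      name      = "api"
      image     = "example.com/api:1.0.0" # hypothetical image
      essential = true

      # ECS resolves the ARN at container start and injects the value
      # as the DB_PASSWORD environment variable.
      secrets = [
        {
          name      = "DB_PASSWORD"
          valueFrom = data.aws_secretsmanager_secret.db_password.arn
        }
      ]
    }
  ])
}
```

With this shape, state contains the ARN of the secret but not its contents.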