Max's notebook

A collection of sorts


Some Thoughts on Terraform CI for Monorepos

08 Jan 2022

Continuous integration and deployment for Terraform monorepos is not a solved problem. I’m not proposing to solve it, but this is a record of my thoughts and experiments.

As an aside, CI for stand-alone Terraform repos is, in fact, a very solved problem. See “Automate Terraform with GitHub Actions” in the References section for a very approachable example.

The Problem Space

We have a repo storing multiple Terraform configurations or stacks (databases, users, Kubernetes clusters, etc.), and we want to organize it in a way that supports continuous integration and deployment.

Constraints

Using Terraform

To do this using Terraform, we had two options:

Using Terragrunt

Terragrunt allows full access to all of Terraform's features while helping to address some of these concerns. We maintain one workflow per level of abstraction and use terragrunt run-all to plan and apply resources, while dynamic backend generation ensures separate state for each module. This is the route we chose in the end.
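
As a rough sketch of what dynamic backend generation can look like, an environment-level terragrunt.hcl (for example prod/terragrunt.hcl) might contain a remote_state block like the one below. The S3 backend, bucket name, region, and lock table are assumptions made for illustration; the post does not depend on any particular backend.

```hcl
# prod/terragrunt.hcl (sketch; backend type and all names below are assumptions)
remote_state {
  backend = "s3"

  # Have Terragrunt write a backend.tf into each stack before running Terraform.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-terraform-state-prod" # hypothetical bucket
    region         = "us-east-1"                    # hypothetical region
    dynamodb_table = "example-terraform-locks"      # hypothetical lock table
    encrypt        = true

    # path_relative_to_include() resolves to the stack's directory relative to
    # this file (e.g. "app1"), so every stack gets its own state key.
    key = "${path_relative_to_include()}/terraform.tfstate"
  }
}
```

With something like this in place, running terragrunt run-all plan from the environment directory plans every stack under it, each against its own state file.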

Final repo structure

├── prod
│   ├── app1
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   ├── app2
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   ├── cache
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   ├── database
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   └── terragrunt.hcl
└── staging
    ├── app1
    │   ├── main.tf
    │   └── terragrunt.hcl
    ├── app2
    │   ├── main.tf
    │   └── terragrunt.hcl
    ├── cache
    │   ├── main.tf
    │   └── terragrunt.hcl
    ├── database
    │   ├── main.tf
    │   └── terragrunt.hcl
    └── terragrunt.hcl
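
For illustration, a stack-level terragrunt.hcl in this layout can be very small, since it mostly pulls in the environment-level file above it. The sketch below assumes a Terragrunt version where find_in_parent_folders() with no arguments locates the nearest parent terragrunt.hcl; the input name is hypothetical.

```hcl
# prod/app1/terragrunt.hcl (sketch)
# Inherit the environment-level configuration (backend, providers, shared inputs).
include "root" {
  # find_in_parent_folders() walks up the directory tree and returns the path
  # to the nearest parent terragrunt.hcl, here prod/terragrunt.hcl.
  path = find_in_parent_folders()
}

# Stack-specific inputs are passed to main.tf as Terraform variables.
inputs = {
  instance_count = 2 # hypothetical variable, for illustration only
}
```

The staging copy of the same stack would look identical apart from its inputs, which keeps the per-environment differences in one obvious place.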

What Worked Well

Having a single point of entry makes it very easy to understand the changes getting rolled out to each environment, and we can (and do) include an environment-level plan for drift detection during code review. Also, not needing to manage backend or provider configurations for each stack by hand is really nice.
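
To make the provider point concrete, here is a minimal sketch of a generate block in the environment-level terragrunt.hcl. The AWS provider and region are assumptions chosen for the example, not a statement about the providers we actually use.

```hcl
# prod/terragrunt.hcl (sketch; provider and region are assumptions)
# Terragrunt writes this file into every stack that includes this config,
# so individual stacks never declare the provider themselves.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
}
EOF
}
```

Changing the provider configuration for a whole environment then becomes a one-file change that shows up in every stack's plan.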

What Didn’t Work so Well

There’s no consistent path for a stack to get from staging to production. I really like Kief Morris’ pipeline-per-stack model (read more about it in the “Using Pipelines to Manage Environments with Infrastructure as Code” article, linked in the References section), but I found a few drawbacks for our use case:

What’s Next?

Terragrunt has many other features that I’m excited to explore, such as dependency blocks, which provide more in-depth configuration options (and remove the need for many data blocks). On the CI front, I’m eagerly awaiting updates on HashiCorp’s testing experiment, and whatever usability improvements we come across as we put more and more pressure on the current CI pattern.
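
As a sketch of the dependency idea, a stack such as app1 could read an output from the database stack directly instead of looking it up through a data block. The output name, variable name, and mock value below are hypothetical.

```hcl
# prod/app1/terragrunt.hcl (sketch; output and variable names are hypothetical)
dependency "database" {
  config_path = "../database"

  # Placeholder values used when the database stack has not been applied yet,
  # restricted to commands that do not change real infrastructure.
  mock_outputs = {
    endpoint = "mock-db.example.internal"
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  # Takes the place of a data block or remote state lookup in main.tf.
  database_endpoint = dependency.database.outputs.endpoint
}
```

A side benefit is that terragrunt run-all uses these blocks to work out ordering, so the database stack would be applied before app1.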

References
