Terraform is an extremely powerful tool. It can be a lot of fun and truly empowering, but like any tool it can be misused (usually unintentionally) in ways that lead to tremendous toil and calamity. One of those ways is the slow and gradual birthing of a Mega-Module: a root module so huge that a single run of Terraform Apply can take hours.

I know what you are expecting to hear from me: “Surely a Terraform expert such as yourself has never committed such a horrendous offense! Surely the author of ‘Mastering Terraform’ would have the foresight to avoid such a quagmire!” But you’d be wrong!

Nobody starts out as a master of anything (unless you’re Rey Palpatine, I guess). Most of us mere mortals can only hope to achieve mastery through practice and the well-earned scars we pick up along the way.

One of the most euphoric experiences when using Terraform is to write the code for your entire solution and, in one run of Terraform Apply, have a fully working application. The idea of starting from absolutely nothing and achieving greatness is somehow ingrained in the human psyche. It’s the classic “rags to riches” story, but Cloud Architect edition. The outcome is very alluring, and while things are small and simple, it’s okay. It’s when things get bigger that this understandable urge, this North Star, becomes an absolute siren song.

Blast radius is how we prevent ourselves from smashing into the rocks surrounding the isle of ‘Anthemoessa’. A good sign that you have not considered blast radius in your root module design is any of the following:

  1. Terraform Apply takes 30+ minutes to run
  2. The TFVARS file for a single environment is 1000s of lines long
  3. The Terraform State file is larger than (or approaching) 100 MB
  4. People on your team are culturally transforming into Hobbits and suddenly observing seven meals a day to keep themselves busy during Terraform Apply (Breakfast, Second Breakfast, Elevenses, Luncheon, Afternoon Tea, etc.)
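
If you want to check the first three signs without waiting for the Hobbit stage, here is a minimal sketch. The function name, the thresholds as hard limits, and the idea of passing in a measured apply time are all assumptions for illustration. It also assumes you have a local copy of the state file (for a remote backend, something like `terraform state pull > terraform.tfstate` would produce one).

```python
import os

# Thresholds taken from the warning signs above; tune them for your team.
MAX_APPLY_MINUTES = 30
MAX_TFVARS_LINES = 1000
MAX_STATE_BYTES = 100 * 1024 * 1024  # 100 MB


def mega_module_warnings(tfvars_path, state_path, apply_minutes):
    """Return a list of warning strings for one root module (hypothetical helper)."""
    warnings = []

    # Sign 2: a tfvars file for one environment that runs to 1000s of lines.
    with open(tfvars_path, encoding="utf-8") as handle:
        line_count = sum(1 for _ in handle)
    if line_count >= MAX_TFVARS_LINES:
        warnings.append(f"tfvars file is {line_count} lines long")

    # Sign 3: a state file at or beyond 100 MB.
    state_bytes = os.path.getsize(state_path)
    if state_bytes >= MAX_STATE_BYTES:
        warnings.append(f"state file is {state_bytes // (1024 * 1024)} MB")

    # Sign 1: apply time, measured by you or your pipeline and passed in.
    if apply_minutes >= MAX_APPLY_MINUTES:
        warnings.append(f"apply takes {apply_minutes} minutes")

    return warnings
```

Calling `mega_module_warnings("prod.tfvars", "terraform.tfstate", apply_minutes=45)` returns one message per sign that trips. Since the list above says "or approaching", you may prefer to set the limits a bit lower than the numbers shown.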

Blast radius is a design guideline for keeping modules “right-sized”. The goal is not small; the goal is Goldilocks. To achieve this, you need to start being thoughtful about whether a piece of infrastructure “belongs” with other pieces of infrastructure. This decision could be influenced by a number of factors, such as hard dependency (there is a technical relationship between them), functional dependency (they support the same app or service), organizational responsibility (who owns it), risk (what happens if this thing gets borked?), time to live (how quickly can we kill this thing and bring it back if we need to?), etc.
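
One way to make the “belongs with” question concrete is to write a few of these factors down for each resource and see which ones cluster. The sketch below is only a first pass under my own assumptions: the `Resource` fields, the grouping key, and the sample inventory are hypothetical, and hard dependencies and time to live still need a human to weigh in.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    address: str     # Terraform resource address
    service: str     # functional dependency: the app or service it supports
    owner: str       # organizational responsibility: the team that owns it
    high_risk: bool  # risk: would losing it be painful or slow to recover from?


def propose_root_modules(resources):
    """Group resources that share a service, an owner, and a risk level."""
    groups = defaultdict(list)
    for resource in resources:
        key = (resource.service, resource.owner, resource.high_risk)
        groups[key].append(resource.address)
    return dict(groups)


# Hypothetical inventory, for illustration only.
inventory = [
    Resource("azurerm_virtual_network.hub", "network", "platform", True),
    Resource("azurerm_subnet.apps", "network", "platform", True),
    Resource("azurerm_linux_web_app.shop", "shop", "storefront", False),
    Resource("azurerm_mssql_database.shop", "shop", "storefront", True),
]

for key, addresses in propose_root_modules(inventory).items():
    print(key, addresses)
```

Here the two networking resources land together, while the web app and its database are split apart because they carry different risk. Whether that split is right is exactly the kind of judgment call the factors above are meant to prompt.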

Start doing this now, for every new resource your team is thinking of adding to ANY Terraform deployment you have. This will stop the bleeding.

Now for the surgery.

You’re gonna have to take a hard HARD look at this big mamma module you have. Whiteboard this sucker out completely. This will force you to put related things next to each other and connect the dots. Make sure the picture is comprehensible to normal humans. Have somebody come in and look at it with fresh eyes, and attempt to explain it to them. This will start rattling those decisions around in your brain about whether stuff is related or not, using the many reasons I described above. Once you’re done, start carving it up like the diagram of a bovine at a steakhouse.

Each area you carve out is going to be a new root module. Give it a name and make sure you articulate this new root module’s responsibility. Each should have an “ethos”. What does it do? How is its job different from the other root modules’ jobs?

Once you like your plan, make preparations to refactor the code. Start by carving out the less risky parts first, then rinse and repeat. The big mamma module will gradually become less and less of a pig every time you branch off a new root module.

To make this process easier, consider using an automation tool to generate the removed and import blocks, as writing them by hand can be quite tedious. A removed block with destroy set to false tells the old root module to forget a resource without destroying it, and an import block tells the new root module to adopt it. The good news is, you don’t need to tackle everything at once. You can break it down into manageable chunks, and each piece you refactor will immediately reduce the strain, albeit incrementally.
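
To give a feel for what such a tool does, here is a minimal sketch that renders both kinds of block from a mapping of resource address to cloud resource ID. The function names and the sample resource are hypothetical, and it assumes a Terraform version that supports both block types (import blocks arrived in 1.5, removed blocks in 1.7). It only handles plain addresses: a removed block’s `from` cannot carry an instance key such as `[0]`.

```python
def removed_block(address):
    """HCL for the OLD root module: forget the resource, keep it running."""
    return (
        "removed {\n"
        f"  from = {address}\n"
        "\n"
        "  lifecycle {\n"
        "    destroy = false\n"
        "  }\n"
        "}\n"
    )


def import_block(address, resource_id):
    """HCL for the NEW root module: adopt the existing resource into state."""
    return (
        "import {\n"
        f"  to = {address}\n"
        f'  id = "{resource_id}"\n'
        "}\n"
    )


def generate_move(resources):
    """Return (old_module_hcl, new_module_hcl) for an {address: id} mapping."""
    old_hcl = "\n".join(removed_block(address) for address in resources)
    new_hcl = "\n".join(
        import_block(address, resource_id)
        for address, resource_id in resources.items()
    )
    return old_hcl, new_hcl


# Hypothetical resource being carved out of the big module.
to_move = {
    "azurerm_resource_group.main": (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/example-rg"
    ),
}

old_hcl, new_hcl = generate_move(to_move)
print(old_hcl)
print(new_hcl)
```

The first string goes into the old root module in place of the resource blocks you delete, and the second goes into the new root module alongside the resource blocks you move over. Run a plan on both sides before applying: the old module should report that it will forget the resource rather than destroy it, and the new one should show an import with no unexpected changes.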

In conclusion, while the temptation to create all-encompassing Terraform modules is understandable, the hidden costs of these Mega-Modules in terms of apply time, maintenance complexity, and operational risk are too great to ignore. By recognizing the signs of an overgrown module and taking deliberate steps to break it down, you can streamline your deployments, reduce blast radius, and set yourself up for more manageable infrastructure as your projects evolve. Refactoring into right-sized, focused modules isn’t just a one-time fix; it’s a practice that ensures your infrastructure remains scalable and maintainable for the long term.

Take a moment to review your largest Terraform module today. Does it fall into any of the traps discussed here? Think through what “surgery” might be needed. Start by identifying which pieces should be split off, and begin breaking them out one at a time. Trust me, your future self (and your team) will thank you.

Happy Azure Terraforming!