Applying DI to IaC (Terraform edition)

Use Dependency Injection principles in Terraform modules to improve code organization, reusability, and maintainability by making dependencies explicit and centralizing shared configurations.

Background

Terraform and its configuration language, HCL, are one of the options available for defining Infrastructure as Code. In my experience, starting with Terraform typically involves setting up a root module with a main configuration file and other conventional files (like variables.tf, outputs.tf, provider.tf, …) and iteratively developing the code and the infrastructure (hopefully in an IaaS organization, account or environment separate from production).

When the code in the root module becomes too much or too repetitive, the usual approach is to stow away some of it in a child module, drilling down external configuration as needed. This can happen multiple times, vertically (nested child modules) or horizontally (additional child modules).

Although this approach helps reduce cognitive load and code repetition, it has the disadvantage of hiding the shared configuration and magic strings that are typical when defining infrastructure. In addition, Terraform and HCL do not provide a way for global values to be reused across modules. Another issue is that it hides the dependencies between modules.

A structured approach for defining Terraform modules

Even if HCL is not a typical programming language, as it is declarative and not Turing complete, we can still use Dependency Injection to make the dependencies between modules evident at the top level, and to separate the concern of defining modules from that of using them. Following the definitions in Martin Fowler’s article Inversion of Control Containers and the Dependency Injection pattern, the form of dependency injection that fits HCL’s declarative style is Constructor Injection: a module receives everything it depends on through its input variables when it is instantiated.

An example

Let’s define the infrastructure on AWS to run a few Lambda functions behind an API Gateway, with read and write access to a DynamoDB table and access to an S3 bucket. This example is taken from the Infra as Code Learning Journey Thoughtworks Training.

Applying Dependency Injection requires clearly identifying and defining the modules, their inputs and their outputs.

  • S3 Website: it creates a bucket and policy to read from it
    • Inputs: prefix from which the bucket name is derived, files to upload.
    • Outputs: the ARN of the policy.
  • User DB: creates the database and the policies with permission to Read, Write and Scan items.
    • Inputs: table name
    • Outputs: ARN of created policies
  • Lambda functions: it deploys the lambdas starting from the source code, creating logs, and assumed roles
    • Inputs: source file name (as handlers map key), policies to bind to lambda roles, environment variables.
    • Outputs: functions ARN.
  • API Gateway:
    • Inputs: Routing that matches HTTP Method and path to lambdas

The root module then has the responsibility of orchestrating them all: instantiating the child modules and passing them top-level properties, constants or outputs from other modules.

In the following block of code, notice that the dependencies between modules are evident and brought to the top level, that the detailed configuration of the resources is encapsulated in the modules as an implementation detail, and that magic strings that need to match (e.g. verify_user) sit in the same location instead of being spread across the codebase.

# ./main.tf (extract)

module "static_content" {
  source = "./static_content"
  prefix = var.prefix
  files_to_upload = fileset(path.cwd, "src/static/*.html")
}

module "users_db" {
  source = "./users_db"
  table_name = local.users_table
}

module "handlers" {
  source = "./handlers"
  handlers = {
    …
    verify_user = {
      policies = [
        module.static_content.read_files_policy_arn,
        module.users_db.read_item_policy_arn,
      ]
      environment_variables = {
        WEBSITE_S3    = module.static_content.bucket
        DB_TABLE_NAME = local.users_table
      }
    }
  }
}

module "api" {
  source = "./api"
  routes = {
    verify_user = {
      method        = "GET"
      path          = "/"
      functions_arn = module.handlers.functions_arn["verify_user"]
    }
    …
  }
}

Notice how the outputs of one module, e.g. module.static_content.bucket, flow as inputs into other modules, e.g. handlers.environment_variables.WEBSITE_S3.
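The extract refers to var.prefix and local.users_table without showing where they come from. A minimal sketch of how they might be declared in the root module follows; the file names and the naming scheme for the table are assumptions, not part of the original example.

```hcl
# ./variables.tf (sketch)
variable "prefix" {
  description = "Prefix from which resource names, such as the bucket name, are derived."
  type        = string
}

# ./locals.tf (sketch; file name and naming scheme are assumptions)
locals {
  # Single definition of the table name, injected wherever it is needed.
  users_table = "${var.prefix}-users"
}
```

Because the table name is defined once at the top level, the value passed to the users_db module and the one passed to the handlers as DB_TABLE_NAME cannot drift apart.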

Unlike typical dependency injection, what gets injected is not the resource or the module itself but a string reference to a resource created by a module (e.g. an ARN, Amazon Resource Name, when working with AWS). Modules and Terraform resources are a reflection of the actual resources created in the cloud provider or in any other service managed through Terraform, and the Terraform state is what ties Terraform resources to real-world objects.
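On the receiving side, a module consumes those injected strings like any other input. The following is a rough sketch of how the handlers module might bind the injected policy ARNs to the Lambda roles; the variable shape mirrors the root module extract, while the resource names, the role naming and the flattening approach are assumptions.

```hcl
# ./handlers/variables.tf (sketch)
variable "handlers" {
  description = "Handlers to deploy, keyed by source file name."
  type = map(object({
    policies              = list(string) # injected policy ARNs
    environment_variables = map(string)
  }))
}

# ./handlers/roles.tf (sketch; resource names are assumptions)
resource "aws_iam_role" "handler" {
  for_each = var.handlers
  name     = "${each.key}-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

locals {
  # One entry per (handler, policy) pair. Keys use the list index so that
  # they are known at plan time even when the ARNs themselves are not.
  policy_attachments = merge([
    for name, handler in var.handlers : {
      for idx, arn in handler.policies :
      "${name}-${idx}" => { handler = name, policy_arn = arn }
    }
  ]...)
}

resource "aws_iam_role_policy_attachment" "handler" {
  for_each   = local.policy_attachments
  role       = aws_iam_role.handler[each.value.handler].name
  policy_arn = each.value.policy_arn
}
```

The handlers module never needs to know which module created a policy; it only receives the ARN.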

Conclusion

By embracing dependency injection principles within your Terraform code, you gain several key advantages. You enhance code reusability, improve maintainability, and facilitate easier testing (yes, Terraform code and infrastructure creation can and should be tested in automation). By separating concerns and promoting modularity, you create a more robust and scalable infrastructure.
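As an illustration of the testing point, here is a minimal sketch of a test for the User DB module using the native test framework (terraform test, available in recent Terraform versions, with provider mocking added shortly after). It assumes the module takes a table_name input and declares a resource named aws_dynamodb_table.users; the resource name, the file path and the test values are hypothetical.

```hcl
# ./users_db/tests/table_name.tftest.hcl (sketch; path is an assumption)

# Replace the real AWS provider with a mock so no credentials are needed.
mock_provider "aws" {}

run "table_name_is_injected" {
  command = plan

  variables {
    table_name = "users-test"
  }

  assert {
    condition     = aws_dynamodb_table.users.name == "users-test"
    error_message = "The table name should be the one injected by the caller."
  }
}
```

Since every dependency of the module arrives through its inputs, the test can exercise it in isolation by running terraform test from the module directory.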

Rules of Thumb

  • Centralise Shared Configuration: Keep your shared configuration at the top level of your module hierarchy. This ensures consistency and simplifies management of global settings.
  • Isolate Implementation Details: Detailed configurations and resource-specific settings should be encapsulated within submodules. This promotes abstraction and reduces code complexity.

A Word of Caution

While dependency injection offers significant benefits, it's crucial to avoid over-engineering. Not every resource necessitates its own dedicated submodule. Overly abstracting or creating excessive submodules can lead to increased complexity and hinder readability. Strive for a balance between modularity and simplicity.

By carefully applying these principles, you can leverage dependency injection to build more efficient, maintainable, and testable Terraform infrastructure.

Bonus: refactoring

To keep any project healthy and well structured, refactoring has to be part of the normal software development lifecycle.

Refactoring could require renaming Terraform modules or resources to better fit a new mental model or new requirements, or simply to fix past errors or simplify the code while iterating.

Renaming resources is complicated by the Terraform state, which holds a cached mapping between Terraform resources and the actual resources (VPC, lambda, …) and is required for Terraform to work correctly. HashiCorp provides an explanation of why the state needs to exist in its current form.

By default, Terraform would destroy and recreate a renamed resource, because the configuration no longer matches the deployed infrastructure: the new name is not mapped in the state to a real-world object (so Terraform creates one), and the old name is no longer in the configuration (so Terraform destroys the object mapped to it). Sometimes this behaviour is not desirable or not possible. Terraform provides ways to rename resources in the state: the moved block (from version 1.1) or a subcommand in the CLI; see https://developer.hashicorp.com/terraform/language/modules/develop/refactoring .

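Let’s assume that a resource was previously created as module.api.aws_apigatewayv2_integration.integrations["verify_user"] and that, after refactoring, the api module is now named api_gw and the key of the integration is now get_user. To rename the resource without recreating it, it is enough to add a moved block like the following to the source code.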

moved {
  from = module.api.aws_apigatewayv2_integration.integrations["verify_user"]
  to = module.api_gw.aws_apigatewayv2_integration.integrations["get_user"]
}

A block like this allows Terraform to update the reference in the state and keep the real-world resource, without destroying and recreating it. In a typical CI setup, the code change is committed to the repository and the renaming is applied when the pipeline runs, with the added benefit of a tracked history of renamed resources in the version control system.
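When a whole module call is renamed, a moved block can also target the module itself, and moves can be chained. The following sketch is an alternative to the single block above (not an addition to it), splitting the same refactoring into two steps; it assumes that every resource in the module should follow the rename.

```hcl
# Step 1: the module call was renamed from "api" to "api_gw";
# every resource inside it follows.
moved {
  from = module.api
  to   = module.api_gw
}

# Step 2: inside the renamed module, one for_each key changed.
moved {
  from = module.api_gw.aws_apigatewayv2_integration.integrations["verify_user"]
  to   = module.api_gw.aws_apigatewayv2_integration.integrations["get_user"]
}
```

Running a plan before applying is a good way to check that the affected resources are reported as moved rather than scheduled for destruction and creation.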

Notes

Everything said here about Terraform also applies to OpenTofu, a recent fork of Terraform.