Clone and follow along in the project repository…
If you’re part of the masses using Terraform (or OpenTofu), you’ve likely heard of Terratest. In the project’s own words, Terratest is a Go library that provides patterns and helper functions for testing infrastructure. Terratest can seem daunting since it requires writing Go. The good news is that you only need a very small part of Go’s surface area, though you have the option of using it to its full potential if you need to. If you’ve wanted to get started with Terratest but haven’t had time, read on for a quick and painless intro!
Terratest leverages Go’s native testing library. The first thing that means is you need Go installed and properly configured. Luckily, installing Go has gotten easier over the years. Instead of downloading releases and following manual instructions (you can still do that if you prefer), it’s likely just one command away: brew install go. Aside from the installation, you’ll want to create a workspace (traditionally ${HOME}/go) and add some new environment variables to your shell profile. Here’s mine:
❯ grep GO .zshrc
export GOPATH="${HOME}/go"
export GOBIN="${GOPATH}/bin"
export GOROOT="/usr/lib/go"
export PATH="${GOBIN}:${GOROOT}/bin:${PATH}"
test -d "${GOPATH}" || mkdir "${GOPATH}"
test -d "${GOPATH}/src/github.com" || mkdir -p "${GOPATH}/src/github.com"
For this project we’ll use the organization scheme below; feel free to experiment and pick what works best for you. There are a few common patterns you’ll see when browsing community modules… tests are placed in ${PROJECT_ROOT}/test/src, follow a test_name_test.go convention, and run against specific configurations in ${PROJECT_ROOT}/examples:
.
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── build
│   ├── Dockerfile
│   └── src
├── environments
│   └── dev
│       └── Makefile
├── examples
├── modules
│   ├── network
│   └── web
└── test
    ├── Makefile
    └── src
        ├── example_complete_test.go
        └── etc.
We want to focus on Terratest vs Terraform, so our contrived example takes a number of shortcuts such as using the default VPC in your AWS account and Amazon’s DNS. To keep it slightly more realistic, we’ll break out a couple modules. You might use a community module to handle the network bits, an internal module from your network team that exposes custom network details, or add additional services to meet your requirements… This all plugs nicely into our example hierarchy. You can represent high level functionality as modules, which in turn may consume other modules for the heavy lifting, then customize how it all gets stitched together for each target environment.
One last aside before we jump in… The sample project is configured to use aws-vault, which is particularly useful when working across a lot of different accounts. It manages AWS-related environment details for you so you can focus on getting work done. It keeps credentials locked away in your OS' keychain and uses temporary credentials to access your infrastructure – a security win even if you only use a single account. You don’t have to use aws-vault to use Terratest, but the scripts we’ll be using assume it’s in place. Feel free to refactor to meet your personal tastes, or take a few minutes to get aws-vault installed before jumping into the examples.
Before we can test, we need a few lines of boilerplate to load any required modules and configure Terratest. Luckily once you work it out for one project it’s easy to turn into a template:
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestExamplesComplete(t *testing.T) {
	t.Parallel()

	terraformOptions := &terraform.Options{
		TerraformDir: "../../examples/complete",
		Upgrade:      true,
		VarFiles:     []string{"fixtures.us-east-2.tfvars"},
	}

	// Tear everything down when the test finishes, pass or fail.
	defer terraform.Destroy(t, terraformOptions)

	// Run terraform init followed by terraform apply.
	terraform.InitAndApply(t, terraformOptions)

	// ... assertions go here (they use the assert import above)
}
Note how the terraform import is under a terratest/modules path. We’ll see examples of using modules below, but it’s a hint at Terratest’s modular approach (check out the full list in their repo, or the module documentation). Aside from the expected Terraform coverage, there are modules for your IaaS of choice, ways to validate common DevOps tooling (Docker, k8s), as well as http and shell modules which provide a lot of flexibility.
Technically you don’t need an assertion framework, but we follow the docs by pulling in testify. This makes tests a lot easier to read and write. You can easily swap this out if you have a preferred framework, or drop it entirely by using native comparison operations.
Next, we configure terraformOptions, providing the path to the code to test and passing any var files to be used (relative to TerraformDir). Lastly, in typical test fashion, we defer a cleanup operation to ensure we don’t leave artifacts around (more on this below), and use InitAndApply to run terraform init and terraform apply as part of each test (there are variations such as InitAndPlan, Init, Plan and Apply).
In our simple example we’ll just build one “complete” test covering all functionality. Typically you would have multiple tests, with fixtures customized accordingly to cover common use cases.
In our contrived example, we have a network module that discovers the default VPC and associated subnets. In the real world you might have complicated infrastructure you manage or shared infrastructure from another team that you simply consume. You need to utilize network components like VPCs and subnets to get your service deployed. Wouldn’t it be nice to confirm your sensitive production service deploys to the desired network?
Let’s write the simplest test we can… Since the example code selects the default VPC in an attempt to run anywhere, we’ll start with a test that confirms the returned VPC ID starts with vpc-. This is easy to extend. For example, you could ensure target VPCs have specific tags.
In classic red/green/refactor style, let’s write a test we know will fail:
// ...
vpcID := terraform.Output(t, terraformOptions, "vpc_id")
assert.Equal(t, "vpc-foobah", vpcID)
As expected, running our test returns:
TestExamplesComplete: examples_complete_test.go:37:
Error Trace: examples_complete_test.go:37
Error: Not equal:
expected: "vpc-foobah"
actual : "vpc-7a5ce123"
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-vpc-foobah
+vpc-7a5ce123
Test: TestExamplesComplete
--- FAIL: TestExamplesComplete (293.52s)
FAIL
exit status 1
FAIL test 293.834s
It’s always nice to confirm things fail when expected… Let’s fix that:
import (
"strings"
// ...
)
// ...
vpcID := terraform.Output(t, terraformOptions, "vpc_id")
assert.True(t, strings.HasPrefix(vpcID, "vpc-"))
Note how we used the standard strings package to extend our test. This is generic enough it should match any returned VPC ID. Does it?
--- PASS: TestExamplesComplete (291.19s)
PASS
ok test 291.875s
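A prefix check is about as loose as it gets (the bare string vpc- would pass). If you want something stricter, here is a minimal sketch using the standard regexp package. It assumes VPC IDs are vpc- followed by either 8 or 17 lowercase hex characters, which matches the IDs I have seen but is worth verifying against your own account. The isVPCID helper is my own name, not a Terratest function.

```go
package main

import (
	"fmt"
	"regexp"
)

// vpcIDPattern encodes the assumption stated above: "vpc-" followed by
// 8 (older, short IDs) or 17 (newer, long IDs) lowercase hex characters.
var vpcIDPattern = regexp.MustCompile(`^vpc-([0-9a-f]{8}|[0-9a-f]{17})$`)

// isVPCID reports whether s looks like a VPC ID under that assumption.
func isVPCID(s string) bool {
	return vpcIDPattern.MatchString(s)
}

func main() {
	fmt.Println(isVPCID("vpc-7a5ce123")) // prints true
	fmt.Println(isVPCID("vpc-foobah"))   // prints false
	fmt.Println(isVPCID("vpc-"))         // prints false, although HasPrefix accepts it
}
```

In the test this would replace the HasPrefix line with assert.True(t, isVPCID(vpcID)).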
Awesome, now we have confidence things work as expected. Just to make this a bit more interesting, let’s ensure the list of availability zones output by the network module matches the region specified in our fixtures:
// ...
availabilityZones := terraform.OutputList(t, terraformOptions, "availability_zones")
for _, az := range availabilityZones {
assert.True(t, strings.HasPrefix(az, "us-east-2"))
}
Ideally we’ve added coverage, and everything still passes:
--- PASS: TestExamplesComplete (308.07s)
PASS
ok test 308.782s
One gotcha to be aware of: there is a lot of output when tests are running. I’ve purposefully zeroed in on the more informational parts. One section shows the outputs from the terraform run. Here’s an example:
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: Apply complete! Resources: 11 added, 0 changed, 0 destroyed.
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: Outputs:
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: availability_zones = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "us-east-2a",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "us-east-2b",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "us-east-2c",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: cloudwatch_log_group = /ecs/terratest-experiment-dev
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: dns_name = terratest-experiment-dev-2000011916.us-east-2.elb.amazonaws.com
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ecr_repository_url = 012345678901.dkr.ecr.us-east-2.amazonaws.com/terratest-experiment-dev
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: subnet_cidrs = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "172.31.32.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "172.31.0.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "172.31.16.0/20",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: subnet_ids = [
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "subnet-abcdef",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "subnet-ghjkil",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: "subnet-mnopqr",
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: ]
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: vpc_cidr = 172.31.0.0/16
TestExamplesComplete 2020-07-26T17:38:06-04:00 logger.go:66: vpc_id = vpc-7a5ce123
When I first started writing tests that compare outputs, I naively reached for terraform.Output and tried treating things like subnet_cidrs as lists. Output returns a string. Be aware of Terratest’s other output functions such as OutputList (seen above) and OutputMap. These return native slices and maps, so you can range over them or index into them without cryptic errors.
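Once an output arrives as a real slice, ordinary Go takes over. As an illustration, here is a sketch that checks every subnet CIDR sits inside the VPC CIDR using the standard net package. The subnetsWithinVPC helper is a name I made up for this example. The values in main are copied from the sample run above, where in a real test they would come from terraform.OutputList(t, terraformOptions, "subnet_cidrs") and terraform.Output(t, terraformOptions, "vpc_cidr").

```go
package main

import (
	"fmt"
	"net"
)

// subnetsWithinVPC returns an error for the first subnet CIDR that is
// malformed or does not fit inside the VPC CIDR, and nil otherwise.
func subnetsWithinVPC(vpcCIDR string, subnetCIDRs []string) error {
	_, vpcNet, err := net.ParseCIDR(vpcCIDR)
	if err != nil {
		return fmt.Errorf("parse VPC CIDR: %w", err)
	}
	vpcOnes, _ := vpcNet.Mask.Size()

	for _, cidr := range subnetCIDRs {
		_, subnet, err := net.ParseCIDR(cidr)
		if err != nil {
			return fmt.Errorf("parse subnet CIDR: %w", err)
		}
		subnetOnes, _ := subnet.Mask.Size()
		// A subnet fits when it starts inside the VPC range and its
		// prefix is at least as long as the VPC's (so it is no larger).
		if !vpcNet.Contains(subnet.IP) || subnetOnes < vpcOnes {
			return fmt.Errorf("subnet %s is outside VPC %s", cidr, vpcCIDR)
		}
	}
	return nil
}

func main() {
	// Values copied from the sample run's outputs.
	subnets := []string{"172.31.32.0/20", "172.31.0.0/20", "172.31.16.0/20"}
	fmt.Println(subnetsWithinVPC("172.31.0.0/16", subnets)) // prints <nil>
}
```

In a test, assert.NoError(t, subnetsWithinVPC(vpcCIDR, subnetCIDRs)) keeps the assertion to one line while still naming the offending subnet on failure.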
One last thing… It takes a while, huh? That’s why I consider Terratest an “e-2-e” or “integration” test tool (vs unit tests that typically only take seconds to run). You can help this a bit by testing with intent (write efficient tests, do not strive for 100% coverage), but in the end running automation and confirming the actions it takes requires time.
Watch this video for a great overview of how to test infrastructure code.
Technically, you’ve already used modules… terraform is a module like all the others. Let’s extend this a bit by roping in the aws module to do some IaaS-specific probing:
import (
"github.com/gruntwork-io/terratest/modules/aws"
// ...
)
// ...
deploymentSubnets := terraform.OutputList(t, terraformOptions, "deployment_subnets")
for _, s := range deploymentSubnets {
assert.True(t, aws.IsPublicSubnet(t, s, "us-east-2"))
}
Our network module just exposes the default subnets provided by AWS. In your case you likely have public and private subnets. Something like a database cluster is usually on a set of private subnets. Just aws.IsPrivateSubnet, right? Good guess, but this is where browsing the module documentation pays off. As it turns out, there is no IsPrivateSubnet function, but that’s easy to work around by inverting our assertion:
privateSubnets := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
for _, s := range privateSubnets {
assert.False(t, aws.IsPublicSubnet(t, s, "us-east-2"))
}
I’m sure you’ve noticed I kept showing tests, but not how I ran them… Since there’s a bit of setup and you’ll likely be running tests a lot (including in pipelines), I prefer a simple “test harness” to add consistency and reduce typing. One way is using a Makefile:
AWS_PROFILE := personal
REGION := us-east-2
VAULT_CMD := aws-vault exec $(AWS_PROFILE) --
# TF_CMD := $$GOPATH/bin/terraform
TF_CMD := terraform

export TF_DATA_DIR ?= $(CURDIR)/.terraform
export TF_CLI_ARGS_init ?= -get-plugins=true

init:
	cd src && go mod init test

tidy:
	cd src && go mod tidy

test:
	$(TF_CMD) fmt --write=false -check -diff -recursive ..
	cd src && $(VAULT_CMD) go test -v -timeout 30m

clean:
	cd src && rm -rf $(TF_DATA_DIR) go.mod go.sum
Another thing to think about if running a lot of parallel tests or using Terratest in a large organization is how to safely orchestrate tests at scale…
Since tests are running Terraform and creating real resources, it’s possible to have name collisions if teams are testing in the same account. Using a module to define a consistent naming convention can help.
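Besides a naming module, one common trick is to give every test run its own random suffix. Here is a minimal standard-library sketch. The uniqueName helper and the terratest-experiment prefix are just illustrations of mine. Terratest also ships a random module with a UniqueId helper that serves the same purpose, so check the module documentation before rolling your own.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueName appends a random 6-character hex suffix to prefix so that
// parallel test runs in the same account do not fight over resource names.
func uniqueName(prefix string) (string, error) {
	buf := make([]byte, 3) // 3 random bytes encode to 6 hex characters
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := uniqueName("terratest-experiment")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // the suffix differs on every run
}
```

The generated name can then be handed to Terraform through the Vars field of terraform.Options, assuming your module exposes a variable for it.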
In larger teams, or anywhere there are “hotspots” with multiple engineers working on similar areas of your infrastructure, when pipelines randomly fire you could end up munging state. Use a module to bolt state locking on yourself, or let a tool like terragrunt handle it for you.
Often when testing you leave artifacts behind or get something in a bad state. How will you deal with these resources? One way is dedicated test accounts. Disposable environments provide full isolation. While you may ultimately run your tests in production to validate deployments, by then you will have more confidence everything works as expected. Another option is tagging resources according to a schema that allows automated cleanup. Combining these is even better, using something like aws-nuke to auto-wipe all resources in ephemeral accounts on a schedule.
If your project is more than a hobby, it’s worth testing… While wasteful tests are to be avoided, testing with intent is essential to ensure quality. Luckily, it’s easy to get up and running with Terratest. You don’t need to reinvent the wheel: you can use a veritable Swiss Army Knife to cover your infrastructure code and leverage the flexibility of modules to creatively extend validation beyond Terraform itself.
Best of all, Terratest is open source. Whether you want to help the community by submitting PRs or read the code to understand how it works, the code is there for you to browse and extend.