Clone the companion project to follow along…
In a past series we used Terraform to provision public and private subnets in a custom VPC within a configurable AWS region. In AWS terms, the difference between “public” and “private” subnets is simply that public subnets are connected to the Internet. That connection is made through an Internet Gateway (IGW). Along with provisioning the IGW itself, we add Route Tables and Route Table Associations to tie our public subnets to an IGW.
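For reference, the public side of that plumbing looks roughly like the sketch below. The names (aws_vpc.vpc, aws_subnet.public_subnets, var.env_name) match the resources referenced later in this post, but the companion project's exact arguments may differ:

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.vpc.id

  tags = {
    "Name" = "${var.env_name}-igw"
  }
}

resource "aws_route_table" "public_route" {
  vpc_id = aws_vpc.vpc.id

  # Send all non-local traffic out through the IGW
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public_rta" {
  count          = length(aws_subnet.public_subnets)
  subnet_id      = aws_subnet.public_subnets[count.index].id
  route_table_id = aws_route_table.public_route.id
}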
In our simple N-tier project, we used public subnets to house an ALB which directed HTTP traffic to EC2 instances within our private subnets. To keep things simple, we started a BusyBox process via custom User Data that listened on port 80 and served our static content.
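The User Data for that approach was only a few lines. A simplified sketch (not the exact script from the earlier posts) looked something like this, using the same web_port and web_message template variables we pass in below:

#!/bin/bash
# Simplified sketch of the earlier BusyBox approach: write out the static
# page, then serve it in the background with busybox httpd on the web port.
mkdir -p /var/www
echo "<h1>${web_message}</h1>" > /var/www/index.html
nohup busybox httpd -f -p "${web_port}" -h /var/www &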
To take a small step toward making this more real-world, let’s use NGINX instead… Aside from giving us a capable and highly performant HTTP server and proxy, this presents an opportunity to learn about more AWS network plumbing. Specifically, we’ll need to configure NAT gateways (NGWs) so our private subnets can reach the Internet to run APT commands.
Much like an IGW, putting a NGW to good use requires a few parts carefully wired together. As it turns out, we’ll usually want multiple NGWs – one per AZ. You can route all traffic through a single NGW, but if the AZ hosting that NGW goes down, all traffic can be impacted. By placing a NGW in each AZ hosting private subnets, we not only provide better HA but also mitigate key scaling factors (bandwidth and session count) by better distributing traffic.
We also need an Elastic IP (EIP) for each NGW (the addresses through which we’ll masquerade), and route tables that let our private subnets know how to get to the Internet using our shiny new NGWs. Last but not least, we’ll want to tweak userdata.sh to use NGINX instead of BusyBox…which will be cleaner and set us up for future improvements!
Now that we know what’s needed, time to code…
resource "aws_nat_gateway" "ngw" {
count = length(data.aws_availability_zones.available.names)
subnet_id = element(aws_subnet.public_subnets[*].id, count.index)
allocation_id = element(aws_eip.nat_eip[*].id, count.index)
depends_on = [aws_internet_gateway.igw]
tags = {
"Name" = "${var.env_name}-ngw-${count.index}"
}
}
resource "aws_eip" "nat_eip" {
count = length(data.aws_availability_zones.available.names)
vpc = true
depends_on = [aws_internet_gateway.igw]
tags = {
"Name" = "${var.env_name}-nat-eip-${count.index}"
}
}
resource "aws_route_table" "private_route" {
count = length(data.aws_availability_zones.available.names)
vpc_id = aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = element(aws_nat_gateway.ngw[*].id, count.index)
}
tags = {
"Name" = "${var.env_name}-private-route-${count.index}"
}
}
resource "aws_route_table_association" "private_rta" {
count = length(data.aws_availability_zones.available.names)
subnet_id = element(aws_subnet.private_subnets[*].id, count.index)
route_table_id = element(aws_route_table.private_route[*].id, count.index)
}
Wiring up a NGW is very similar to wiring up an IGW… If you’ve read through the original series, the tricks we use to spin up a NGW in our public subnets in each AZ will be familiar. What’s new here is the explicit dependency between our NGWs and our IGW, a best practice called out in the Terraform documentation. We do the same for our EIPs, and specify that they are to be placed in a VPC.
The last two steps connect our private subnets to the appropriate NGWs using new route tables, and are almost identical to the earlier steps that wired our public subnets to the IGW. One subtle gotcha to watch for: any route table directing traffic to a NGW must use nat_gateway_id rather than gateway_id. Terraform will plan and apply either way, but in the latter case the configuration will never fully converge. If you’re working on something similar and keep noticing route table changes, this is a likely culprit.
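To make that concrete, here is the same default route written both ways; the commented-out form will apply, but leaves a perpetual diff:

route {
  cidr_block = "0.0.0.0/0"

  # gateway_id   = element(aws_nat_gateway.ngw[*].id, count.index)  # applies, but never converges
  nat_gateway_id = element(aws_nat_gateway.ngw[*].id, count.index)  # converges cleanly
}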
Recall that we had already configured a Launch Configuration which injected our custom user data into EC2 instances:
resource "aws_launch_configuration" "lc" {
# avoid static name so resource can be updated
name_prefix = "${var.env_name}-lc-"
image_id = data.aws_ami.ubuntu.id
instance_type = var.web_instance_type
security_groups = [aws_security_group.http_ingress_instance.id]
user_data = templatefile("userdata.sh", {
web_port = var.web_port,
web_message = var.web_message,
db_endpoint = aws_db_instance.rds.endpoint,
db_name = aws_db_instance.rds.name,
db_username = aws_db_instance.rds.username,
db_status = aws_db_instance.rds.status
})
}
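That name_prefix comment deserves a quick aside: launch configurations are immutable, so changing userdata.sh forces a replacement. The usual companion to name_prefix is a create_before_destroy lifecycle rule, so the new launch configuration exists before the one still referenced by the Auto Scaling Group is destroyed. A hedged sketch of where that block goes (the companion project may already include it):

resource "aws_launch_configuration" "lc" {
  # Sketch only: the remaining arguments are the same as above.
  name_prefix   = "${var.env_name}-lc-"
  image_id      = data.aws_ami.ubuntu.id
  instance_type = var.web_instance_type

  lifecycle {
    # Create the replacement LC first, then remove the old one once
    # nothing references it; this is why a fixed name won't work.
    create_before_destroy = true
  }
}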
To get NGINX up and running, we simply adjust userdata.sh as needed:
#!/bin/bash
DEBIAN_FRONTEND=noninteractive apt update
DEBIAN_FRONTEND=noninteractive apt install nginx -y

cat >/var/www/html/index.html <<EOF
<html>
  <head>
    <title>Success!</title>
  </head>
  <body>
    <h1>${web_message}</h1>
    <ul>
      <li><b>RDS endpoint:</b> ${db_endpoint}</li>
      <li><b>Database name:</b> ${db_name}</li>
      <li><b>Database user:</b> ${db_username}</li>
      <li><b>Database password:</b> Yeah right! :-)</li>
      <li><b>Database status:</b> ${db_status}</li>
    </ul>
    <pre>
$(systemctl status nginx)
    </pre>
  </body>
</html>
EOF
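With that in place, a quick smoke test from outside the VPC confirms NGINX is answering behind the ALB. This assumes the project exposes the load balancer's DNS name as an output; the output name below is hypothetical:

# Apply the changes, then hit the ALB; the page is now served by NGINX.
terraform apply
curl -s "http://$(terraform output -raw alb_dns_name)" | grep -i "nginx"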
Now we are starting to get a more respectable web cluster, with the ability to install custom packages or other updates on instances living in our private subnets. NAT alone is not a security mechanism, but when combined with our Security Groups this is shaping up.
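For context, the instance-level Security Group referenced by the launch configuration could look something like the sketch below: ingress only from the ALB's security group, plus the outbound access the instances need to reach package mirrors through the NGWs. Here aws_security_group.alb is a hypothetical name, and the companion project's actual rules may differ:

resource "aws_security_group" "http_ingress_instance" {
  # Hedged sketch; see the companion project for the real rules.
  vpc_id = aws_vpc.vpc.id

  ingress {
    description     = "App traffic from the ALB only"
    from_port       = var.web_port
    to_port         = var.web_port
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id] # hypothetical ALB SG
  }

  egress {
    description = "HTTP out for apt, via the NAT gateways"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "HTTPS out for apt, via the NAT gateways"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}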
The next goal will be leveraging ACM and Route53 to get our ALB accepting TLS traffic on port 443 and using a friendly DNS name. Be sure to check back for the continued evolution of our Terraforming AWS experiment!