Nov 30, 2017
NAT Gateways in Amazon GovCloud
So you’re deploying your government-sensitive data and services on GovCloud, or planning to, and you want your
data to be protected against third-party access, so you configure
your subnets as private resources, without internet access. In
other AWS regions, you could then add a managed NAT Gateway and,
once it is configured, instances would have egress to the
internet. This allows them to update their software and run
smoothly, pulling in the external information they need.
But GovCloud has no managed NAT
Gateways. Instead you must create NAT instances and manually wire
them to your network. This post will show you how to do that in an
easy way. If you have ever read one of our tutorial posts in the
GovCloud series, you can safely skip the “Pre-requisites” part, and
skim through “Initial setup”.
Pre-requisites
Terraform
AWS command-line tools — i.e. aws-cli
Amazon GovCloud credentials configured for aws-cli
Git for version control
Terraform is used to provision
cloud resources in a declarative manner, creating reproducible
environments between engineers. You can read more about how it and
other technologies help streamline operations productivity here. We
don’t provide an intro to Terraform here, but you should be able to
follow along even if you don’t know much about it, filling the gaps
with official documentation if needed.
We assume you will be using a
terminal emulator under a UNIX or UNIX-like operating system. Your
GovCloud account must have enough room in its EC2 quota for
at least one more instance. If you have never had to deal with quotas
before, you likely will not exceed your quota with the extra
instance and can ignore this point. In the examples, a line preceded by $ is meant to be typed and run from your terminal emulator.
Initial setup
We will create an empty environment for deployment, and add fpco-terraform-aws,
which includes the module you will need. FP Complete created and
open-sourced fpco-terraform-aws as a collection of modules that
ease managing resources in an AWS cloud. As you will see below, a
few of its modules are used, and they considerably shorten the
amount of code you need to build your environment. We will need a
git repository to bring in fpco-terraform-aws,
and it’s also good practice to always version your infrastructure.
This way, not only you but your team can collaborate in developing
and operating it:
In vpc/main.tf you will add provider information
for AWS and your usual credentials for SSH access. Other files
created include:
terraform.tfvars,
holding your credentials and other parameters, never added to
version control;
network.tf, where the VPC will reside;
nat.tf, where the NAT instance will reside;
variables.tf, where you will declare the variables we use in the examples below.
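As a rough idea of what vpc/main.tf can contain, here is a minimal
sketch written in Terraform 0.12+ syntax. The variable names region,
name and ssh_pubkey are hypothetical and would have to be declared in
variables.tf; the sketch also assumes your AWS credentials are picked
up from your aws-cli configuration rather than set in the provider
block.

```hcl
# vpc/main.tf (sketch). var.region, var.name and var.ssh_pubkey are
# hypothetical names, not taken from the original post.

provider "aws" {
  # GovCloud is a separate partition; "us-gov-west-1" is one of its regions.
  region = var.region
}

# Register an existing SSH public key so EC2 instances can be
# launched with it. var.ssh_pubkey is the path to the public key file.
resource "aws_key_pair" "main" {
  key_name   = var.name
  public_key = file(var.ssh_pubkey)
}
```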
Assume that if a variable is
mentioned but wasn’t declared in an example, you should declare it
yourself under variables.tf and provide a default that you intend to use.
Building a VPC
Next, we need a VPC where our
private subnets will live. However, the NAT instance that provides
internet access to the resources on the private subnets has to live on a
public subnet. For this reason, we will now create a public and
a private subnet. Add the following to vpc/network.tf:
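The sketch below shows one way to express this with plain AWS
provider resources instead of the fpco-terraform-aws modules, whose
inputs are not reproduced here. It is an illustration under
assumptions, not the post's code: the resource names (main, public,
private) and var.name are made up, and the syntax is Terraform 0.12+.

```hcl
# vpc/network.tf (sketch with plain resources, not the fpco modules)

resource "aws_vpc" "main" {
  cidr_block           = var.cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = var.name # hypothetical variable
  }
}

# One public subnet per entry in var.public_subnet_cidrs.
resource "aws_subnet" "public" {
  count                   = length(var.public_subnet_cidrs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.public_subnet_cidrs[count.index]
  map_public_ip_on_launch = true
}

# One private subnet per entry in var.private_subnet_cidrs.
resource "aws_subnet" "private" {
  count                   = length(var.private_subnet_cidrs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.private_subnet_cidrs[count.index]
  map_public_ip_on_launch = false
}

# The internet gateway and the default route that make the
# public subnets actually public.
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id
}

resource "aws_route_table_association" "public" {
  count          = length(var.public_subnet_cidrs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
```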
You will have to decide on CIDR
ranges for your VPC based on the networking requirements of your
organisation. Once you have decided on them, set them in variables.tf: a variable called cidr for the VPC, and subranges for the public and private subnets as public_subnet_cidrs and private_subnet_cidrs.
Other variables mentioned follow similar patterns, and you should
set them according to your needs.
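For instance, the three variables could be declared as follows. The
default ranges are placeholders chosen for illustration only; pick
ranges that fit your own network plan.

```hcl
# vpc/variables.tf (sketch). All default values are placeholders.

variable "cidr" {
  description = "CIDR range of the whole VPC"
  type        = string
  default     = "10.23.0.0/16"
}

variable "public_subnet_cidrs" {
  description = "CIDR ranges of the public subnets, inside var.cidr"
  type        = list(string)
  default     = ["10.23.11.0/24"]
}

variable "private_subnet_cidrs" {
  description = "CIDR ranges of the private subnets, inside var.cidr"
  type        = list(string)
  default     = ["10.23.21.0/24"]
}
```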
Creating a NAT instance
Now we have a VPC with two
subnets, and an internet gateway that the public subnet routes
through. To configure a NAT instance, we will create an EC2
virtual machine on the public subnet that forwards all traffic
directed to it to the outside and back. The private subnet will then use
this instance in its route table, directing outbound traffic from the
resources inside to it.
Configuring a NAT instance entails dealing with iptables and pre-configuring some things correctly. Luckily, we have already published a Terraform module on fpco-terraform-aws that gives you a NAT instance completely ready for use on GovCloud. Let’s add it to vpc/nat.tf:
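To make clear what such a module has to arrange, here is a hand-rolled
sketch using only the plain aws_instance resource. It is not the
fpco-terraform-aws module and takes no is_govcloud input. Everything
named here is an assumption: var.nat_ami (a Linux AMI available in
your region, with iptables installed), the eth0 interface name, the
instance type, and the resource names for the key pair, subnet and
security groups, which are expected to be declared elsewhere in your
configuration.

```hcl
# vpc/nat.tf (hand-rolled sketch, not the fpco-terraform-aws module)

resource "aws_instance" "nat" {
  ami                         = var.nat_ami # hypothetical variable
  instance_type               = "t2.micro"  # example size only
  subnet_id                   = aws_subnet.public[0].id
  associate_public_ip_address = true
  key_name                    = aws_key_pair.main.key_name

  vpc_security_group_ids = [
    aws_security_group.nat_web.id,     # web traffic from private subnets
    aws_security_group.open_egress.id, # outbound to the internet
    aws_security_group.public_ssh.id,  # SSH for debugging
  ]

  # A NAT instance forwards packets that are not addressed to itself,
  # so the EC2 source/destination check must be disabled.
  source_dest_check = false

  # Enable IPv4 forwarding (until the next reboot) and masquerade
  # traffic that originates inside the VPC.
  # Assumes the primary interface is called eth0.
  user_data = <<-EOF
    #!/bin/bash
    sysctl -w net.ipv4.ip_forward=1
    iptables -t nat -A POSTROUTING -o eth0 -s ${var.cidr} -j MASQUERADE
  EOF
}
```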
You will notice that we had to pass is_govcloud to ensure it creates the instance
correctly for that region. But this also means you can use this
module in other regions by omitting this parameter or setting it
to false. You will also notice that there are a few
security groups we have not defined yet. The first one
allows only web access for hosts on the private subnet. Add it to
the same file:
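A security group of this kind could be sketched as below. The
resource name nat_web and var.name are assumptions, and aws_vpc.main
is expected to be the VPC declared in your network configuration.

```hcl
# Allow only HTTP and HTTPS from the private subnets into the
# NAT instance. Names are illustrative.
resource "aws_security_group" "nat_web" {
  name        = "${var.name}-nat-web"
  description = "HTTP and HTTPS from the private subnets to the NAT instance"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP from private subnets"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = var.private_subnet_cidrs
  }

  ingress {
    description = "HTTPS from private subnets"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = var.private_subnet_cidrs
  }
}
```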
The other security groups can be
reused for other future instances. One will give open egress and
the other SSH access to the NAT instance for debugging purposes.
Both are already packaged by fpco-terraform-aws,
easing inclusion of such capabilities on your infrastructure. On a
project with many different security groups, you could separate
them into a vpc/sgs.tf file, but for this tutorial you can add them to vpc/network.tf instead:
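If you prefer to see what these two groups amount to, the following
sketch spells them out with plain aws_security_group resources rather
than the packaged modules. The resource names, var.name and
var.ssh_allowed_cidrs (the address ranges you allow to SSH in) are
hypothetical.

```hcl
# Open egress: allow all outbound traffic, no inbound rules.
resource "aws_security_group" "open_egress" {
  name        = "${var.name}-open-egress"
  description = "Allow all outbound traffic"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Public SSH: allow inbound port 22 from a list of trusted ranges.
resource "aws_security_group" "public_ssh" {
  name        = "${var.name}-public-ssh"
  description = "Allow SSH from trusted address ranges"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.ssh_allowed_cidrs # hypothetical variable
  }
}
```

Restricting var.ssh_allowed_cidrs to your office or VPN ranges,
instead of 0.0.0.0/0, keeps the debugging access narrow.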
You can add a private-ssh-sg if you also want
access to the private instances, using the NAT instance
as a bastion host. In fact, this is an exercise we present
below. Let’s keep it simple for now. With all security groups
ready, we can proceed to wiring the private subnet to the NAT
instance.
Setting up routes
To connect the private subnet to
the outside world, we need to explicitly create a route table, a
route for NAT on it, and finally a routing table association for
the private subnet. This is achieved with a few lines of code
in vpc/network.tf:
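The three pieces could look like the sketch below, again with plain
AWS provider resources. It assumes a VPC named aws_vpc.main, private
subnets named aws_subnet.private and a NAT instance named
aws_instance.nat; if you use modules instead, substitute their
outputs. The route targets the instance's primary network interface,
which is one of the ways the AWS provider lets you point a route at
an instance.

```hcl
# A dedicated route table for the private subnet.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
}

# Default route: send internet-bound traffic to the NAT instance.
resource "aws_route" "private_nat" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  network_interface_id   = aws_instance.nat.primary_network_interface_id
}

# Attach the route table to the first (and here only) private subnet.
resource "aws_route_table_association" "private" {
  subnet_id      = aws_subnet.private[0].id
  route_table_id = aws_route_table.private.id
}
```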
Notice how in the last one we indexed the first subnet_id for the route table association. If
you have multiple CIDR ranges, then you would likely want to do
this differently, by using count.
The usage above keeps the example simple, and changing it is a good
exercise for the reader on how to ensure changes in variables
propagate correctly to the creation of multiple
resources.
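One possible shape of the count-based version, used in place of an
association that indexes only the first subnet, is sketched here. It
assumes the private subnets are themselves created with count as
aws_subnet.private, and that the route table is named
aws_route_table.private.

```hcl
# One association per private CIDR range, so adding an entry to
# var.private_subnet_cidrs also wires the new subnet to the NAT route.
resource "aws_route_table_association" "private" {
  count          = length(var.private_subnet_cidrs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```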
Testing
Now that we have everything in
place, it should be possible to access the internet from within
resources in the private subnet. As a final exercise, you should
create a security group for private SSH access, an EC2 instance on
the private subnet with this security group, and then SSH into it
with your key defined on vpc/main.tf,
finally testing internet access. In the example test below, we
assume you have an Ubuntu Server image running on the private
subnet, with an entry under ~/.ssh/config for its host, named private-ec2-nat-to-internet, that uses the NAT instance as a bastion host:
Getting access to the host
through the bastion host means you have correctly set up security
groups and your SSH config. Updating the instance afterwards is
good practice. Finally, installing Haskell Stack tests access to an
external resource.
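The Terraform side of this exercise could be sketched as follows. The
names are hypothetical throughout: var.ubuntu_ami stands for an
Ubuntu Server AMI available in your region, and aws_vpc.main,
aws_subnet.private and aws_key_pair.main are assumed to exist in your
configuration. SSH is allowed only from the public subnets, which is
where the NAT instance acting as bastion lives.

```hcl
# SSH in from the public subnets only, and unrestricted egress so
# that the instance can reach the internet through the NAT route.
resource "aws_security_group" "private" {
  name        = "${var.name}-private"
  description = "SSH from public subnets, open egress"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.public_subnet_cidrs
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# A throwaway instance on the private subnet to test egress from.
resource "aws_instance" "private_test" {
  ami                    = var.ubuntu_ami # hypothetical variable
  instance_type          = "t2.micro"     # example size only
  subnet_id              = aws_subnet.private[0].id
  key_name               = aws_key_pair.main.key_name
  vpc_security_group_ids = [aws_security_group.private.id]
}
```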
Where to go next
Now that you have private
resources able to access the internet, but still protected against
external access themselves, there are multiple opportunities for
improvement to tackle. You could add the capability to have
configurable DNS on GovCloud, a topic we have discussed in a separate post. You can
also further automate the creation of other resources by tapping
into our fpco-terraform-aws modules, such as easier selection of Ubuntu AMI images on GovCloud with ami-ubuntu or simple initialisation of inline templates with init-snippet.
And lastly, you could separate
security groups into their own files and ensure your SSH key
generation and host configurations are secure by default. We will
touch on this in a future post in this GovCloud series, so you
should subscribe for updates by going to the top of this post and
entering your email. This is a low-traffic mailing list, and you
can expect to receive only useful updates to improve your DevOps
and Haskell skills.
Related articles
Intro to DevOps on GovCloud
Amazon GovCloud has no Route53! How to solve this?
Containerizing a legacy application: an overview