Basic infrastructure configuration using Terraform, Docker (Ghost, Traefik) and Cloudflare

TL;DR: A walkthrough of an EC2 instance setup, dynamically attaching volume snapshots and configuring Cloudflare DNS entries pointing to the new instance, using Terraform and Docker (Ghost, Traefik).

Source code: https://github.com/allandequeiroz/blog/tree/extract_variables


For a while, I've hosted this blog at home, using Docker and Kubernetes across a few Raspberry Pis (I use it as an excuse to play around). It turns out that I'm moving to a new place, so I have a new excuse to play with something else.

The idea this time is simply to provision the infrastructure on EC2 using Terraform, set up the DNS entries and keep the blog alive. Later on, I'll see what I can do with Kubernetes, and perhaps include some load balancing and autoscaling as well (even though it's completely unnecessary given the low traffic), but for now let's take a look at the current configuration:

  • Docker (docker-compose)
    • Ghost
    • Traefik
  • Traefik (toml)
  • Terraform
    • EC2
    • Cloudflare

Docker

Ghost and Traefik

Unlike my previous setup, this time I've decided to move away from my customized Ghost version and use the vanilla image provided by Ghost's maintainers, available on Docker Hub. The only thing I kept was the config.production.json file, which sets details such as port, database and content location; if you want to check this file, it's available in the repository.

Also, to play with something new, I've replaced NGINX with Traefik. It was quite interesting to learn that Traefik deals with all the underlying details by itself and is quite dynamic in terms of self-reconfiguration: we give it a .toml file and the tricks start to happen. We'll see a bit more later on; for now, let's have a look at the docker-compose file.
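
The full file is in the repository; the sketch below shows its general shape. Image tags, the example.com domain and the host paths are illustrative, not taken from the post:

```yaml
version: "3"

services:
  traefik:
    image: traefik:1.7
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # The Unix socket Traefik watches for Docker events
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.toml:/traefik.toml
      - ./acme.json:/acme.json
    labels:
      # Route the dashboard subdomain to Traefik itself
      - "traefik.enable=true"
      - "traefik.frontend.rule=Host:traefik.example.com"
      - "traefik.port=8080"
    networks:
      - web

  ghost:
    image: ghost:alpine
    volumes:
      # Existing configuration and persisted content live on the host FS,
      # so volume snapshots carry the data over to new EC2 instances
      - ./config.production.json:/var/lib/ghost/config.production.json
      - ./content:/var/lib/ghost/content
    labels:
      - "traefik.enable=true"
      - "traefik.frontend.rule=Host:example.com"
      - "traefik.port=2368"
    networks:
      - web

networks:
  web:
```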

As you can see, there are not many details; we're basically specifying two services and a network, and setting up some labels to help Traefik do its job. The volume mappings are mostly optional, but here we're placing some existing configuration and persisting data outside of the containers, on the host file system. There are no strict rules here either; in this case, for example, since I've told Ghost that the content path is /var/lib/ghost/content, a volume was mounted at this same location so the whole data set is persisted on the host FS. The volumes' snapshots will have the same content, so when new EC2 instances are launched, data from previous instances will be present.

The labels, as mentioned before, are there to help Traefik, but for now just notice the mapping to docker.sock. If you're not familiar with it, this is the Unix socket the Docker daemon listens on, so containers can communicate with it from within; in other words, containers are able to consume the Docker API through the socket. Traefik, in this case, observes Docker events through the API and, depending on the events, decides what to do about the current configuration: whether something needs to be destroyed, created, recreated, changed and so on.

Traefik

Traefik is a very clever reverse proxy; it deals with many complexities by itself for a low price in terms of configuration, as you can see below.
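
The actual file is in the repository; this sketch shows the general shape for Traefik 1.x, where the email, entry point names and credentials placeholder are illustrative:

```toml
defaultEntryPoints = ["http", "https"]

[entryPoints]
  [entryPoints.http]
    address = ":80"
    # Redirect plain HTTP to HTTPS
    [entryPoints.http.redirect]
      entryPoint = "https"
  [entryPoints.https]
    address = ":443"
    [entryPoints.https.tls]
  [entryPoints.dashboard]
    address = ":8080"
    # htpasswd-generated hash; when set through docker-compose instead,
    # every $ must be escaped as $$
    [entryPoints.dashboard.auth.basic]
      users = ["<USER>:<HTPASSWD_HASH>"]

# Let's Encrypt handles the certificate boilerplate for us
[acme]
  email = "me@example.com"
  storage = "acme.json"
  entryPoint = "https"
  [acme.httpChallenge]
    entryPoint = "http"

# The dashboard, exposed on its own entry point
[api]
  entryPoint = "dashboard"

# Watch Docker events through the socket and reconfigure on the fly
[docker]
  endpoint = "unix:///var/run/docker.sock"
  watch = true
  exposedByDefault = false
```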

From this short configuration we get HTTP/HTTPS support, port configuration, the whole boilerplate to generate our certificates with Let's Encrypt, and a dashboard.

Concerning the dashboard, it's worth mentioning that the [api] section exposes Traefik's configuration, so remember to secure it with some authentication/authorization mechanism. In this example basic auth was used: when hitting the dashboard URL, a pop-up asking for credentials is shown. The password hash is generated by htpasswd, for example echo $(htpasswd -nb <USER> password) | sed -e s/\\$/\\$\\$/g (the sed invocation doubles the $ characters so they survive docker-compose's variable interpolation).

Terraform

Now let's have a look at the Terraform infrastructure definition for both EC2 and Cloudflare; I think a few things should be mentioned first.

  • First, there are many ways to define and load variables with Terraform, but the ones in this example are loaded from the environment: when Terraform is executed it starts a lookup, and environment variables prefixed with TF_VAR_ are picked up. As an example, the .tf file declares the variable aws_access_key_id and the defined environment variable is TF_VAR_aws_access_key_id; for Terraform, it's a match (see the sketch after this list).
  • Second, we can break the configuration down into as many files as we want; Terraform merges every .tf file in the directory (loading them in alphabetical order, here aws.tf then cloudflare.tf) and resolves the dependencies between resources, so the Cloudflare records that reference the EC2 instance's IP address are only created once the instance exists.
  • Third, remote-exec is not the most optimal approach; it would be possible, for example, to have AMIs ready to go, with everything necessary installed and configured beforehand.
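
A minimal sketch of the variable/environment pairing; the variable name comes from the post, the value is a placeholder:

```hcl
# In the .tf file: declare the variable without a value...
variable "aws_access_key_id" {}

# ...and supply it from the environment before running terraform:
#
#   export TF_VAR_aws_access_key_id="<YOUR_ACCESS_KEY_ID>"
#   terraform apply
```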

EC2

As said before, you can have a look at the full files in the repository, but here I'll break them down into sections and give a short explanation of each one.

  1. Variable definitions, to make our configuration flexible/reusable and to keep sensitive information hidden.
  2. Virtual Private Cloud (VPC): since a VPC is not like a regular datacenter with physical networks, switches, routers and so on but is instead software-defined, we need (optionally, in fact) to define a private (isolated) section of the cloud in which to launch our instances.
  3. VPC Subnet: once our private cloud is defined, we need to specify a subnet to provide the behaviour we want; in this case, the subnet is associated with an Internet Gateway, turning this particular subnet into a public one so we can reach this network space from the outside world.
  4. Internet Gateway: as mentioned above, the intention here is to make a particular subnet accessible from the internet; the gateway works together with the subnet to make that happen.
  5. Route Table definition inside the VPC, to route traffic to the Internet Gateway.
  6. Route Table Association, to "link" a Subnet with a Route Table.
  7. Security Group to be used inside the VPC (this example shouldn't be used in production; it's exceptionally permissive).
  8. Amazon Machine Image (AMI), to specify details such as the particular image used to start new instances, or to attach a previously taken volume snapshot, for example.
  9. Key Pair, to provide access to the instance over SSH.
  10. EC2 Instance: a description of what a new instance should look like and where and how it should be placed; for example, into which subnet, with which Security Group, Key Pair and volume snapshot, and also which commands to execute over SSH. Many things are possible here; the previous description covers the context of this example.

With this in mind, let's have a look at each of these sections.

Variables

As mentioned before, Terraform allows us to define variables so we can make our configuration more flexible and keep sensitive information out of sight. One interesting detail is that we can define default values; for example, we could have something like the sketch below.
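
A variable with a default can be omitted from the environment entirely (the name and region here are illustrative, not from the post):

```hcl
variable "aws_region" {
  # Used when no TF_VAR_aws_region is set in the environment
  default = "eu-west-1"
}
```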

VPC

The intention of defining a VPC is to have complete control, but it's not actually a necessity; if we skip this definition, a default one is provided by AWS, though of course AWS won't guess what we intend.
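
A minimal sketch of such a definition, where the CIDR block and name are illustrative:

```hcl
resource "aws_vpc" "blog" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name = "blog"
  }
}
```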

VPC Subnet

The subnet definition is also important. In this case there's nothing fancy going on, but we could use it to create different subnets, some to be exposed and some to be completely private.
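
Continuing the sketch, a public subnet inside that VPC (addresses are illustrative):

```hcl
resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.blog.id
  cidr_block = "10.0.1.0/24"

  # Give instances launched here a public IP
  map_public_ip_on_launch = true
}
```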

Internet Gateway

To achieve the goal of routing internet traffic into the VPC, we need to specify an Internet Gateway.
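
The gateway itself is little more than an attachment to the VPC:

```hcl
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.blog.id
}
```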

Route Table

The route table is what we use, in association with the gateway, to expose a particular subnet to the internet.
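
A sketch routing all outbound traffic through the gateway defined above:

```hcl
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.blog.id

  route {
    # Send every non-local destination to the Internet Gateway
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}
```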

Route Table Association

This small section is the glue between a subnet and a route table; we use it to put the two together.
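
Following the same sketch:

```hcl
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```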

Security Group

By default, AWS allows all outbound traffic, but Terraform does not (it removes the default egress rule), so we need to state the definition explicitly, in this case both in and out.
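
As the post warns, this sketch is wide open in both directions and shouldn't be used in production:

```hcl
resource "aws_security_group" "blog" {
  vpc_id = aws_vpc.blog.id

  # Allow everything in...
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # ...and everything out
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```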

AMI

This section is important to define what the instances should look like; in this case, as you can see, we have three blocks.

  • data: aws_ebs_snapshot, to look up a specific volume snapshot to be mounted when creating new instances.
  • data: aws_ami, to look up a specific AMI to be used as the base for the new instance.
  • resource: aws_ami, to state how the instances should be created; you can see that we've made use of both lookups defined above, one to set the block device and another to set which particular AMI to use when instantiating.

The snapshots are taken automatically; their periodicity and target were defined beforehand through the Elastic Block Store Lifecycle Manager.
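
A sketch of the two data lookups described above; the filter names, tag values and image pattern are illustrative, the real ones live in the repository:

```hcl
data "aws_ebs_snapshot" "content" {
  most_recent = true

  filter {
    name   = "tag:Name"
    values = ["blog-content"]
  }
}

data "aws_ami" "base" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's account, for Ubuntu images

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/*-amd64-server-*"]
  }
}
```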

Key Pair

Here is the definition of which Key Pair should be associated with the new instances; this way, anyone in possession of the .pem file can access the instance over SSH.
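
A sketch, with an illustrative key name and variable:

```hcl
resource "aws_key_pair" "blog" {
  key_name   = "blog"
  public_key = var.ssh_public_key # the matching private key is the .pem file
}
```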

EC2 Instance

OK, here is where we put everything together, which is why this section is slightly bigger. But if you read it line by line, you'll notice that we're now only setting values using what was defined before, so there are really only three sections.

  • Value definitions: the first seven lines, where the values come from what was configured before.
  • Connection configuration: lines nine to fourteen describe how the connectivity should work.
  • Commands: the remaining lines (except for the tag definition) are the commands to be executed while spinning up a new instance.

The interesting thing here is the remote-exec part: Docker is being installed and its service needs to be started, and permissions (such as the docker group membership) won't apply to the session that set them, so the connection has to be closed and a new one opened; only from a new session will the remaining commands work appropriately.
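
A condensed sketch of the instance resource; the names, SSH user, packages and paths are illustrative. The two provisioner blocks show why the fresh connection matters, since each one opens its own SSH session:

```hcl
resource "aws_instance" "blog" {
  ami                    = aws_ami.blog.id # the custom AMI described above
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.blog.id]
  key_name               = aws_key_pair.blog.key_name

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file(var.private_key_path)
    host        = self.public_ip
  }

  # First session: install Docker; the group change below only
  # takes effect in sessions opened afterwards
  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y docker.io docker-compose",
      "sudo usermod -aG docker ubuntu",
    ]
  }

  # Second session: group membership now applies, and waiting for
  # cloud-init guarantees AWS is done before Cloudflare runs
  provisioner "remote-exec" {
    inline = [
      "cd /home/ubuntu/blog && docker-compose up -d",
      "cloud-init status --wait",
    ]
  }

  tags = {
    Name = "blog"
  }
}
```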

Another detail is the usage of tags: once the infrastructure is created, these tags are present on each of its parts. They're handy for organization and filtering purposes; by the end, tagging here felt as important and useful as labelling things in Kubernetes, for example.

Cloudflare

  1. A record defining the target domain, associated with the new instance's IP address (we could be pointing at an LB, for example).
  2. Canonical Name record (CNAME) to provide the www. subdomain.
  3. Canonical Name record (CNAME) to provide access to the Traefik dashboard. Do you remember the labels in the docker-compose.yml? We've defined two traefik.frontend.rule entries, one for Ghost and another for Traefik, so, given the requested URL, Traefik will route the request to the correct container.

The Cloudflare configuration is not extensive, so let's keep it as a single piece; there's nothing mythical here, but one line worth mentioning is line 12. There, the IP associated with the new instance is taken and set on the new DNS entry. At first I got some errors in this part because the IP wasn't yet defined; the line cloud-init status --wait at the end of the aws_instance creation tackles this problem, making Terraform wait until AWS finishes its job before carrying on with Cloudflare.
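
A sketch of those three records, using the older Cloudflare provider syntax; the variable and resource names are illustrative, and the A record's value plays the role of the line 12 mentioned above:

```hcl
resource "cloudflare_record" "root" {
  domain = var.cloudflare_domain
  name   = var.cloudflare_domain
  type   = "A"
  # The instance's IP, only known after AWS finishes creating it
  value  = aws_instance.blog.public_ip
}

resource "cloudflare_record" "www" {
  domain = var.cloudflare_domain
  name   = "www"
  type   = "CNAME"
  value  = var.cloudflare_domain
}

resource "cloudflare_record" "traefik" {
  domain = var.cloudflare_domain
  name   = "traefik"
  type   = "CNAME"
  value  = var.cloudflare_domain
}
```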

Conclusion

After playing with this, apart from the usual excitement, it feels like we need something more powerful to work with configurations; I can imagine the number of files and the probably untrackable repetition in more complex setups. Fingers crossed that we get something better soon; maybe Dhall will be a better and yet safe option, who knows.

Still, even in the current configuration scenario, it's quite fun to go through all this and see a whole infrastructure being created/destroyed in a matter of seconds, thanks to the automation capabilities, tools and options available to us these days.