Environment-specific Configuration with Terraform

One goal that can help when running services in the cloud is to minimise configuration differences between environments. This isn't always possible; sometimes you can't escape configuration. Reading through the Terraform documentation, googling, and reading Stack Overflow, you occasionally see references to provisioners. There are three:

  • file
  • local-exec
  • remote-exec

These sound great, but right there in the documentation is a big warning:

Provisioners are a Last Resort

Terraform includes the concept of provisioners as a measure of pragmatism, knowing that there will always be certain behaviors that can’t be directly represented in Terraform’s declarative model.

However, they also add a considerable amount of complexity and uncertainty to Terraform usage.

Fine, so what do we do? The details depend on who your cloud provider is, but the documentation spells it out: pass data into instances at creation time and let cloud-init manage it. Check your provider's documentation for the property you need to set on the instance resource. The data you pass is a cloud-config file.
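As a rough sketch of what that looks like (the provider, image id, and file name here are hypothetical; the property name varies by provider, though both AWS and the OpenStack resource used later in this post call it user_data):

```
# Hypothetical sketch, assuming the AWS provider.
resource "aws_instance" "example" {
  ami           = "ami-12345678" # hypothetical image id
  instance_type = "t3.micro"

  # The cloud-config file is handed to cloud-init on first boot
  user_data = file("${path.module}/cloud-config.yml")
}
```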

Cloud Config

There are loads of things you can do with cloud-config; check out the documentation. Two of the directives I have used to provision static files are write_files and runcmd.

write_files – you can write out flat configuration files to any part of the target file system.

runcmd – you can execute arbitrary scripts on first boot

The templatefile function

Terraform has a function called templatefile, which does what it says on the tin: you pass it a template file and a map of variables to substitute. It's pretty simple; in the template file, variables are enclosed in ${}. Again, see the documentation for advanced use.
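A minimal sketch, using a hypothetical template file greeting.tpl:

```
# Suppose greeting.tpl contains the single line:
#   Hello, ${name}! Welcome to ${environment}.
locals {
  greeting = templatefile("${path.module}/greeting.tpl", {
    name        = "operator"
    environment = "staging"
  })
  # local.greeting renders to "Hello, operator! Welcome to staging."
}
```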

A note on structure

You could at this point write out fairly large cloud-config templates with embedded write_files sections, but it gets a bit messy. What I've taken to doing instead is keeping a copy of each of the files I'm going to deploy with the instance in a subfolder, and nesting calls to templatefile from Terraform. My directory structure looks like this:

|- files
   |- consul.json
   |- telegraf.conf
|- cloud-config.tpl
|- service.tf
|- provider.tf

The cloud-config.tpl file has the following contents:

#cloud-config
write_files:
  - path: /usr/local/etc/consul/consul.json
    encoding: base64
    content: |
      ${ base64encode(consul_json) }
  - path: /etc/telegraf/telegraf.d/telegraf.conf
    encoding: base64
    content: |
      ${ base64encode(telegraf_conf) }
  - path: /var/lib/cloud/instance/scripts/initial-setup.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      export address=$(ip -4 addr show ens3 | grep "inet\b" | awk '{print $2}'  | cut -d/ -f1)
      sudo sed -i "s/\$${ip_address}/$address/g" /usr/local/etc/consul/consul.json
      sudo sed -i "s/\$${ip_address}/$address/g" /etc/telegraf/telegraf.d/telegraf.conf
runcmd:
  - [bash, /var/lib/cloud/instance/scripts/initial-setup.sh]

This uses a combination of write_files, with the contents passed in as template variables, and an initial setup script, which determines the IP address of the instance and substitutes the remaining configuration keys using sed. Note the \$${ip_address} escaping: $${...} stops templatefile from treating ip_address as a template variable, and the backslash stops the shell from expanding it at boot, so sed receives the literal ${ip_address} placeholder left in the rendered files. Of particular note, runcmd is used to execute the script so that the shell can be specified as bash; runcmd uses the sh shell by default on Ubuntu, which caused me no end of issues with script files.

Why base64encode?

In order for the YAML to be valid, the content blocks would need to be indented so that every line starts at the same column (six spaces, level with the $ of ${ base64encode(consul_json) } and ${ base64encode(telegraf_conf) }). Base64-encoding puts the whole content block on one line, and cloud-init decodes it when it writes the file. If you don't do this, you may see errors like the below when dealing with INI files; "Failed loading yaml blob. Invalid format" is the important part:

2022-02-14 19:36:02,702 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 10 column 1: "while scanning a simple key
  in "<unicode string>", line 10, column 1:
    app_mode = production
    ^
could not find expected ':'
  in "<unicode string>", line 11, column 1:
    instance_name = service- ... 
    ^"
2022-02-14 19:36:02,744 - util.py[WARNING]: Failed at merging in cloud config part from part-001
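In Terraform terms, the trick amounts to this (values are illustrative):

```
# Illustrative only: a multi-line INI fragment would break the YAML block
# if its indentation slipped, but its base64 encoding is a single line
# that embeds safely; cloud-init decodes it when writing the file.
locals {
  ini_fragment = "app_mode = production\ninstance_name = service-1"
  yaml_safe    = base64encode(local.ini_fragment)
}
```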

The nested files

Ok, so that's the cloud-config file that glues this all together. The configuration files themselves are now just standard templatefile syntax. The one wrinkle is ip_address, which is written as $${ip_address} wherever it needs to survive Terraform's rendering as a literal placeholder for the boot-time sed to replace:

consul.json

{
    "server": false,
    "datacenter": "dc1",
    "node_name": "${host_name}",
    "data_dir": "/var/consul/data",
    "bind_addr": "$${ip_address}",
    "client_addr": "127.0.0.1",
    "retry_join": ["${consul_address_1}", "${consul_address_2}", "${consul_address_3}"],
    "log_level": "DEBUG",
    "enable_syslog": true,
    "acl_enforce_version_8": false,
    "enable_local_script_checks": true,
    "service": {
        "name": "service",
        "tags": ${jsonencode(consul_tags)},
        "address": "$${ip_address}",
        "port": 8120,
        "check": {
            "args": [
                "curl",
                "http://$${ip_address}:8120"
            ],
            "interval": "10s"
        }
    }
}
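The jsonencode call on the tags line turns a Terraform list into a JSON array; a sketch with illustrative values:

```
# Illustrative: with these tags the rendered line becomes
#   "tags": ["blue","v2"],
locals {
  consul_tags = ["blue", "v2"]
  tags_json   = jsonencode(local.consul_tags)
}
```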

telegraf.conf

[[outputs.influxdb_v2]]
## The full HTTP or UDP URL for your InfluxDB instance.
##
## Multiple URLs can be specified for a single cluster, only ONE of the
## urls will be written to each interval.
# urls = ["unix:///var/run/influxdb.sock"]
# urls = ["udp://127.0.0.1:8089"]
urls = ["http://influxdb.service.consul:8086"]
token = "${influxdb_token}"
organization = "telegraf"
bucket = "telegraf"
[global_tags]
role = "service"
ip_address = "$${ip_address}"
host_name = "${host_name}"
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.netstat]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]

Stitching it all together

So we have all the pieces; how does it look in the service.tf file? It's just nested templatefile calls, with the parameters passed in as maps.

resource "openstack_compute_instance_v2" "service" {
  name        = var.INSTANCE_NAME
  provider    = openstack.openstack  # Provider name
  image_name  = "Service ${var.VERSION_NUMBER}" # Image name
  flavor_name = "s1-2" # Instance type name
  key_pair    = "development"  
  user_data   = templatefile("cloud-config.tpl", {
    consul_json   = templatefile("files/consul.json", {
      host_name        = var.INSTANCE_NAME
      consul_address_1 = "consul-1-${var.ENVIRONMENT}.my-company.co.uk"
      consul_address_2 = "consul-2-${var.ENVIRONMENT}.my-company.co.uk"
      consul_address_3 = "consul-3-${var.ENVIRONMENT}.my-company.co.uk"
      consul_tags      = var.CONSUL_TAGS
    })
    telegraf_conf = templatefile("files/telegraf.conf", {
      host_name      = var.INSTANCE_NAME
      influxdb_token = data.vault_generic_secret.influxdb_token.data["token"]
    })
  })

  network {
    name = "Ext-Net" # Adds the network component to reach your instance
  }
}
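For completeness, the variables referenced above would be declared along these lines (a sketch; the types are assumptions based on how they are used):

```
variable "INSTANCE_NAME" {
  type = string
}

variable "VERSION_NUMBER" {
  type = string
}

variable "ENVIRONMENT" {
  type = string
}

variable "CONSUL_TAGS" {
  type    = list(string)
  default = []
}
```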

And that's all of it. There's even an InfluxDB 2 token, pulled from the Vault provider, set in the config above. I've made a pull request to integrate an InfluxDB 2 database secret provider with Vault, so watch this space: a blog post on using ephemeral accounts to write metrics to InfluxDB will be coming soon.

Please let me know in the comments below if anything is missing or needs clarifying. I'm really starting to love Terraform; all of the above was done because I wanted a clean, easy-to-follow way of deploying configuration as part of provisioning.

