One goal that can help with running services in the cloud is to minimise configuration differences between environments. This isn’t always possible; sometimes you can’t escape configuration. Reading through the Terraform documentation, googling, and reading Stack Overflow, you occasionally see references to provisioners. These are:
- file
- local-exec
- remote-exec
That sounds great, but right there in the documentation is a big warning:
Provisioners are a Last Resort
Terraform includes the concept of provisioners as a measure of pragmatism, knowing that there will always be certain behaviors that can’t be directly represented in Terraform’s declarative model.
However, they also add a considerable amount of complexity and uncertainty to Terraform usage.
Fine, so what do we do? It depends on who your cloud provider is, but the documentation gives you the details: you need to pass data into instances at creation time and use cloud-init to manage it. The property you need to set varies by cloud provider; check the documentation for yours (for the OpenStack provider used later in this post, it is user_data). The data you pass is a cloud config file.
Cloud Config
There are loads of things you can do with cloud config; check out the documentation. Two of the items I have used to provision static files are write_files and runcmd.
write_files – you can write out flat configuration files to any part of the target file system.
runcmd – you can execute arbitrary scripts on first boot.
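A minimal cloud config using both modules might look like this (the path, file content, and command here are illustrative, not taken from this post):

```yaml
#cloud-config
write_files:
  - path: /etc/myapp/app.conf     # hypothetical target file
    permissions: '0644'
    content: |
      log_level = info
runcmd:
  - [systemctl, restart, myapp]   # hypothetical first-boot command
```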
The templatefile function
Terraform has a function called templatefile, which does what it says on the tin: you pass it a template file and a map of variables to substitute. It’s pretty simple; in the template file, variables are enclosed in ${}. Again, see the documentation for advanced use.
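As a quick illustration (the template file name, variable, and output name here are made up for the example): given a template greeting.tpl containing the line Hello, ${name}!, you can render it like this:

```hcl
# Render greeting.tpl, substituting ${name}; the second argument to
# templatefile is a map of template variables.
output "greeting" {
  value = templatefile("${path.module}/greeting.tpl", {
    name = "world"
  })
}
```

Rendering this template with that map produces the string "Hello, world!".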
A note on structure
You can at this point write out fairly large cloud config templates with embedded write_files, but it gets a bit messy. What I’ve taken to doing is keeping a copy of each of the files I’m going to deploy with the instance in a subfolder, and nesting calls to templatefile from Terraform. My directory structure looks like this:
|- files
|  |- consul.json
|  |- telegraf.conf
|- cloud-config.tpl
|- service.tf
|- provider.tf
The cloud-config.tpl file has the following contents:
#cloud-config
write_files:
  - path: /usr/local/etc/consul/consul.json
    encoding: base64
    content: |
      ${ base64encode(consul_json) }
  - path: /etc/telegraf/telegraf.d/telegraf.conf
    encoding: base64
    content: |
      ${ base64encode(telegraf_conf) }
  - path: /var/lib/cloud/instance/scripts/initial-setup.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      export address=$(ip -4 addr show ens3 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
      sudo sed -i "s/${ip_address}/$address/g" /usr/local/etc/consul/consul.json
      sudo sed -i "s/${ip_address}/$address/g" /etc/telegraf/telegraf.d/telegraf.conf
runcmd:
  - [bash, /var/lib/cloud/instance/scripts/initial-setup.sh]
This uses a combination of write_files, with the contents passed in as template variables, and an initial setup script, which determines the IP address of the instance and substitutes the remaining configuration keys using sed. Of particular note is that runcmd is used to execute the script so that the shell can be specified as bash; runcmd will use the sh shell by default on Ubuntu, which caused me no end of issues with script files.
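To see why the shell matters, here is a small standalone demonstration (not from the original setup script): passing a script file as an argument to an interpreter ignores its shebang, so a bash-only construct works under bash but can fail under sh.

```shell
# Write a script that uses [[ ]], a bash-only test construct
cat > /tmp/demo-setup.sh <<'EOF'
#!/bin/bash
[[ "service-1" == service-* ]] && echo "bash construct ok"
EOF

# Invoked with bash, as in the runcmd entry above, the construct works
bash /tmp/demo-setup.sh

# Invoked as `sh /tmp/demo-setup.sh`, the shebang is ignored; on Ubuntu,
# where sh is dash, the [[ ]] test is a syntax error
sh /tmp/demo-setup.sh || echo "sh refused the bash construct"
```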
Why base64encode?
In order for the YAML to be valid, every line of a content block needs to start at the same column: six spaces, the column of the $ in ${ base64encode(consul_json) } and ${ base64encode(telegraf_conf) }. Base64 encoding puts the whole content block on one line, and cloud-init decodes it when it writes the file. If you don’t do this you may see errors like the ones below when dealing with ini files; "Failed loading yaml blob. Invalid format" is the important part.
2022-02-14 19:36:02,702 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 10 column 1: "while scanning a simple key
in "<unicode string>", line 10, column 1:
app_mode = production
^
could not find expected ':'
in "<unicode string>", line 11, column 1:
instance_name = service- ...
^"
2022-02-14 19:36:02,743 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 10 column 1: "while scanning a simple key
in "<unicode string>", line 10, column 1:
app_mode = production
^
could not find expected ':'
in "<unicode string>", line 11, column 1:
instance_name = service- ...
^"
2022-02-14 19:36:02,744 - util.py[WARNING]: Failed at merging in cloud config part from part-001
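To make the round trip concrete, here is a standalone shell sketch (the INI fragment is illustrative) of what base64 encoding does to a multi-line block:

```shell
# A multi-line INI fragment of the kind that breaks the YAML block scalar
conf='app_mode = production
instance_name = service-1'

# base64 collapses the whole block onto one line, so the YAML stays valid
# regardless of the content's own indentation
encoded=$(printf '%s' "$conf" | base64 | tr -d '\n')
echo "$encoded"

# cloud-init reverses this (encoding: base64) when writing the file
printf '%s' "$encoded" | base64 --decode
```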
The nested files
OK, so that’s the cloud config file that glues this all together. The configuration files themselves are now just standard templatefile syntax:
consul.json
{
  "server": false,
  "datacenter": "dc1",
  "node_name": "${host_name}",
  "data_dir": "/var/consul/data",
  "bind_addr": "${ip_address}",
  "client_addr": "127.0.0.1",
  "retry_join": ["${consul_address_1}", "${consul_address_2}", "${consul_address_3}"],
  "log_level": "DEBUG",
  "enable_syslog": true,
  "acl_enforce_version_8": false,
  "enable_local_script_checks": true,
  "service": {
    "name": "service",
    "tags": ${jsonencode([for tag in consul_tags : "${tag}"])},
    "address": "${ip_address}",
    "port": 8120,
    "check": {
      "args": [
        "curl",
        "http://${ip_address}:8120"
      ],
      "interval": "10s"
    }
  }
}
telegraf.conf
[[outputs.influxdb_v2]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  # urls = ["unix:///var/run/influxdb.sock"]
  # urls = ["udp://127.0.0.1:8089"]
  urls = ["http://influxdb.service.consul:8086"]
  token = "${influxdb_token}"
  organization = "telegraf"
  bucket = "telegraf"

[global_tags]
  role = "service"
  ip_address = "${ip_address}"
  host_name = "${host_name}"

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.netstat]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]
Stitching it all together
So we have all the pieces; how does it look in the service.tf file? It’s just nested templatefile calls with the parameters passed in a map!
resource "openstack_compute_instance_v2" "service" {
  name        = var.INSTANCE_NAME
  provider    = openstack.openstack             # Provider name
  image_name  = "Service ${var.VERSION_NUMBER}" # Image name
  flavor_name = "s1-2"                          # Instance type name
  key_pair    = "development"

  user_data = templatefile("cloud-config.tpl", {
    # ip_address is a placeholder token (the value here is illustrative);
    # initial-setup.sh rewrites it with the real address on first boot
    ip_address = "ip-address-placeholder"
    consul_json = templatefile("files/consul.json", {
      host_name        = var.INSTANCE_NAME
      ip_address       = "ip-address-placeholder"
      consul_address_1 = "consul-1-${var.ENVIRONMENT}.my-company.co.uk"
      consul_address_2 = "consul-2-${var.ENVIRONMENT}.my-company.co.uk"
      consul_address_3 = "consul-3-${var.ENVIRONMENT}.my-company.co.uk"
      consul_tags      = var.CONSUL_TAGS
    })
    telegraf_conf = templatefile("files/telegraf.conf", {
      host_name      = var.INSTANCE_NAME
      ip_address     = "ip-address-placeholder"
      influxdb_token = data.vault_generic_secret.influxdb_token.data["token"]
    })
  })

  network {
    name = "Ext-Net" # Adds the network component to reach your instance
  }
}
And that’s all of it. There’s even an InfluxDB 2 token, pulled from the Vault provider, set in the config above. I’ve made a pull request to integrate an InfluxDB 2 database secret provider with Vault, so watch this space. A blog post on using ephemeral accounts to write metrics to InfluxDB will be coming soon.
Please let me know in the comments below if anything is missing or needs clarifying. I’m really starting to love Terraform; all of the above came about because I wanted a clean, easy-to-follow way of deploying configuration as part of provisioning.
One response to “Environment specific Configuration with Terraform”
As an ex-IT professional, I have to say I don’t understand a word of it!