Terraform is a tool for building and managing infrastructure in an organized way. You can use it to manage DigitalOcean Droplets, Load Balancers, and even DNS entries, in addition to a large variety of services offered by other providers. Terraform uses a command-line interface and can run from your desktop or a remote server.
Terraform works by reading configuration files that describe the components that make up your application environment or datacenter. Based on the configuration, it generates an execution plan that describes what it will do to reach the desired state. You then use Terraform to execute this plan to build the infrastructure. When changes to the configuration occur, Terraform can generate and execute incremental plans to update the existing infrastructure to the newly described state.
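In practice, that workflow maps onto a handful of core commands, which you’ll use throughout this tutorial (shown here without the variable flags you’ll add later):
terraform init    # prepare the working directory and install provider plugins
terraform plan    # preview the changes Terraform would make
terraform apply   # execute the plan and build the infrastructure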
In this tutorial, you’ll install Terraform and use it to create an infrastructure on DigitalOcean that consists of two Nginx servers that are load-balanced by a DigitalOcean Load Balancer. Then, you’ll use Terraform to add a DNS entry on DigitalOcean that points to your Load Balancer. This will help you get started with using Terraform and give you an idea of how you can use it to manage and deploy a DigitalOcean-based infrastructure that meets your own needs.
Note: This tutorial has been tested with Terraform 1.7.2.
To complete this tutorial, you’ll need:
- A DigitalOcean account, with a Personal Access Token for the DigitalOcean API.
- An SSH key pair, with the public key added to your DigitalOcean account.
Terraform is a command-line tool that you run on your desktop or on a remote server. To install it, you’ll download it and place it on your PATH so you can execute it in any directory you’re working in.
First, download the appropriate package for your OS and architecture from the official Downloads page. If you’re on macOS or Linux, you can download Terraform with curl.
On macOS, use this command to download Terraform and place it in your home directory:
curl -o ~/terraform.zip https://releases.hashicorp.com/terraform/1.7.2/terraform_1.7.2_darwin_amd64.zip
On Linux, use this command:
curl -o ~/terraform.zip https://releases.hashicorp.com/terraform/1.7.2/terraform_1.7.2_linux_amd64.zip
Create the ~/opt/terraform directory:
mkdir -p ~/opt/terraform
Then, unzip Terraform to ~/opt/terraform using the unzip command. On Ubuntu, you can install unzip using apt:
sudo apt install unzip
Use it to extract the downloaded archive to the ~/opt/terraform directory by running:
unzip ~/terraform.zip -d ~/opt/terraform
Finally, add ~/opt/terraform to your PATH environment variable so you can execute the terraform command without specifying the full path to the executable.
On Linux, you’ll need to redefine PATH in .bashrc, which runs when a new shell opens. Open it for editing by running the following:
nano ~/.bashrc
Note: On macOS, add the path to the file .bash_profile if using Bash, or to .zshrc if using ZSH.
To append Terraform’s path to your PATH, add the following line at the end of the file:
export PATH=$PATH:~/opt/terraform
Save and close the file when you’re done.
Now all of your new shell sessions will be able to find the terraform command. To load the new PATH into your current session, run the following command if you’re using Bash on a Linux system:
. ~/.bashrc
If you’re using Bash on macOS, execute this command instead:
. ~/.bash_profile
If you’re using ZSH, run this command:
. ~/.zshrc
To verify that you have installed Terraform correctly, run the terraform command with no arguments:
terraform
You will see output that is similar to the following:
Output
Usage: terraform [global options] <subcommand> [args]
The available commands for execution are listed below.
The primary workflow commands are given first, followed by
less common or more advanced commands.
Main commands:
init Prepare your working directory for other commands
validate Check whether the configuration is valid
plan Show changes required by the current configuration
apply Create or update infrastructure
destroy Destroy previously-created infrastructure
All other commands:
console Try Terraform expressions at an interactive command prompt
fmt Reformat your configuration in the standard style
force-unlock Release a stuck lock on the current workspace
get Install or upgrade remote Terraform modules
graph Generate a Graphviz graph of the steps in an operation
import Associate existing infrastructure with a Terraform resource
login Obtain and save credentials for a remote host
logout Remove locally-stored credentials for a remote host
output Show output values from your root module
providers Show the providers required for this configuration
refresh Update the state to match remote systems
show Show the current state or a saved plan
state Advanced state management
taint Mark a resource instance as not fully functional
test Experimental support for module integration testing
untaint Remove the 'tainted' state from a resource instance
version Show the current Terraform version
workspace Workspace management
Global options (use these before the subcommand, if any):
-chdir=DIR Switch to a different working directory before executing the
given subcommand.
-help Show this help output, or the help for a specified subcommand.
-version An alias for the "version" subcommand.
These are the commands that Terraform accepts. The output gives you a brief description, and you’ll learn more about them throughout this tutorial.
Now that Terraform is installed, let’s configure it to work with DigitalOcean’s resources.
Terraform supports a variety of service providers through plugins called providers, which you can install. Each provider has its own specification, which generally maps to the API of its respective service provider.
The DigitalOcean provider lets Terraform interact with the DigitalOcean API to build out infrastructure. This provider supports creating various DigitalOcean resources, including the following:
Terraform will use your DigitalOcean Personal Access Token to communicate with the DigitalOcean API and manage resources in your account. Don’t share this token with others, and keep it out of scripts and version control. Export your DigitalOcean Personal Access Token to an environment variable called DO_PAT by running:
export DO_PAT="your_personal_access_token"
This will make using it in subsequent commands easier and keep it separate from your code.
Note: If you’ll be working with Terraform and DigitalOcean often, add this line to your shell configuration files using the same approach you used to modify your PATH environment variable in the previous step.
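For example, on Linux you could append the export line to .bashrc (assuming Bash; adjust the file for your OS and shell as described in the previous step):
echo 'export DO_PAT="your_personal_access_token"' >> ~/.bashrc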
Create a directory that will store your infrastructure configuration by running the following command:
mkdir ~/loadbalance
Navigate to the newly created directory:
cd ~/loadbalance
Terraform configurations are text files that end with the .tf file extension. They are human-readable, and they support comments. (Terraform also supports JSON-formatted configuration files, but they won’t be covered here.) Terraform reads all of the configuration files in your working directory in a declarative manner, so the order of resource and variable definitions does not matter. Your entire infrastructure can exist in a single configuration file, but it’s better to separate configuration files by resource type to maintain clarity.
The first step to building an infrastructure with Terraform is to define the provider you’re going to use.
To use the DigitalOcean provider with Terraform, you have to tell Terraform about it and configure the plugin with the proper credential variables. Create a file called provider.tf, which will store the configuration for the provider:
nano provider.tf
Add the following lines into the file to tell Terraform that you want to use the DigitalOcean provider and instruct Terraform where to find it:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
Then, define the following variables in the file so you can reference them in the rest of your configuration files:
- do_token: your DigitalOcean Personal Access Token.
- pvt_key: the location of your private key, so Terraform can use it to log in to new Droplets and install Nginx.
You will pass the values of these variables into Terraform when you run it, rather than hard-coding the values here. This makes the configuration more portable.
To define these variables, add these lines to the file:
...
variable "do_token" {}
variable "pvt_key" {}
Then, add these lines to configure the DigitalOcean provider and specify the credentials for your DigitalOcean account by assigning the do_token variable to the provider’s token argument:
...
provider "digitalocean" {
token = var.do_token
}
Finally, you’ll want to have Terraform automatically add your SSH key to any new Droplets you create. When you added your SSH key to DigitalOcean, you gave it a name. Terraform can use this name to retrieve the public key. Add these lines, replacing terraform with the name of the key you provided in your DigitalOcean account:
...
data "digitalocean_ssh_key" "terraform" {
name = "terraform"
}
Your completed provider.tf file will look like this:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
variable "do_token" {}
variable "pvt_key" {}
provider "digitalocean" {
token = var.do_token
}
data "digitalocean_ssh_key" "terraform" {
name = "terraform"
}
When you’re done, save and close the file.
Note: Setting the TF_LOG environment variable to 1 will enable detailed logging of what Terraform is trying to do. You can set it by running:
export TF_LOG=1
Initialize Terraform for your project by running:
terraform init
This will read your configuration and install the plugins for your provider. You’ll see that logged in the output:
Output
Initializing the backend...
Initializing provider plugins...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Installing digitalocean/digitalocean v2.34.1...
- Installed digitalocean/digitalocean v2.34.1 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
If you happen to get stuck and Terraform is not working as you expect, you can start over by deleting the terraform.tfstate file and manually destroying the resources that were created (e.g. through the control panel).
Terraform is now configured and can connect to your DigitalOcean account. In the next step, you’ll use Terraform to define a Droplet that will run an Nginx server.
You can use Terraform to create a DigitalOcean Droplet and install software on the Droplet once it spins up. In this step, you’ll provision a single Ubuntu 20.04 Droplet and install the Nginx web server using Terraform.
Create a new Terraform configuration file called www-1.tf, which will hold the configuration for the Droplet:
nano www-1.tf
Insert the following lines to define the Droplet resource:
resource "digitalocean_droplet" "www-1" {
image = "ubuntu-20-04-x64"
name = "www-1"
region = "nyc3"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.terraform.id
]
In the preceding configuration, the first line defines a digitalocean_droplet resource named www-1. The rest of the lines specify the Droplet’s attributes, including the data center it will reside in and the slug that identifies the size of the Droplet you want to configure. In this case, you’re using s-1vcpu-1gb, which will create a Droplet with one CPU and 1GB of RAM. (Visit this size slug chart to see the available slugs you can use.)
The ssh_keys section specifies a list of public keys you want to add to the Droplet. In this case, you’re specifying the key you defined in provider.tf. Ensure the name here matches the name you specified in provider.tf.
When you run Terraform against the DigitalOcean API, it will collect a variety of information about the Droplet, such as its public and private IP addresses. This information can be used by other resources in your configuration.
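For example, a hypothetical output block like the following (not part of this tutorial’s configuration) would expose the Droplet’s public IP by referencing its ipv4_address attribute:
output "www-1-ipv4" {
  # Reference an attribute of the digitalocean_droplet.www-1 resource
  value = digitalocean_droplet.www-1.ipv4_address
}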
If you are wondering which arguments are required or optional for a Droplet resource, please refer to the official Terraform documentation: DigitalOcean Droplet Specification.
To set up a connection that Terraform can use to connect to the server via SSH, add the following lines at the end of the file:
...
connection {
host = self.ipv4_address
user = "root"
type = "ssh"
private_key = file(var.pvt_key)
timeout = "2m"
}
These lines describe how Terraform should connect to the server, so Terraform can connect over SSH to install Nginx. Note the use of the private key variable var.pvt_key; you’ll pass its value in when you run Terraform.
Now that you have the connection set up, configure the remote-exec provisioner, which you’ll use to install Nginx. Add the following lines to the configuration to do just that:
...
provisioner "remote-exec" {
inline = [
"export PATH=$PATH:/usr/bin",
# install nginx
"sudo apt update",
"sudo apt install -y nginx"
]
}
}
Note that the strings in the inline array are the commands that the root user will run to install Nginx.
The completed file looks like this:
resource "digitalocean_droplet" "www-1" {
image = "ubuntu-20-04-x64"
name = "www-1"
region = "nyc3"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.terraform.id
]
connection {
host = self.ipv4_address
user = "root"
type = "ssh"
private_key = file(var.pvt_key)
timeout = "2m"
}
provisioner "remote-exec" {
inline = [
"export PATH=$PATH:/usr/bin",
# install nginx
"sudo apt update",
"sudo apt install -y nginx"
]
}
}
Save the file and exit the editor. You’ve defined the server and are ready to deploy it, which you’ll now do.
Your current Terraform configuration describes a single Nginx server. You’ll now deploy the Droplet exactly as it’s defined.
Run the terraform plan command to see the execution plan, or what Terraform will attempt to do to build the infrastructure you described. You will have to specify the values for your DigitalOcean Access Token and the path to your private key, as your configuration uses this information to access your Droplet to install Nginx. Run the following command to create a plan:
terraform plan \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
Warning: The terraform plan command supports an -out parameter to save the plan. However, the plan will store API keys, and Terraform does not encrypt this data. When using this option, you should explore encrypting this file if you plan to send it to others or leave it at rest for an extended period.
You’ll see output similar to this:
Output
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.www-1 will be created
+ resource "digitalocean_droplet" "www-1" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ graceful_shutdown = false
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "www-1"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "nyc3"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ ssh_keys = [
+ "...",
]
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
───────────────────────────────────────────────────────────────
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.
The + resource "digitalocean_droplet" "www-1" line means that Terraform will create a new Droplet resource called www-1, with the details that follow it. That’s exactly what should happen, so run the terraform apply command to execute the current plan:
terraform apply \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
You’ll get the same output as before, but this time, Terraform will ask you if you want to proceed:
Output...
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Enter yes and press ENTER. Terraform will provision your Droplet:
Output
digitalocean_droplet.www-1: Creating...
After a bit of time, you’ll see Terraform installing Nginx with the remote-exec provisioner, and then the process will complete:
Output
digitalocean_droplet.www-1: Provisioning with 'remote-exec'...
....
digitalocean_droplet.www-1: Creation complete after 1m54s [id=your_www-1_droplet_id]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
...
Terraform has created a new Droplet called www-1 and installed Nginx on it. If you visit the public IP address of your new Droplet, you’ll see the Nginx welcome screen. The public IP was displayed when the Droplet was created, but you can always view it by looking at Terraform’s current state. Terraform updates the state file terraform.tfstate every time it executes a plan or refreshes its state.
To view the current state of your environment, use the following command:
terraform show terraform.tfstate
This will show you the public IP address of your Droplet.
Output
resource "digitalocean_droplet" "www-1" {
backups = false
created_at = "..."
disk = 25
id = "your_www-1_droplet_id"
image = "ubuntu-20-04-x64"
ipv4_address = "your_www-1_server_ip"
ipv4_address_private = "10.128.0.2"
...
Navigate to http://your_www-1_server_ip in your browser to verify that your Nginx server is running.
Note: If your resources are modified outside of Terraform, your state file will be out of date, and you’ll need to refresh it to bring it up to date. This command will pull the updated resource information from your provider(s):
terraform refresh \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
In this step, you’ve deployed the Droplet that you’ve described in Terraform. You’ll now create a second one.
Now that you have described an Nginx server, you can add a second one quickly by copying the existing server’s configuration file and replacing the name and hostname of the Droplet resource.
You can do this manually, but it’s faster to use the sed command to read the www-1.tf file, substitute all instances of www-1 with www-2, and create a new file called www-2.tf. Here is the sed command to do that:
sed 's/www-1/www-2/g' www-1.tf > www-2.tf
You can learn more about sed by visiting Using sed.
Run terraform plan again to preview the changes that Terraform will make:
terraform plan \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
The output shows that Terraform will create the second server, www-2:
Output
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.www-2 will be created
+ resource "digitalocean_droplet" "www-2" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "www-2"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = true
+ region = "nyc3"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ ssh_keys = [
+ "...",
]
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
...
Run terraform apply again to create the second Droplet:
terraform apply \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
As before, Terraform will ask you to confirm you wish to proceed. Review the plan again and type yes to continue.
After some time, Terraform will create the new server and display the results:
Output
digitalocean_droplet.www-2: Creation complete after 1m47s [id=your_www-2_droplet_id]
...
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Terraform created the new server while not altering the existing one. You can repeat this step to add additional Nginx servers.
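For example, a hypothetical third server could be generated the same way:
sed 's/www-1/www-3/g' www-1.tf > www-3.tf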
Now that you have two Droplets running Nginx, you’ll define and deploy a load balancer to split traffic between them.
You’ll use a DigitalOcean Load Balancer, which the official Terraform provider supports, to route traffic between the two web servers.
Create a new Terraform configuration file called loadbalancer.tf:
nano loadbalancer.tf
Add the following lines to define the Load Balancer:
resource "digitalocean_loadbalancer" "www-lb" {
name = "www-lb"
region = "nyc3"
forwarding_rule {
entry_port = 80
entry_protocol = "http"
target_port = 80
target_protocol = "http"
}
healthcheck {
port = 22
protocol = "tcp"
}
droplet_ids = [digitalocean_droplet.www-1.id, digitalocean_droplet.www-2.id ]
}
The Load Balancer definition specifies its name, the data center it will be in, the ports it should listen on to balance traffic, the configuration for the health check, and the IDs of the Droplets it should balance, which you reference through their resource attributes.
Then, define a status check to verify that the Load Balancer is indeed available after deployment:
check "health_check" {
data "http" "lb_check" {
url = "http://${digitalocean_loadbalancer.www-lb.ip}"
}
assert {
condition = data.http.lb_check.status_code == 200
error_message = "${data.http.lb_check.url} returned an unhealthy status code"
}
}
This status check requests the IP address of the Load Balancer through HTTP and verifies that the return code is 200, which signifies that the Droplets are healthy and available. In case of an error or a different return code, a warning will be displayed after the deployment process.
When you’re done, save and close the file.
Run the terraform plan command again to review the new execution plan:
terraform plan \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
You’ll see several lines of output, including the following lines:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
<= read (data resources)
Terraform will perform the following actions:
# data.http.lb_check will be read during apply
# (config refers to values not yet known)
<= data "http" "lb_check" {
+ body = (known after apply)
+ id = (known after apply)
+ response_body = (known after apply)
+ response_body_base64 = (known after apply)
+ response_headers = (known after apply)
+ status_code = (known after apply)
+ url = (known after apply)
}
# digitalocean_loadbalancer.www-lb will be created
+ resource "digitalocean_loadbalancer" "www-lb" {
+ algorithm = "round_robin"
+ disable_lets_encrypt_dns_records = false
+ droplet_ids = [
+ ...,
+ ...,
]
+ enable_backend_keepalive = false
+ enable_proxy_protocol = false
+ http_idle_timeout_seconds = (known after apply)
+ id = (known after apply)
+ ip = (known after apply)
+ name = "www-lb"
+ project_id = (known after apply)
+ redirect_http_to_https = false
+ region = "nyc3"
+ size_unit = (known after apply)
+ status = (known after apply)
+ urn = (known after apply)
+ vpc_uuid = (known after apply)
+ forwarding_rule {
+ certificate_id = (known after apply)
+ certificate_name = (known after apply)
+ entry_port = 80
+ entry_protocol = "http"
+ target_port = 80
+ target_protocol = "http"
+ tls_passthrough = false
}
+ healthcheck {
+ check_interval_seconds = 10
+ healthy_threshold = 5
+ port = 22
+ protocol = "tcp"
+ response_timeout_seconds = 5
+ unhealthy_threshold = 3
}
}
Plan: 1 to add, 0 to change, 0 to destroy.
╷
│ Warning: Check block assertion known after apply
│
│ on loadbalancer.tf line 27, in check "health_check":
│ 27: condition = data.http.lb_check.status_code == 200
│ ├────────────────
│ │ data.http.lb_check.status_code is a number
│
│ The condition could not be evaluated at this time, a result will be known when this plan is applied.
╵
...
Since the www-1 and www-2 Droplets already exist, Terraform will create the www-lb Load Balancer and run a check after it’s provisioned.
Before deploying, you’ll need to reinitialize the project to add the http dependency used in the health_check:
terraform init -upgrade
Then, run terraform apply to build the Load Balancer:
terraform apply \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
Once again, Terraform will ask you to review the plan. Approve the plan by entering yes to continue.
Once you do, you’ll see output that contains the following lines, truncated for brevity:
Output...
digitalocean_loadbalancer.www-lb: Creating...
...
digitalocean_loadbalancer.www-lb: Creation complete after 1m18s [id=your_load_balancer_id]
data.http.lb_check: Reading...
data.http.lb_check: Read complete after 0s [id=http://lb-ip]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
...
Notice that the lb_check has completed successfully.
Use terraform show terraform.tfstate to locate the IP address of your Load Balancer:
terraform show terraform.tfstate
You’ll find the IP under the www-lb entry:
Output...
# digitalocean_loadbalancer.www-lb:
resource "digitalocean_loadbalancer" "www-lb" {
algorithm = "round_robin"
disable_lets_encrypt_dns_records = false
droplet_ids = [
your_www-1_droplet_id,
your_www-2_droplet_id,
]
enable_backend_keepalive = false
enable_proxy_protocol = false
id = "your_load_balancer_id"
ip = "your_load_balancer_ip"
name = "www-lb"
...
Navigate to http://your_load_balancer_ip in your browser and you’ll see an Nginx welcome screen, because the Load Balancer is sending traffic to one of the two Nginx servers.
You’ll now learn how to configure DNS for your DigitalOcean account using Terraform.
In addition to Droplets and Load Balancers, Terraform can also manage DNS domains and records. For example, if you want to point your domain to your Load Balancer, you can write a configuration describing that relationship.
Note: Use your own unique domain name, or Terraform will be unable to deploy the DNS resources. Be sure your domain is pointed to DigitalOcean’s nameservers.
Create a new file to describe your DNS:
nano domain_root.tf
Add the following domain resource, replacing your_domain with your domain name:
resource "digitalocean_domain" "default" {
name = "your_domain"
ip_address = digitalocean_loadbalancer.www-lb.ip
}
Save and close the file when you’re done.
You can also add a CNAME record that points www.your_domain to your_domain. Create a new file for the CNAME record:
nano domain_cname.tf
Add these lines to the file:
resource "digitalocean_record" "CNAME-www" {
domain = digitalocean_domain.default.name
type = "CNAME"
name = "www"
value = "@"
}
Save and close the file when you’re done.
To add the DNS entries, run terraform plan followed by terraform apply, as with the other resources.
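For example, passing in the same variables as before:
terraform plan \
  -var "do_token=${DO_PAT}" \
  -var "pvt_key=$HOME/.ssh/id_rsa"
terraform apply \
  -var "do_token=${DO_PAT}" \
  -var "pvt_key=$HOME/.ssh/id_rsa"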
Navigate to your domain name and you’ll see an Nginx welcome screen because the domain is pointing to the Load Balancer, which is sending traffic to one of the two Nginx servers.
Although not commonly done in production environments, Terraform can also destroy the infrastructure it created. This is mainly useful in development environments that are deployed and destroyed multiple times.
First, create an execution plan to destroy the infrastructure by using terraform plan -destroy:
terraform plan -destroy -out=terraform.tfplan \
-var "do_token=${DO_PAT}" \
-var "pvt_key=$HOME/.ssh/id_rsa"
Terraform will output a plan with resources marked in red, and prefixed with a minus sign, indicating that it will delete the resources in your infrastructure.
Then, use terraform apply to run the plan:
terraform apply terraform.tfplan
Terraform will proceed to destroy the resources, as indicated in the generated plan.
In this tutorial, you used Terraform to build a load-balanced web infrastructure on DigitalOcean, with two Nginx web servers running behind a DigitalOcean Load Balancer. You know how to create and destroy resources, view the current state, and use Terraform to configure DNS entries.
Now that you understand how Terraform works, you can create configuration files that describe a server infrastructure for your projects. The example in this tutorial is a good starting point that demonstrates how you can automate the deployment of servers. If you already use provisioning tools, you can integrate them with Terraform to configure servers as part of their creation process instead of using the provisioning method used in this tutorial.
Terraform has many more features and can work with other providers. Check out the official Terraform Documentation to learn more about how you can use Terraform to improve your infrastructure.
If you would like to learn more about Terraform, check out our How To Manage Infrastructure with Terraform series.
Terraform modules allow you to group distinct resources of your infrastructure into a single, unified resource. You can reuse them later with possible customizations, without repeating the resource definitions each time you need them, which is beneficial to large and complex projects. You can customize module instances using input variables you define, as well as extract information from them using outputs. Aside from creating your own custom modules, you can also use the pre-made modules published publicly at the Terraform Registry. Developers can use and customize them using inputs, like the modules you create, but their source code is stored in and pulled from the cloud.
In this tutorial, you’ll create a Terraform module that will set up multiple Droplets behind a Load Balancer for redundancy. You’ll also use the for_each and count looping features of the HashiCorp Configuration Language (HCL) to deploy multiple customized instances of the module at the same time.
To complete this tutorial, you’ll need the project from the prerequisite tutorial, with the project directory named terraform-modules, instead of loadbalance. During Step 2, do not include the pvt_key variable and the SSH key resource.
Note: This tutorial has specifically been tested with Terraform 1.7.2.
In this section, you’ll learn what benefits modules bring, where they are usually placed in the project, and how they should be structured.
Custom Terraform modules are created to encapsulate connected components that are used and deployed together frequently in bigger projects. They are self-contained, bundling only the resources, variables, and providers they need.
Modules are typically stored in a central folder at the root of the project, each in its respective subfolder underneath. In order to retain a clean separation between modules, always architect them to have a single purpose, and make sure they never contain submodules. Packaging a single resource as a module can be superfluous and gradually erodes the simplicity of the overall architecture. For small development and test projects, incorporating modules is not necessary, because they do not bring much improvement in those cases. Modules also offer the benefit that definitions only need modification in one place, which will then be propagated through the rest of the infrastructure.
Next, you’ll define, use, and customize modules in your Terraform projects.
In this section, you’ll define multiple Droplets and a Load Balancer as Terraform resources and package them into a module. You’ll also make the resulting module customizable using module inputs.
You’ll store the module in a directory named droplet-lb, under a directory called modules. Assuming you are in the terraform-modules directory you created as part of the prerequisites, create both at once by running:
mkdir -p modules/droplet-lb
The -p argument instructs mkdir to create all directories in the supplied path.
Navigate to it:
cd modules/droplet-lb
As noted in the previous section, modules contain the resources and variables they use. Starting from Terraform 0.13, they must also include definitions of the providers they use. Modules do not require any special configuration to mark that the code represents a module, as Terraform regards every directory containing HCL code as a module, even the root directory of the project.
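By the end of this tutorial, the project layout will look similar to the following sketch (the files under droplet-lb are the ones you’re about to create; the root-level files carry over from the prerequisites or come later in this tutorial):
terraform-modules/
├── main.tf
├── provider.tf
└── modules/
    └── droplet-lb/
        ├── droplets.tf
        ├── lb.tf
        ├── outputs.tf
        ├── provider.tf
        └── variables.tf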
Variables defined in a module are exposed as its inputs and can be used in resource definitions to customize them. The module you’ll create will have two inputs: the number of Droplets to create and the name of their group. Create and open for editing a file called variables.tf, where you’ll store the variables:
nano variables.tf
Add the following lines:
variable "droplet_count" {}
variable "group_name" {}
Save and close the file.
You’ll store the Droplet definition in a file named droplets.tf. Create and open it for editing:
nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "droplets" {
count = var.droplet_count
image = "ubuntu-22-04-x64"
name = "${var.group_name}-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
lifecycle {
precondition {
condition = var.droplet_count >= 2
error_message = "At least two droplets must be created."
}
}
}
For the count parameter, which specifies how many instances of a resource to create, you pass in the droplet_count variable. Its value will be specified when the module is called from the main project code. The name of each of the deployed Droplets will be different, which you achieve by appending the index of the current Droplet to the supplied group name. Deployment of the Droplets will be in the fra1 region, and they will run Ubuntu 22.04.
The lifecycle section contains a precondition, which runs before the resources are actually deployed. Here, it verifies that at least two Droplets will be created, since having only one defeats the purpose of the Load Balancer. Another example of validations can be found in the k8s-bootstrapper repository, which contains templates for setting up a DigitalOcean Kubernetes cluster using Terraform. There, validations are used to ensure that the number of nodes in the cluster is within limits.
When you are done, save and close the file.
With the Droplets now defined, you can move on to creating the Load Balancer. You’ll store its resource definition in a file named lb.tf. Create and open it for editing by running:
nano lb.tf
Add its resource definition:
resource "digitalocean_loadbalancer" "www-lb" {
name = "lb-${var.group_name}"
region = "fra1"
forwarding_rule {
entry_port = 80
entry_protocol = "http"
target_port = 80
target_protocol = "http"
}
healthcheck {
port = 22
protocol = "tcp"
}
droplet_ids = [
for droplet in digitalocean_droplet.droplets:
droplet.id
]
}
You define the Load Balancer with the group name in its name in order to make it distinguishable. You deploy it in the fra1 region together with the Droplets. The next two sections specify the target and monitoring ports and protocols.
The droplet_ids block takes in the IDs of the Droplets that should be managed by the Load Balancer. Since there are multiple Droplets, and their count is not known in advance, you use a for loop to traverse the collection of Droplets (digitalocean_droplet.droplets) and take their IDs. You surround the for loop with brackets ([]) so that the resulting collection will be a list.
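As an aside, Terraform’s splat syntax would express the same list more compactly; the following single line is an equivalent alternative, not used in this module:
droplet_ids = digitalocean_droplet.droplets[*].id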
Save and close the file.
You’ve now defined the Droplets, the Load Balancer, and the variables for your module. Next, you’ll need to define the provider requirements, specifying which providers the module uses, including their version and where they are located. Since Terraform 0.13, modules must explicitly define the sources of non-HashiCorp-maintained providers they use, because they do not inherit them from the parent project.
. Create it for editing by running the following:
nano provider.tf
Add the following lines to require the digitalocean provider:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
Save and close the file when you’re done. The droplet-lb module now requires the digitalocean provider.
Modules also support outputs, which you can use to extract internal information about the state of their resources. You’ll define an output that exposes the IP address of the Load Balancer and store it in a file named outputs.tf. Create it for editing:
nano outputs.tf
Add the following definition:
output "lb_ip" {
value = digitalocean_loadbalancer.www-lb.ip
}
This output retrieves the IP address of the Load Balancer. Save and close the file.
The droplet-lb module is now functionally complete and ready for deployment. You’ll call it from the main code, which you’ll store in the root of the project. First, navigate back to the project root by moving up two directories:
cd ../..
Then, create and open for editing a file called main.tf, in which you’ll use the module:
nano main.tf
Add the following lines:
module "groups" {
source = "./modules/droplet-lb"
droplet_count = 3
group_name = "group1"
}
output "loadbalancer-ip" {
value = module.groups.lb_ip
}
In this declaration, you invoke the droplet-lb module located in the directory specified as source. You configure the two inputs it accepts, droplet_count and group_name; the latter is set to group1 so you’ll later be able to discern between instances.
Since the Load Balancer IP output is defined in a module, it won’t automatically be shown when you apply the project. The solution to this is to create another output retrieving its value (loadbalancer-ip).
Save and close the file when you’re done.
Initialize the module by running:
terraform init
The output will look like this:
Output
Initializing modules...
- groups in modules/droplet-lb
Initializing the backend...
Initializing provider plugins...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Installing digitalocean/digitalocean v2.34.1...
- Installed digitalocean/digitalocean v2.34.1 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
You can try planning the project to see what actions Terraform would take by running:
terraform plan -var "do_token=${DO_PAT}"
The output will be similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-0"
...
}
# module.groups.digitalocean_droplet.droplets[1] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-1"
...
}
# module.groups.digitalocean_droplet.droplets[2] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-2"
...
}
# module.groups.digitalocean_loadbalancer.www-lb will be created
+ resource "digitalocean_loadbalancer" "www-lb" {
...
+ name = "lb-group1"
...
}
Plan: 4 to add, 0 to change, 0 to destroy.
...
This output details that Terraform would create three Droplets, named group1-0, group1-1, and group1-2, and would also create a Load Balancer called lb-group1, which will manage the traffic to and from the three Droplets.
You can try applying the project to the cloud by running:
terraform apply -var "do_token=${DO_PAT}"
Enter yes when prompted. The output will show all the actions, and the IP address of the Load Balancer will also be shown:
Output
module.groups.digitalocean_droplet.droplets[1]: Creating...
module.groups.digitalocean_droplet.droplets[0]: Creating...
module.groups.digitalocean_droplet.droplets[2]: Creating...
...
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Outputs:
loadbalancer-ip = ip_address
You’ve created a module containing a customizable number of Droplets and a Load Balancer that will automatically be configured to manage their ingoing and outgoing traffic.
In the previous section, you deployed the module you defined and called it groups. If you ever wish to change its name, simply renaming the module call will not yield the expected results: renaming the call will prompt Terraform to destroy and recreate the resources, causing excessive downtime.
For example, open main.tf
for editing by running:
nano main.tf
Rename the groups module to groups_renamed:
module "groups_renamed" {
source = "./modules/droplet-lb"
droplet_count = 3
group_name = "group1"
}
output "loadbalancer-ip" {
value = module.groups_renamed.lb_ip
}
Save and close the file. Then, initialize the project again:
terraform init
You can now plan the project:
terraform plan -var "do_token=${DO_PAT}"
The output will be long but will look similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
- destroy
Terraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] will be destroyed
...
# module.groups_renamed.digitalocean_droplet.droplets[0] will be created
...
Terraform will prompt you to destroy the existing instances and create new ones. This is destructive and unnecessary and may lead to unwanted downtime.
Instead, using the moved block, you can instruct Terraform to move the old resources under the new name. Open main.tf for editing and add the following lines to the end of the file:
moved {
from = module.groups
to = module.groups_renamed
}
When you’re done, save and close the file.
You can now plan the project:
terraform plan -var "do_token=${DO_PAT}"
When you plan with the moved block present in main.tf, Terraform wants to move the resources instead of recreating them:
Output
Terraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] has moved to module.groups_renamed.digitalocean_droplet.droplets[0]
...
# module.groups.digitalocean_droplet.droplets[1] has moved to module.groups_renamed.digitalocean_droplet.droplets[1]
...
Moving resources changes their place in the Terraform state, meaning that the actual cloud resources won’t be modified, destroyed, or recreated.
Because you’ll modify the configuration significantly in the next step, destroy the deployed resources by running:
terraform destroy -var "do_token=${DO_PAT}"
Enter yes when prompted. The output will end in:
Output...
Destroy complete! Resources: 4 destroyed.
In this section, you renamed resources in your Terraform project without destroying them in the process. You’ll now deploy multiple instances of a module from the same code using for_each and count.
In this section, you’ll use count and for_each to deploy the droplet-lb module multiple times with customizations.
count
One way to deploy multiple instances of the same module at once is to pass in how many to the count parameter, which is automatically available to every module. Open main.tf for editing:
nano main.tf
Modify it to look like this, removing the existing output definition and moved block:
module "groups" {
source = "./modules/droplet-lb"
count = 3
droplet_count = 3
group_name = "group1-${count.index}"
}
By setting count to 3, you instruct Terraform to deploy the module three times, each with a different group name. When you’re done, save and close the file.
Plan the deployment by running:
terraform plan -var "do_token=${DO_PAT}"
The output will be long, and will look like this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups[0].digitalocean_droplet.droplets[0] will be created
...
# module.groups[0].digitalocean_droplet.droplets[1] will be created
...
# module.groups[0].digitalocean_droplet.droplets[2] will be created
...
# module.groups[0].digitalocean_loadbalancer.www-lb will be created
...
# module.groups[1].digitalocean_droplet.droplets[0] will be created
...
# module.groups[1].digitalocean_droplet.droplets[1] will be created
...
# module.groups[1].digitalocean_droplet.droplets[2] will be created
...
# module.groups[1].digitalocean_loadbalancer.www-lb will be created
...
# module.groups[2].digitalocean_droplet.droplets[0] will be created
...
# module.groups[2].digitalocean_droplet.droplets[1] will be created
...
# module.groups[2].digitalocean_droplet.droplets[2] will be created
...
# module.groups[2].digitalocean_loadbalancer.www-lb will be created
...
Plan: 12 to add, 0 to change, 0 to destroy.
...
The output details that each of the three module instances would have three Droplets and a Load Balancer associated with it.
for_each
You can use for_each for modules when you require more complex instance customization, or when the number of instances depends on third-party data (often presented as maps) that is not known while writing the code.
You’ll now define a map that pairs group names to Droplet counts and deploy instances of droplet-lb according to it. Open main.tf for editing by running:
nano main.tf
Modify the file to make it look like this:
variable "group_counts" {
type = map
default = {
"group1" = 1
"group2" = 3
}
}
module "groups" {
source = "./modules/droplet-lb"
for_each = var.group_counts
droplet_count = each.value
group_name = each.key
}
You first define a map called group_counts that contains how many Droplets a given group should have. Then, you invoke the module droplet-lb, but specify that the for_each loop should operate on var.group_counts, the map you’ve just defined. droplet_count takes each.value, the value of the current pair, which is the count of Droplets for the current group. group_name receives the name of the group.
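Optionally, before saving, you could also surface the Load Balancer IP of every group at once, since module.groups is now a map of module instances keyed by group name (a hypothetical addition):
output "loadbalancer-ips" {
  # Build a map of group name to that group's Load Balancer IP
  value = { for name, group in module.groups : name => group.lb_ip }
}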
Save and close the file when you’re done.
Try planning the configuration by running:
terraform plan -var "do_token=${DO_PAT}"
The output will detail the actions Terraform would take to create the two groups with their Droplets and Load Balancers:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups["group1"].digitalocean_droplet.droplets[0] will be created
...
# module.groups["group1"].digitalocean_loadbalancer.www-lb will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[0] will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[1] will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[2] will be created
...
# module.groups["group2"].digitalocean_loadbalancer.www-lb will be created
...
In this step, you’ve used count and for_each to deploy multiple customized instances of the same module from the same code.
In this tutorial, you created and deployed Terraform modules. You used modules to group logically linked resources together and customized them in order to deploy multiple different instances from a central code definition. You also used outputs to show attributes of resources contained in the module.
If you would like to learn more about Terraform, check out our How To Manage Infrastructure with Terraform series.
Snyk is a developer security platform designed with flexibility in mind. Its main goal is to help you detect and fix vulnerabilities in your application source code, third-party dependencies, container images, and infrastructure configuration files (e.g. Kubernetes, Terraform, etc).
Snyk is divided into four components:
Snyk can be run in different ways:
Is Snyk free?
Yes, the tooling is free, except the Snyk API and some advanced features of the web UI (such as advanced reporting). There is also a limit on the number of tests you can perform per month.
See pricing plans for more information.
Is Snyk open source?
Yes, the tooling, including the Snyk CLI, certainly is. You can visit the Snyk GitHub home page to find more details about each component’s implementation. The cloud portal and paid features, such as the REST API implementation, are not open source.
Another important pair of concepts that Snyk uses is Targets and Projects.
Targets represent an external resource Snyk has scanned through an integration, the CLI, the UI, or the API. Example targets are an SCM repository, a Kubernetes workload, etc.
Projects, on the other hand, define the items Snyk scans for a given Target. A project includes:
You can read more about Snyk core concepts here.
In this guide you will use Snyk CLI to perform risk analysis for your Kubernetes applications supply chain (container images, Kubernetes YAML manifests). Then, you will learn how to take the appropriate action to remediate the situation. Finally, you will learn how to integrate Snyk in a CI/CD pipeline to scan for vulnerabilities in the early stages of development.
To complete all steps from this guide, you will need:
- A DOKS cluster running Kubernetes version >=1.21 that you have access to. For additional instructions on configuring a DigitalOcean Kubernetes cluster, see: How to Set Up a DigitalOcean Managed Kubernetes Cluster (DOKS).
- kubectl and doctl, for Kubernetes interaction. Follow these instructions to connect to your cluster with kubectl and doctl.
You can manually scan for vulnerabilities via the snyk command line interface. The snyk CLI is designed to be used in various scripts and automations. A practical example is in a CI/CD pipeline implemented using various tools such as Tekton, Jenkins, GitHub Workflows, etc.
When the snyk CLI is invoked, it will immediately start the scanning process and report back issues in a specific format. By default, it will print a summary table to standard output. Snyk can generate reports in other formats as well, such as JSON, HTML, and SARIF.
You can opt to push the results to the Snyk Cloud Portal (or web UI) via the --report flag to store and visualize scan results later.
Note: It’s not mandatory to submit scan results to the Snyk cloud portal. The big advantage of using the Snyk portal is visibility because it gives you access to a nice dashboard where you can check all scan reports and see how much the Kubernetes supply chain is impacted. It also helps you on the long term with investigations and remediation hints.
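For example, assuming an authenticated CLI, an IaC scan could push its results to the portal like this (a sketch based on the --report flag described above, using the kubernetes.yaml file from the kustomize example later in this guide):
snyk iac test kubernetes.yaml --report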
Snyk CLI is divided into several subcommands. Each subcommand is dedicated to a specific feature, such as:
Before moving on, please make sure to create a free account using the Snyk web UI. The snyk CLI also needs to be authenticated with your cloud account in order for some commands/subcommands to work (e.g. snyk code test).
A few examples to try with Snyk CLI:
# Scans your project code from current directory
snyk test
# Scan a specific path from your project directory (make sure to replace the `<>` placeholders accordingly)
snyk test <path/to/dir>
# Scan your project code from current directory
snyk code test
# Scan a specific path from your project directory (make sure to replace the `<>` placeholders accordingly)
snyk code test <path/to/dir>
# Scans the debian docker image by pulling it first
snyk container debian
# Give more context to the scanner by providing a Dockerfile (make sure to replace the `<>` placeholders accordingly)
snyk container debian --file=<path/to/dockerfile>
# Scan your project code from current directory
snyk iac test
# Scan a specific path from your project directory (make sure to replace the `<>` placeholders accordingly)
snyk iac test <path/to/dir>
# Scan Kustomize based projects (first you need to render the final template, then pass it to the scanner)
kustomize build > kubernetes.yaml
snyk iac test kubernetes.yaml
Snyk CLI provides help pages for all available options. The below command can be used to print the main help page:
snyk --help
The output looks similar to:
Output
CLI commands help
Snyk CLI scans and monitors your projects for security vulnerabilities and license issues.
For more information visit the Snyk website https://snyk.io
For details see the CLI documentation https://docs.snyk.io/features/snyk-cli
How to get started
1. Authenticate by running snyk auth
2. Test your local project with snyk test
3. Get alerted for new vulnerabilities with snyk monitor
Available commands
To learn more about each Snyk CLI command, use the --help option, for example, snyk auth --help or
snyk container --help
snyk auth
Authenticate Snyk CLI with a Snyk account.
snyk test
Test a project for open source vulnerabilities and license issues.
Each snyk CLI command (or subcommand) has an associated help page as well, which can be accessed via snyk [command] --help.
Please visit the official snyk CLI documentation page for more examples.
After you sign up for a Snyk account, authenticate, and log in, the web UI opens to the Dashboard, with a wizard to guide you through the setup steps:
The following features are available via the web UI:
Please visit the official documentation page to learn more about the Snyk web UI.
On each scan, snyk verifies your resources for potential security risks and assesses how each one impacts your system. A severity level is applied to a vulnerability to indicate the risk for that vulnerability in an application.
Severity levels can take one of the below values:
The Common Vulnerability Scoring System (CVSS) determines the severity level of a vulnerability. Snyk uses CVSS framework version 3.1 to communicate the characteristics and severity of vulnerabilities.
The below table shows each severity level mapping:
Severity level | CVSS score |
---|---|
Low | 0.0 - 3.9 |
Medium | 4.0 - 6.9 |
High | 7.0 - 8.9 |
Critical | 9.0 - 10.0 |
In this guide, the medium severity threshold is used as the default value in the example CI/CD pipeline. Usually, you will want to assess high and critical issues first, but in some cases the medium level needs attention as well. In terms of security, as a general rule of thumb, you will usually want to be very strict.
Please visit the official documentation page to learn more about severity levels.
Another useful feature provided by the Snyk web UI is remediation assistance for security issues. This means you receive a recommendation about how to fix each security issue found by the snyk scanner. This is very important because it simplifies the process and closes the loop for each iteration that you need to perform to fix each reported security issue.
The below picture illustrates this process better:
For each reported issue, there is a button that you can click on and get remediation assistance:
The main procedure is the same for each reported issue. It means, you click on the show details button, then take the suggested steps to apply the fix.
How do you benefit from embedding a security compliance scanning tool in your CI/CD pipeline and avoid unpleasant situations in a production environment?
It all starts at the foundation level, where software development begins. In general, you will want to use a dedicated environment for each stage. So, in the early stages of development, when application code changes very often, you should use a dedicated development environment (usually called the lower environment). Then, the application gets more and more refined in the QA environment, where QA teams perform manual and/or automated testing. Next, if the application gets the QA team’s approval, it is promoted to the upper environments, such as staging, and finally into production. In this process, where the application is promoted from one environment to another, a dedicated pipeline runs, which continuously scans application artifacts and checks the severity level. If the severity level doesn’t meet a specific threshold, the pipeline fails immediately, and the promotion of application artifacts to production is stopped in the early stages.
So, the security scanning tool (e.g., snyk) acts as a gatekeeper, stopping unwanted artifacts from getting into your production environment from the early stages of development. In the same manner, upper-environment pipelines use snyk to allow or forbid application artifacts from entering the final production stage.
In this step, you will learn how to create and test a sample CI/CD pipeline with integrated vulnerability scanning via GitHub workflows. To learn the fundamentals of using GitHub Actions with DigitalOcean Kubernetes, refer to this tutorial.
The pipeline provided in the following section builds and deploys the game-2048-example application from the DigitalOcean kubernetes-sample-apps repository.
At a high level, the game-2048 CI/CD workflow provided in the kubernetes-sample-apps repo consists of the following stages:
The below diagram illustrates each job from the pipeline and the associated steps with actions (only relevant configuration is shown):
Notes:
- The build step relies on how kustomize works: it gathers all configuration fragments from each overlay and applies them over a base to build the final compound.
- The scan covers only local kustomize configurations. This way, it's easier to identify which resource needs to be fixed in your repository. Remote resources used by kustomize need to be fixed upstream. Also, Kubernetes Secrets and ConfigMaps generated via kustomize are not captured.
How do you fail the pipeline if a certain security compliance level is not met?
Snyk CLI provides a flag named --severity-threshold for this purpose. This flag correlates with the overall severity level computed after each scan. In the case of Snyk, the severity level takes one of the following values: low, medium, high, or critical. You can fail or pass the pipeline based on the severity level value and stop application deployment if conditions are not met.
The below picture illustrates the flow for the example CI/CD pipeline used in this guide:
Please follow the steps below to create and test the snyk CI/CD GitHub workflow provided in the kubernetes-sample-apps GitHub repository. The workflow expects the following repository secrets:
- DIGITALOCEAN_ACCESS_TOKEN - holds your DigitalOcean account token.
- DOCKER_REGISTRY - holds your DigitalOcean Docker registry name, including the endpoint (e.g., registry.digitalocean.com/sample-apps).
- DOKS_CLUSTER - holds your DOKS cluster name. You can run the following command to get your DOKS cluster name: doctl k8s cluster list --no-header --format Name.
- SNYK_TOKEN - holds your Snyk user account ID - run snyk config get api to get the ID. If that doesn't work, you can retrieve the token from your user account settings page.
- SLACK_WEBHOOK_URL - holds your Slack incoming webhook URL used for snyk scan notifications.
A new entry should appear in the below list after clicking the green Run workflow button. Select the running workflow to observe the pipeline progress:
The pipeline will fail and stop when the snyk-container-security-check job runs. This is expected, because the scan finds issues at or above the default severity threshold used in the workflow input, which is medium. You should also receive a Slack notification with details about the workflow run:
In the next steps, you will learn how to investigate the snyk scan report to fix the issues, lower the severity level, and pass the pipeline.
Whenever the severity level threshold is not met, the game-2048 GitHub workflow will fail, and a Slack notification is sent with additional details. You also get security reports published to GitHub and accessible in the Security tab of your project repository.
The game-2048 workflow runs two security checks:
- A container image and Dockerfile check, via snyk container test <GAME-2048-IMAGE>:<TAG> --file=/path/to/game-2048/Dockerfile.
- A Kubernetes manifests check, via snyk iac test /path/to/project/kubernetes/manifests.
Thus, lowering the severity level and passing the workflow consists of fixing the issues reported by each of these two checks.
Next, you will learn how to address each in turn.
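If you want to reproduce both checks locally before pushing changes, a session might look like the following sketch (it assumes you run it from the game-2048-example directory and have already authenticated the Snyk CLI; the image tag is illustrative):

# build the container image locally
docker build -t game-2048:local .
# reproduce the container security check
snyk container test game-2048:local --file=Dockerfile --severity-threshold=medium
# reproduce the IaC check against the kustomize manifests
snyk iac test kustomize/ --severity-threshold=medium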
The sample pipeline used in this guide runs security checks for the game-2048 container image and the associated Dockerfile via the snyk-container-security-check job.
The snyk-container-security-check job runs the following steps: it builds the application container image, scans it for vulnerabilities with the severity threshold set to the snyk_fail_threshold input parameter if the workflow is manually triggered (or to the SNYK_FAIL_THRESHOLD environment variable if the workflow runs automatically), and finally uploads the scan report to GitHub. The below snippet shows the main logic of the snyk-container-security-check job:
- name: Build App Image for Snyk container scanning
uses: docker/build-push-action@v3
with:
context: ${{ env.PROJECT_DIR }}
push: false
tags: "${{ secrets.DOCKER_REGISTRY }}/${{ env.PROJECT_NAME }}:${{ github.sha }}"
- name: Check application container vulnerabilities
run: |
snyk container test "${{ secrets.DOCKER_REGISTRY }}/${{ env.PROJECT_NAME }}:${{ github.sha }}" \
--file=Dockerfile \
--severity-threshold=${{ github.event.inputs.snyk_fail_threshold || env.SNYK_FAIL_THRESHOLD }} \
--target-name=${{ env.PROJECT_NAME }} \
--target-reference=${{ env.ENVIRONMENT }} \
--sarif --sarif-file-output=snyk-container-scan.sarif
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
working-directory: ${{ env.PROJECT_DIR }}
- name: Upload Snyk report SARIF file
if: ${{ always() }}
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: ${{ env.PROJECT_DIR }}/snyk-container-scan.sarif
category: snyk-container-scan
To fix the reported issues, first check the Security tab of your kubernetes-sample-apps repository fork:
You will see a list of vulnerabilities, in this case for the base Docker image. Click on each entry to expand it and see more details:
To finish investigations and see recommendations offered by Snyk, you need to inspect the snyk-container-security-check job output from the main workflow:
Note: snyk container test can export results in SARIF format, but it cannot upload reports to the Snyk cloud portal. Conversely, snyk container monitor can upload results to the Snyk cloud portal, but it cannot export SARIF. For this reason, this guide uses snyk container test with the SARIF export feature. Unfortunately, some recommendations are not available in the SARIF output, so you must also check the job console output for recommendations.
The snyk-container-security-check job output shows that Snyk recommends updating the base image from node:16-slim to node:18.6.0-slim. This change eliminates the high-risk issue(s) and also lowers the number of other reported vulnerabilities from 70 to 44 - a substantial reduction of almost 40%.
Now, open the game-2048 application Dockerfile from your fork, and change the FROM directives to point to the new version (node:18.6.0-slim at this time of writing):
FROM node:18.6.0-slim AS builder
WORKDIR /usr/src/app
COPY . .
RUN npm install --include=dev
#
# Build mode can be set via NODE_ENV environment variable (development or production)
# See project package.json and webpack.config.js
#
ENV NODE_ENV=development
RUN npm run build
FROM node:18.6.0-slim
RUN npm install http-server -g
RUN mkdir /public
WORKDIR /public
COPY --from=builder /usr/src/app/dist/ ./
EXPOSE 8080
USER 1000
CMD ["http-server"]
Finally, commit the changes to your GitHub repository and trigger the workflow again (leaving the default input values in place). This time the snyk-container-security-check job should pass:
If you navigate to the Security tab of your project, there should be no issues reported.
How do you make sure to reduce base image vulnerabilities in the future?
The best approach is to use a base image with a minimal footprint - the fewer binaries and dependencies in the base image, the better. Another good practice is to continuously monitor your projects, as explained in the Monitor your Projects on a Regular Basis section of this guide.
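For example, switching the runtime stage to an Alpine-based image usually ships far fewer OS packages, and therefore fewer potential CVEs. The following is only a sketch of this idea (the image tag is illustrative, and you should re-run snyk container test after any base image change, since results differ per image):

FROM node:18.6.0-slim AS builder
WORKDIR /usr/src/app
COPY . .
RUN npm install --include=dev && npm run build

# Alpine-based runtime stage: a much smaller package set than Debian slim
FROM node:18.6.0-alpine
RUN npm install http-server -g
WORKDIR /public
COPY --from=builder /usr/src/app/dist/ ./
EXPOSE 8080
USER 1000
CMD ["http-server"]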
You will notice that the pipeline still fails, but this time at the snyk-iac-security-check phase. This is expected because there are security issues with the Kubernetes manifests used to deploy the application. In the next section, you will learn how to investigate this situation and apply Snyk security recommendations to fix the reported issues.
The pipeline is still failing, stopped at the snyk-iac-security-check job. This is expected, because the Kubernetes manifests contain issues at or above the default severity threshold used in the workflow input, which is medium.
The snyk-iac-security-check job checks the Kubernetes manifests for vulnerabilities (or misconfigurations). It scans the manifests with the severity threshold set to the snyk_fail_threshold input parameter if the workflow is manually triggered, or to the SNYK_FAIL_THRESHOLD environment variable if the workflow runs automatically. Finally, the --report argument is used to send scan results to the Snyk cloud portal. The below snippet shows the actual implementation of each step from the snyk-iac-security-check job:
- name: Check for Kubernetes manifests vulnerabilities
run: |
snyk iac test \
--severity-threshold=${{ github.event.inputs.snyk_fail_threshold || env.SNYK_FAIL_THRESHOLD }} \
--target-name=${{ env.PROJECT_NAME }} \
--target-reference=${{ env.ENVIRONMENT }} \
--sarif --sarif-file-output=snyk-iac-scan.sarif \
--report
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
working-directory: ${{ env.PROJECT_DIR }}
- name: Upload Snyk IAC SARIF file
if: ${{ always() }}
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: ${{ env.PROJECT_DIR }}/snyk-iac-scan.sarif
category: snyk-iac-scan
To fix the reported issues, you have two options: inspect the scan results in the Security tab of your GitHub repository, or use the Snyk cloud portal.
Either way, you will get recommendations about how to fix the reported issues.
For this guide, you will be using the Snyk cloud portal to investigate the reported security issues. First, click on the game-2048-example entry from the projects list, then select the kustomize/resources/deployment.yaml file:
Next, tick the Medium checkbox in the Severity submenu from the left to display only medium level issues:
Then, you can inspect each reported issue card and check the details. Go ahead and click the Show more details button on the Container is running without root user control card - you will receive more details about the issue, along with important hints about how to fix it:
After collecting all the information from each card, you can go ahead and edit the deployment.yaml file from your repo (located in the game-2048-example/kustomize/resources subfolder). The fixes are already in place; you just need to uncomment the last lines of the file. The final deployment.yaml file should look like below:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: game-2048
spec:
replicas: 1
selector:
matchLabels:
app: game-2048
strategy:
type: RollingUpdate
template:
metadata:
labels:
app: game-2048
spec:
containers:
- name: backend
# Replace with your own Docker registry info, if different
image: registry.digitalocean.com/sample-apps/2048-game:latest
ports:
- name: http
containerPort: 8080
resources:
requests:
cpu: 100m
memory: 50Mi
limits:
cpu: 200m
memory: 100Mi
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
allowPrivilegeEscalation: false
capabilities:
drop:
- all
What changed? The following security fixes were applied:
- readOnlyRootFilesystem - runs the container image in read-only mode (files cannot be altered via kubectl exec in the container).
- runAsNonRoot - runs as the non-root user defined by the USER directive in the game-2048 project Dockerfile.
- allowPrivilegeEscalation - setting allowPrivilegeEscalation to false ensures that no child process of a container can gain more privileges than its parent.
- capabilities.drop - to make containers more secure, you should give a container the least amount of privileges it needs to run. In practice, you drop everything by default, then add required capabilities step by step. You can learn more about container capabilities here.
Finally, commit the changes for the deployment.yaml file and push to the main branch. After manually triggering the workflow, it should complete successfully this time:
You should also receive a green Slack notification from the snyk scan job. Navigate to the Snyk portal link and check if the issues that you fixed recently are gone - there should be no medium level issues reported.
A few final checks can be performed as well on the Kubernetes side to verify if the reported issues were fixed:
Check if the game-2048 deployment has a read-only (immutable) filesystem by writing to the index.html file used by the game-2048 application:
kubectl exec -it deployment/game-2048 -n game-2048 -- /bin/bash -c "echo > /public/index.html"
The output looks similar to:
Output/bin/bash: /public/index.html: Read-only file system
command terminated with exit code 1
Check if the container runs as a non-root user (the command should print an integer different from zero - e.g., 1000):
kubectl exec -it deployment/game-2048 -n game-2048 -- id -u
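You can also inspect the securityContext that was actually applied, as a quick sanity check (the jsonpath expression assumes the game-2048 container is the first container in the pod spec):

kubectl get deployment game-2048 -n game-2048 \
  -o jsonpath='{.spec.template.spec.containers[0].securityContext}'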
If all checks pass, then you applied the required security recommendations successfully.
The vulnerability scan automation you have implemented so far is a good starting point but not perfect. Why?
One issue with the current approach is that you never know when new issues are reported for the assets you already deployed in your environments. In other words, you assessed the security risks and took the measures to fix the issues at one specific point in time - when your CI/CD automation was executed.
But what if new issues are reported meanwhile, and your application is vulnerable again? Snyk helps you overcome this situation via the monitoring feature. The monitoring feature of Snyk helps you address new vulnerabilities, which are constantly disclosed. When combined with the Snyk Slack integration (explained in Step 6 - Enabling Slack Notifications), you can take immediate actions to fix newly disclosed issues that may affect your application in a production environment.
To benefit from this feature, all you have to do is use the snyk monitor command before any deploy steps in your CI/CD pipeline. The syntax is very similar to the snyk test commands (one of the nice things about the snyk CLI is that it was designed with uniformity in mind). The snyk monitor command sends a snapshot to the Snyk cloud portal, and from there you will be notified about newly disclosed vulnerabilities for your project.
In terms of the GitHub workflow automation, you can run snyk container monitor on your application container in the snyk-container-security-check job, after testing for vulnerabilities. The below snippet shows a practical implementation for the pipeline used in this guide (some steps were omitted for clarity):
snyk-container-security-check:
runs-on: ubuntu-latest
needs: build-and-test-application
steps:
- name: Checkout
uses: actions/checkout@v3
...
- name: Check application container vulnerabilities
run: |
snyk container test "${{ secrets.DOCKER_REGISTRY }}/${{ env.PROJECT_NAME }}:${{ github.sha }}" \
--file=${{ env.PROJECT_DIR }}/Dockerfile \
--severity-threshold=${{ github.event.inputs.snyk_fail_threshold || env.SNYK_FAIL_THRESHOLD }} \
--target-name=${{ env.PROJECT_NAME }} \
--target-reference=${{ env.ENVIRONMENT }} \
--sarif-file-output=snyk-container-scan.sarif
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
- name: Monitor the application container using Snyk
run: |
snyk container monitor "${{ secrets.DOCKER_REGISTRY }}/${{ env.PROJECT_NAME }}:${{ github.sha }}" \
--file=${{ env.PROJECT_DIR }}/Dockerfile
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
...
The above snippet adds a step called Monitor the application container using Snyk, where the actual snyk container monitor command runs.
After the snyk monitor command runs, you can log in to the Snyk Web UI to see the latest snapshot and history of your project:
You can test and monitor your application source code as well, in the build-and-test-application job. The below snippet shows an example implementation for the GitHub workflow used in this guide:
build-and-test-application:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
- name: npm install, build, and test
run: |
npm install
npm run build --if-present
npm test
working-directory: ${{ env.PROJECT_DIR }}
- name: Snyk code test and monitoring
run: |
snyk test ${{ env.PROJECT_DIR }}
snyk monitor ${{ env.PROJECT_DIR }}
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
Going forward, you will receive Slack notifications on a regular basis about newly disclosed vulnerabilities for your project.
There are situations when you don't want the final report to be affected by issues that your team considers safe to ignore. Snyk offers a built-in feature to manage such exceptions.
You can read more about this feature here.
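As a quick sketch, an exception can be recorded from the CLI with snyk ignore, which writes the rule to a .snyk policy file that you commit alongside your code (the vulnerability ID below is illustrative):

snyk ignore --id='SNYK-JS-EXAMPLE-1234567' \
  --expiry='2024-06-30' \
  --reason='Not exploitable in our deployment; revisit before expiry'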
A more efficient approach is to integrate vulnerability scanning tools directly into your favorite IDE (Integrated Development Environment). This way, you can detect and fix security issues earlier in the software development cycle.
Snyk offers support for a variety of IDEs, such as:
The above plugins help you detect and fix issues in the early stages of development, thus reducing frustration, costs, and security flaws in production systems. They also help you reduce iterations and human effort in the long run. For example, for each security issue reported by your CI/CD automation, you need to go back and fix the issue in your code, commit the changes, wait for the CI/CD automation to run again, then repeat if it fails.
From the official documentation, you can read more about these features on the Snyk for IDEs page.
You can set the workflow to trigger automatically on each commit or PR against the main branch by uncommenting the following lines at the top of the game-2048-snyk.yaml file:
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
After editing the file, commit the changes to your main branch, and you should be ready to go.
You can set up Snyk to send Slack alerts about new vulnerabilities discovered in your projects and about new upgrades or patches that have become available.
To set it up, you will need to generate a Slack webhook. You can either do this via Incoming WebHooks or by creating your own Slack app. Once you have generated your Slack webhook URL, go to your Manage organization settings, enter the URL, and click the Connect button:
In this guide, you learned how to use a flexible and powerful Kubernetes vulnerability scanning tool, Snyk. You then learned how to integrate it into a traditional CI/CD pipeline implemented using GitHub workflows.
Finally, you learned how to investigate vulnerability scan reports, apply fixes to remediate the situation, and reduce security risks to a minimum via a practical example - the game-2048 application from the kubernetes-sample-apps repository.
You can learn more by reading the following additional resources:
I am writing to report a significant and sudden issue that has arisen with my website hosted on your service (Website: sportydeal.com). Starting at around 10 AM yesterday, we began receiving a large volume of identical emails from what appear to be fake addresses. Concurrently, our server’s CPU usage spiked from an average of 30% to 100%, where it has remained.
Here are the details of the incident: Start time of the issue: Approximately 10 AM on 28/11/2023. Nature of the problem: Influx of 100 identical emails from suspected fake addresses, simultaneous and sustained spike in CPU usage to 100%. Impact: The server’s performance is severely degraded, potentially affecting our customers’ experience and our business operations.
We suspect this may be due to a DDoS attack or a similar malicious activity targeting our site. We have taken the following steps: checked our website's code and configurations for any anomalies and checked the traffic on our website (nothing unusual); reviewed server logs around the time the issue began; attempted to identify the source of the traffic/email, but it seems to be distributed.
We urgently need your assistance to: Investigate the source of this traffic and the high CPU usage. Implement measures to mitigate this issue and prevent future occurrences. Provide insights on any additional steps we should take to secure our server. Please find attached a screenshot of our server’s performance graphs showing the CPU usage spike. We appreciate your prompt attention to this critical matter. Best regards,
Yann Le CORRE Founder and CEO SportyDeal
Is there any resource (e.g., an article or a document) where I can find info about data center energy efficiency and environmental impact?
I’ve looked at e3p.jrc.ec.europa.eu/node/570 and e3p.jrc.ec.europa.eu/node/575 and it seems like DO does not adhere to The European Code of Conduct for Energy Efficiency.
I’ve also found this https://www.digitalocean.com/impact, but there’re to few details about measures adopted to prevent global warming.
I need this data for one of the projects that my company will have to carry out.
Thank you!
Best regards, Sergiu
I have a DOKS cluster set up with multiple node pools, and I am trying to figure out the best way to empower my customers to enable an IP allowlist for my DOKS cluster.
Context: I have a bunch of workers running within my DOKS cluster that will try to access customer’s databases hosted on AWS, GCP, DigitalOcean, etc…
I would like to give them a set of IPs or CIDR ranges that they can turn on to enable me to access their network.
My current challenges: I was directed to set up an Egress Gateway via Crossplane with the DigitalOcean CRD (https://github.com/digitalocean/container-blueprints/tree/main/DOKS-Egress-Gateway).
This works fine, but I have a couple of questions: (1) Does the StaticRoute CRD support multiple gateways with overlapping destination ranges? (https://github.com/digitalocean/k8s-staticroute-operator) I have submitted an issue to the GitHub repo as well.
(2) Instead of specifying "destinations", is there a way to specify "exclude this destination range"? That way, I can exclude the DigitalOcean REST APIs and K8s private IPs. It would also save me from having to input 10k+ CIDR blocks (AWS has 7.5k IPv4 CIDR ranges).
I would love to hear if anyone has a better way to achieve my use case and/or has answers to the above.
Thanks in advance, Robin.
./iperf3 -c 206.189.127.20 --udp
Connecting to host 206.189.127.20, port 5201
[  4] local 10.14.19.8 port 52877 connected to 206.189.127.20 port 5201
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-1.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   1.00-2.01   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   2.01-3.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   3.00-4.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   4.00-5.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   5.00-6.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   6.00-7.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   7.00-8.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   8.00-9.00   sec   128 KBytes  1.05 Mbits/sec  16
[  4]   9.00-10.00  sec   128 KBytes  1.05 Mbits/sec  16
[ ID] Interval           Transfer     Bandwidth       Jitter     Lost/Total Datagrams
[  4]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  59.160 ms  0/159 (0%)
Can someone help us make sense out of this? Same test done over TCP gives a speed of 170 Mbits/sec…
Thanks!
In DevOps, the gold standard for building a deployment platform is using Infrastructure as Code and a GitOps workflow, and there are a couple of popular tools that can enable you and your team to create Infrastructure as Code files. In this talk, Amy will compare and contrast Terraform and Pulumi, demonstrate how to create a DigitalOcean web server with Terraform and then with Pulumi, and finally talk through the benefits and drawbacks of each project.
To join the live Tech Talk, register here.
However, after scouring the internet for hours, I can't find any documentation about how this code is run, whether it needs to be its own Python process, how the worker is configured, or any other basic information on how these are supposed to work.
Ideally, I could just say "run this method" on my existing App Platform app and be done with it. If needed, I'm fine with sending a network request to start the method (without a response, according to the Worker documentation).
I just need somewhere to start. I have no idea how to code something the worker can even use.
[2022-04-19 21:42:44] Project contains yarn.lock, using yarn
[2022-04-19 21:42:44] Warning: both yarn.lock and package-lock.json were found, using yarn.
[2022-04-19 21:42:44] Installing node_modules using yarn (from yarn.lock)
[2022-04-19 21:42:44] Running yarn install
[2022-04-19 21:42:44]
[2022-04-19 21:42:44] node: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by node)
This only happened today.
Are there any issues with DO infrastructure today?
Terraform modules allow you to group distinct resources of your infrastructure into a single, unified resource. You can reuse them later with possible customizations, without repeating the resource definitions each time you need them, which is beneficial to large and complexly structured projects. You can customize module instances using input variables you define, as well as extract information from them using outputs. Aside from creating your own custom modules, you can also use pre-made modules published publicly at the Terraform Registry. Developers can use and customize them using inputs, like the modules you create, but their source code is stored in and pulled from the cloud.
In this tutorial, you'll create a Terraform module that will set up multiple Droplets behind a Load Balancer for redundancy. You'll also use the for_each and count looping features of the HashiCorp Configuration Language (HCL) to deploy multiple customized instances of the module at the same time.
To complete this tutorial, you'll need Terraform installed on your local machine and a project set up with the DigitalOcean provider. Complete Step 1 and Step 2 of the How To Use Terraform with DigitalOcean tutorial, and be sure to name the project folder terraform-modules, instead of loadbalance. During Step 2, do not include the pvt_key variable and the SSH key resource.
Note: This tutorial has specifically been tested with Terraform 1.1.3.
In this section, you’ll learn what benefits modules bring, where they are usually placed in the project, and how they should be structured.
Custom Terraform modules are created to encapsulate connected components that are used and deployed together frequently in bigger projects. They are self-contained, bundling only the resources, variables, and providers they need.
Modules are typically stored in a central folder at the root of the project, each in its respective subfolder underneath. In order to retain a clean separation between modules, always architect them to have a single purpose and make sure they never contain submodules.
It is useful to create modules from your resource schemes when you find yourself repeating them with infrequent customizations. Packaging a single resource as a module can be superfluous and gradually erodes the simplicity of the overall architecture.
For small development and test projects, incorporating modules is not necessary, because they do not bring much improvement in those cases. With their ability for customization, modules are the building blocks of complexly structured projects. Developers use modules for larger projects because of the significant advantage of avoiding code duplication. Modules also offer the benefit that definitions only need modification in one place, and the change is then propagated through the rest of the infrastructure.
Next you’ll define, use, and customize modules in your Terraform projects.
In this section, you’ll define multiple Droplets and a Load Balancer as Terraform resources and package them into a module. You’ll also make the resulting module customizable using module inputs.
You’ll store the module in a directory named droplet-lb
, under a directory called modules
. Assuming you are in the terraform-modules
directory you created as part of the prerequisites, create both at once by running:
- mkdir -p modules/droplet-lb
The -p argument instructs mkdir to create all directories in the supplied path.
Navigate to it:
- cd modules/droplet-lb
As was noted in the previous section, modules contain the resources and variables they use. Starting from Terraform 0.13, they must also include definitions of the providers they use. Modules do not require any special configuration to note that the code represents a module, as Terraform regards every directory containing HCL code as a module, even the root directory of the project.
Variables defined in a module are exposed as its inputs and can be used in resource definitions to customize them. The module you'll create will have two inputs: the number of Droplets to create and the name of their group. Create and open for editing a file called variables.tf, where you'll store the variables:
- nano variables.tf
Add the following lines:
variable "droplet_count" {}
variable "group_name" {}
Save and close the file.
You’ll store the Droplet definition in a file named droplets.tf
. Create and open it for editing:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "droplets" {
count = var.droplet_count
image = "ubuntu-20-04-x64"
name = "${var.group_name}-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
}
For the count parameter, which specifies how many instances of a resource to create, you pass in the droplet_count variable. Its value will be specified when the module is called from the main project code. The name of each of the deployed Droplets will be different, which you achieve by appending the index of the current Droplet to the supplied group name. The Droplets will be deployed in the fra1 region and will run Ubuntu 20.04.
When you are done, save and close the file.
With the Droplets now defined, you can move on to creating the Load Balancer. You'll store its resource definition in a file named lb.tf. Create and open it for editing by running:
- nano lb.tf
Add its resource definition:
resource "digitalocean_loadbalancer" "www-lb" {
name = "lb-${var.group_name}"
region = "fra1"
forwarding_rule {
entry_port = 80
entry_protocol = "http"
target_port = 80
target_protocol = "http"
}
healthcheck {
port = 22
protocol = "tcp"
}
droplet_ids = [
for droplet in digitalocean_droplet.droplets:
droplet.id
]
}
You define the Load Balancer with the group name in its name in order to make it distinguishable. You deploy it in the fra1 region together with the Droplets. The next two sections specify the target and monitoring ports and protocols.
The highlighted droplet_ids block takes in the IDs of the Droplets that should be managed by the Load Balancer. Since there are multiple Droplets, and their count is not known in advance, you use a for loop to traverse the collection of Droplets (digitalocean_droplet.droplets) and take their IDs. You surround the for loop with brackets ([]) so that the resulting collection will be a list.
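As a side note, Terraform's splat syntax can produce the same list more compactly; the following line is equivalent to the for expression above:

droplet_ids = digitalocean_droplet.droplets[*].id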
Save and close the file.
You’ve now defined the Droplet, Load Balancer, and variables for your module. You’ll need to define the provider requirements, specifying which providers the module uses, including their version and where they are located. Since Terraform 0.13
, modules must explicitly define the sources of non-Hashicorp maintained providers they use; this is because they do not inherit them from the parent project.
You’ll store the provider requirements in a file named provider.tf
. Create it for editing by running:
- nano provider.tf
Add the following lines to require the digitalocean provider:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
Save and close the file when you’re done. The droplet-lb
module now requires the digitalocean
provider.
Modules also support outputs, which you can use to extract internal information about the state of their resources. You'll define an output that exposes the IP address of the Load Balancer and store it in a file named outputs.tf. Create it for editing:
- nano outputs.tf
Add the following definition:
output "lb_ip" {
value = digitalocean_loadbalancer.www-lb.ip
}
This output retrieves the IP address of the Load Balancer. Save and close the file.
The droplet-lb module is now functionally complete and ready for deployment. You'll call it from the main code, which you'll store in the root of the project. First, navigate there by moving up two directory levels:
- cd ../..
Then, create and open for editing a file called main.tf, in which you'll use the module:
- nano main.tf
Add the following lines:
module "groups" {
source = "./modules/droplet-lb"
droplet_count = 3
group_name = "group1"
}
output "loadbalancer-ip" {
value = module.groups.lb_ip
}
In this declaration, you invoke the droplet-lb module located in the directory specified as source. You configure the two inputs it provides, droplet_count and group_name; the latter is set to group1 so you'll later be able to discern between instances.
Since the Load Balancer IP output is defined in a module, it won't automatically be shown when you apply the project. The solution is to create another output retrieving its value (loadbalancer-ip).
Save and close the file when you’re done.
Initialize the module by running:
- terraform init
The output will look like this:
OutputInitializing modules...
- groups in modules/droplet-lb
Initializing the backend...
Initializing provider plugins...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Installing digitalocean/digitalocean v2.19.0...
- Installed digitalocean/digitalocean v2.19.0 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
You can try planning the project to see what actions Terraform would take by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will be similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-0"
...
}
# module.groups.digitalocean_droplet.droplets[1] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-1"
...
}
# module.groups.digitalocean_droplet.droplets[2] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "group1-2"
...
}
# module.groups.digitalocean_loadbalancer.www-lb will be created
+ resource "digitalocean_loadbalancer" "www-lb" {
...
+ name = "lb-group1"
...
}
Plan: 4 to add, 0 to change, 0 to destroy.
...
This output details that Terraform would create three Droplets, named group1-0, group1-1, and group1-2, and would also create a Load Balancer called lb-group1, which will manage the traffic to and from the three Droplets.
You can try applying the project to the cloud by running:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes when prompted. The output will show all the actions, and the IP address of the Load Balancer will also be shown:
Outputmodule.groups.digitalocean_droplet.droplets[1]: Creating...
module.groups.digitalocean_droplet.droplets[0]: Creating...
module.groups.digitalocean_droplet.droplets[2]: Creating...
...
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Outputs:
loadbalancer-ip = ip_address
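If you need the address again later, you can read any root-level output from the state at any time:

terraform output loadbalancer-ip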
You’ve created a module containing a customizable number of Droplets and a Load Balancer that will automatically be configured to manage their ingoing and outgoing traffic.
In the previous section, you deployed the module you defined and called it groups. If you ever wish to change its name, simply renaming the module call will not yield the expected results: renaming the call will prompt Terraform to destroy and recreate the resources, causing excessive downtime.
For example, open main.tf for editing by running:
- nano main.tf
Rename the groups module to groups_renamed, as highlighted:
module "groups_renamed" {
source = "./modules/droplet-lb"
droplet_count = 3
group_name = "group1"
}
output "loadbalancer-ip" {
value = module.groups_renamed.lb_ip
}
Save and close the file. Then, initialize the project again:
- terraform init
You can now plan the project:
- terraform plan -var "do_token=${DO_PAT}"
The output will be long, but will look similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
- destroy
Terraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] will be destroyed
...
# module.groups_renamed.digitalocean_droplet.droplets[0] will be created
...
Terraform will prompt you to destroy the existing instances and create new ones. This is destructive and unnecessary, and may lead to unwanted downtime.
Instead, using the moved block, you can instruct Terraform to move the old resources under the new name. Open main.tf for editing and add the following lines to the end of the file:
moved {
from = module.groups
to = module.groups_renamed
}
When you’re done, save and close the file.
You can now plan the project:
- terraform plan -var "do_token=${DO_PAT}"
When you plan with the moved block present in main.tf, Terraform wants to move the resources instead of recreating them:
OutputTerraform will perform the following actions:
# module.groups.digitalocean_droplet.droplets[0] has moved to module.groups_renamed.digitalocean_droplet.droplets[0]
...
# module.groups.digitalocean_droplet.droplets[1] has moved to module.groups_renamed.digitalocean_droplet.droplets[1]
...
Moving resources changes their place in Terraform state, meaning that the actual cloud resources won’t be modified, destroyed, or recreated.
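An alternative to the moved block is the terraform state mv command, which performs the same rename imperatively against the state. The declarative moved block is generally preferable because it is recorded in your code, but the CLI equivalent would look like this:

terraform state mv module.groups module.groups_renamed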
Because you’ll modify the configuration significantly in the next step, destroy the deployed resources by running:
- terraform destroy -var "do_token=${DO_PAT}"
Enter yes when prompted. The output will end in:
Output...
Destroy complete! Resources: 4 destroyed.
In this section, you renamed resources in your Terraform project without destroying them in the process. You'll now deploy multiple instances of a module from the same code using for_each and count.
In this section, you’ll use count
and for_each
to deploy the droplet-lb
module multiple times with customizations.
count
One way to deploy multiple instances of the same module at once is to pass in how many to the count parameter, which is automatically available to every module. Open main.tf for editing:
- nano main.tf
Modify it to look like this, removing the existing output definition and moved block:
module "groups" {
source = "./modules/droplet-lb"
count = 3
droplet_count = 3
group_name = "group1-${count.index}"
}
By setting count to 3, you instruct Terraform to deploy the module three times, each with a different group name. When you're done, save and close the file.
Plan the deployment by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will be long, and will look like this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups[0].digitalocean_droplet.droplets[0] will be created
...
# module.groups[0].digitalocean_droplet.droplets[1] will be created
...
# module.groups[0].digitalocean_droplet.droplets[2] will be created
...
# module.groups[0].digitalocean_loadbalancer.www-lb will be created
...
# module.groups[1].digitalocean_droplet.droplets[0] will be created
...
# module.groups[1].digitalocean_droplet.droplets[1] will be created
...
# module.groups[1].digitalocean_droplet.droplets[2] will be created
...
# module.groups[1].digitalocean_loadbalancer.www-lb will be created
...
# module.groups[2].digitalocean_droplet.droplets[0] will be created
...
# module.groups[2].digitalocean_droplet.droplets[1] will be created
...
# module.groups[2].digitalocean_droplet.droplets[2] will be created
...
# module.groups[2].digitalocean_loadbalancer.www-lb will be created
...
Plan: 12 to add, 0 to change, 0 to destroy.
...
Terraform details in the output that each of the three module instances would have three Droplets and a Load Balancer associated with them.
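Because the module instances now form a list, referencing their outputs requires an index or a splat expression. For example, to expose the Load Balancer IPs of all three instances, you could add an output such as this sketch (the output name is illustrative):

output "loadbalancer-ips" {
  value = module.groups[*].lb_ip
}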
for_each
You can use for_each for modules when you require more complex instance customization, or when the number of instances depends on third-party data (often presented as maps) that is not known while writing the code.
You’ll now define a map that pairs group names to Droplet counts and deploy instances of droplet-lb
according to it. Open main.tf
for editing by running:
- nano main.tf
Modify the file to make it look like this:
variable "group_counts" {
type = map
default = {
"group1" = 1
"group2" = 3
}
}
module "groups" {
source = "./modules/droplet-lb"
for_each = var.group_counts
droplet_count = each.value
group_name = each.key
}
You first define a map called group_counts that contains how many Droplets a given group should have. Then, you invoke the module droplet-lb, but specify that the for_each loop should operate on var.group_counts, the map you've just defined. droplet_count takes each.value, the value of the current pair, which is the count of Droplets for the current group. group_name receives the name of the group.
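With for_each, the module instances are keyed by the map keys rather than by numeric indexes, so referencing their outputs changes accordingly. If you also wanted to expose each group's Load Balancer IP, you could append an output like this sketch before saving (the output name is illustrative):

output "loadbalancer-ips" {
  value = { for name, group in module.groups : name => group.lb_ip }
}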
Save and close the file when you’re done.
Try planning the configuration by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will detail the actions Terraform would take to create the two groups with their Droplets and Load Balancers:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.groups["group1"].digitalocean_droplet.droplets[0] will be created
...
# module.groups["group1"].digitalocean_loadbalancer.www-lb will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[0] will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[1] will be created
...
# module.groups["group2"].digitalocean_droplet.droplets[2] will be created
...
# module.groups["group2"].digitalocean_loadbalancer.www-lb will be created
...
In this step, you’ve used count
and for_each
to deploy multiple customized instances of the same module from the same code.
In this tutorial, you created and deployed Terraform modules. You used modules to group logically linked resources together and customized them in order to deploy multiple different instances from a central code definition. You also used outputs to show attributes of resources contained in the module.
If you would like to learn more about Terraform, check out our How To Manage Infrastructure with Terraform series.
]]>However I’d like to be able to, but can’t connect to any of my other droplets in the same VPC via there private network IPS.
Curious if there’s additional configuration I need to do to accomplish this?
I am able to access the droplets via their external IPs.
I’d prefer though to use their private ips so as to not have the network traffic between the droplets and VPN droplet count against our quota.
Any help would be appreciated.
When I ran terraform apply -var "do_token=${DO_PAT}" -var "pvt_key=/home/ubuntu/terraform-ansible/aws_rsa" -var "pub_key=~/.ssh/aws_rsa.pub"
I get an error
╷
│ Error: no ssh key found with name terraform
│
│   with data.digitalocean_ssh_key.terraform,
│   on provider.tf line 18, in data "digitalocean_ssh_key" "terraform":
│   18: data "digitalocean_ssh_key" "terraform" {
│
╵
Not sure what I did wrong
Terraform offers advanced features that become increasingly useful as your project grows in size and complexity. It's possible to alleviate the cost of maintaining complex infrastructure definitions for multiple environments by structuring your code to minimize repetition and by introducing tool-assisted workflows for easier testing and deployment.
Terraform associates a state with a backend, which determines where and how state is stored and retrieved. Every state has only one backend and is tied to an infrastructure configuration. Certain backends, such as local or s3, may contain multiple states. In that case, the pairing of state and infrastructure to the backend describes a workspace. Workspaces allow you to deploy multiple distinct instances of the same infrastructure configuration without storing them in separate backends.
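For illustration, a backend is declared in the top-level terraform block; the following sketch configures an s3-compatible backend (all values are placeholders, and the default local backend needs no configuration at all):

terraform {
  backend "s3" {
    bucket = "example-terraform-state"     # placeholder bucket name
    key    = "project/terraform.tfstate"   # path of the state file within the bucket
    region = "us-east-1"                   # placeholder region
  }
}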
In this tutorial, you’ll first deploy multiple infrastructure instances using different workspaces. You’ll then deploy a stateful resource, which, in this tutorial, will be a DigitalOcean Volume. Finally, you’ll reference pre-made modules from the Terraform Registry, which you can use to supplement your own.
To complete this tutorial, you’ll need:
terraform-advanced
, instead of loadbalance
. During Step 2, do not include the pvt_key
variable and the SSH key resource.Note: This tutorial has specifically been tested with Terraform 1.0.2
.
Multiple workspaces are useful when you want to deploy or test a modified version of your main infrastructure without creating a separate project and setting up authentication keys again. Once you have developed and tested a feature using the separate state, you can incorporate the new code into the main workspace and possibly delete the additional state. When you init a Terraform project, regardless of backend, Terraform creates a workspace called default. It is always present and you can never delete it.
However, multiple workspaces are not a suitable solution for creating multiple environments, such as staging and production. This is because workspaces only track the state; they do not store the code or its modifications.
Since workspaces do not track the actual code, you should manage the code separation between multiple workspaces at the version control system (VCS) level by matching them to their infrastructure variants. How you achieve this depends on the VCS tool itself; for example, in Git, branches would be a fitting abstraction. To make it easier to manage the code for multiple environments, you can break it up into reusable modules, so that you avoid repeating similar code for each environment. A minimal sketch of pairing branches with workspaces follows below.
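For example, with Git you might pair each workspace with a branch of the same name (a sketch; the staging name is illustrative):

git checkout -b staging             # a branch holding the staging variant of the code
terraform workspace new staging     # a workspace holding the matching state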
You’ll now create a project that deploys a Droplet, which you’ll apply from multiple workspaces.
You’ll store the Droplet definition in a file called droplets.tf
.
Assuming you’re in the terraform-advanced
directory, create and open it for editing by running:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "web" {
image = "ubuntu-18-04-x64"
name = "web-${terraform.workspace}"
region = "fra1"
size = "s-1vcpu-1gb"
}
This definition will create a Droplet running Ubuntu 18.04 with one CPU core and 1 GB of RAM in the fra1 region. Its name will contain the name of the current workspace it is deployed from. When you're done, save and close the file.
Apply the project for Terraform to run its actions with:
- terraform apply -var "do_token=${DO_PAT}"
The output will look similar to this:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.web will be created
+ resource "digitalocean_droplet" "web" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-18-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ ipv6_address_private = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "web-default"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "fra1"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
...
Enter yes when prompted to deploy the Droplet in the default workspace.
The name of the Droplet will be web-default, because the workspace you start with is called default. You can list the workspaces to confirm that it's the only one available:
- terraform workspace list
The output will look similar to this:
Output* default
The asterisk (*) means that you currently have that workspace selected.
Create and switch to a new workspace called testing, which you'll use to deploy a different Droplet, by running workspace new:
- terraform workspace new testing
The output will look similar to this:
OutputCreated and switched to workspace "testing"!
You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
You plan the deployment of the Droplet again by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will be similar to the previous run:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.web will be created
+ resource "digitalocean_droplet" "web" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-18-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ ipv6_address_private = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "web-testing"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "fra1"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
...
Notice that Terraform plans to deploy a Droplet called web-testing, which it has named differently from web-default. This is because the default and testing workspaces have separate states and have no knowledge of each other's resources, even though they stem from the same code.
To confirm that you're in the testing workspace, output the current one with workspace show:
- terraform workspace show
The output will be the name of the current workspace:
Outputtesting
To delete a workspace, you first need to destroy all its deployed resources. Then, if it's active, you need to switch to another one using workspace select. Since the testing workspace here is empty, you can switch to default right away:
- terraform workspace select default
You’ll receive output of Terraform confirming the switch:
OutputSwitched to workspace "default".
You can then delete it by running workspace delete:
- terraform workspace delete testing
Terraform will then perform the deletion:
OutputDeleted workspace "testing"!
You can destroy the Droplet you've deployed in the default workspace by running:
- terraform destroy -var "do_token=${DO_PAT}"
Enter yes when prompted to finish the process.
In this section, you’ve worked in multiple Terraform workspaces. In the next section, you’ll deploy a stateful resource.
Stateless resources do not store data, so you can create and replace them quickly, because they are not unique. Stateful resources, on the other hand, contain data that is unique or not trivially re-creatable; therefore, they require persistent data storage.
Since you may end up destroying such resources, or because multiple resources may require their data, it's best to store the data in a separate entity, such as a DigitalOcean Volume.
Volumes provide additional storage space. They can be attached to Droplets (servers), but are separate from them. In this step, you'll define the Volume and connect it to a Droplet in droplets.tf. Open it for editing:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "web" {
image = "ubuntu-18-04-x64"
name = "web-${terraform.workspace}"
region = "fra1"
size = "s-1vcpu-1gb"
}
resource "digitalocean_volume" "volume" {
region = "fra1"
name = "new-volume"
size = 10
initial_filesystem_type = "ext4"
description = "New Volume for Droplet"
}
resource "digitalocean_volume_attachment" "volume_attachment" {
droplet_id = digitalocean_droplet.web.id
volume_id = digitalocean_volume.volume.id
}
Here you define two new resources: the Volume itself and a Volume attachment. The Volume will be 10 GB, formatted as ext4, called new-volume, and located in the same region as the Droplet. Since the Volume and the Droplet are separate entities, you'll need to define a Volume attachment object to connect them. volume_attachment takes the Droplet and Volume IDs and instructs the DigitalOcean cloud to make the Volume available to the Droplet as a disk device.
When you’re done, save and close the file.
Plan this configuration by running:
- terraform plan -var "do_token=${DO_PAT}"
The actions that Terraform will plan will be the following:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.web will be created
+ resource "digitalocean_droplet" "web" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-18-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ ipv6_address_private = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "web-default"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "fra1"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
# digitalocean_volume.volume will be created
+ resource "digitalocean_volume" "volume" {
+ description = "New Volume for Droplet"
+ droplet_ids = (known after apply)
+ filesystem_label = (known after apply)
+ filesystem_type = (known after apply)
+ id = (known after apply)
+ initial_filesystem_type = "ext4"
+ name = "new-volume"
+ region = "fra1"
+ size = 10
+ urn = (known after apply)
}
# digitalocean_volume_attachment.volume_attachment will be created
+ resource "digitalocean_volume_attachment" "volume_attachment" {
+ droplet_id = (known after apply)
+ id = (known after apply)
+ volume_id = (known after apply)
}
Plan: 3 to add, 0 to change, 0 to destroy.
...
The output details that Terraform would create a Droplet, a Volume, and a Volume attachment, which connects the Volume to the Droplet.
You’ve now defined and connected a Volume (a stateful resource) to a Droplet. In the next section, you’ll review public, pre-made Terraform modules that you can incorporate in your project.
Aside from creating your own custom modules for your projects, you can also use pre-made modules and providers from other developers, which are publicly available at Terraform Registry.
In the modules section you can search the database of available modules and sort by provider in order to find the module with the functionality you need. Once you’ve found one, you can read its description, which lists the inputs and outputs the module provides, as well as its external module and provider dependencies.
You’ll now add the DigitalOcean SSH key module to your project. You’ll store the code separate from existing definitions in a file called ssh-key.tf
. Create and open it for editing by running:
- nano ssh-key.tf
Add the following lines:
module "ssh-key" {
source = "clouddrove/ssh-key/digitalocean"
key_path = "~/.ssh/id_rsa.pub"
key_name = "new-ssh-key"
enable_ssh_key = true
}
This code defines an instance of the clouddrove/ssh-key/digitalocean module from the registry and sets some of the parameters it offers. It adds a public SSH key to your account by reading it from ~/.ssh/id_rsa.pub.
When you’re done, save and close the file.
Before you plan this code, you must download the referenced module by running:
- terraform init
You’ll receive output similar to the following:
OutputInitializing modules...
Downloading clouddrove/ssh-key/digitalocean 0.13.0 for ssh-key...
- ssh-key in .terraform/modules/ssh-key
Initializing the backend...
Initializing provider plugins...
- Reusing previous version of digitalocean/digitalocean from the dependency lock file
- Using previously-installed digitalocean/digitalocean v2.10.1
Terraform has been successfully initialized!
...
You can now plan the code for the changes:
- terraform plan -var "do_token=${DO_PAT}"
You’ll receive output similar to this:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
...
# module.ssh-key.digitalocean_ssh_key.default[0] will be created
+ resource "digitalocean_ssh_key" "default" {
+ fingerprint = (known after apply)
+ id = (known after apply)
+ name = "devops"
+ public_key = "ssh-rsa ... demo@clouddrove"
}
Plan: 4 to add, 0 to change, 0 to destroy.
...
The output shows that Terraform would create the SSH key resource, confirming that you've successfully downloaded and invoked the module from your code.
Bigger projects can make use of some advanced features Terraform offers to help reduce complexity and make maintenance easier. Workspaces allow you to test new additions to your code without touching the stable main deployments. You can also couple workspaces with a version control system to track code changes. Using pre-made modules can also shorten development time, but may incur additional expenses or time in the future if the module becomes obsolete.
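For example, workspaces are managed with the terraform workspace subcommands. A brief sketch (the staging workspace name is illustrative):
- terraform workspace new staging
- terraform workspace list
- terraform workspace select default
Each workspace keeps its own state, so resources created while staging is selected won't interfere with those tracked in the default workspace.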
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
Ansible is a configuration management tool that executes playbooks, which are lists of customizable actions written in YAML on specified target servers. It can perform all bootstrapping operations, like installing and updating software, creating and removing users, and configuring system services. As such, it is suitable for bringing up servers you deploy using Terraform, which are created blank by default.
Ansible and Terraform are not competing solutions, because they resolve different phases of infrastructure and software deployment. Terraform allows you to define and create the infrastructure of your system, encompassing the hardware that your applications will run on. Conversely, Ansible configures and deploys software by executing its playbooks on the provided server instances. Running Ansible on the resources Terraform provisioned directly after their creation allows you to make the resources usable for your use case much faster. It also enables easier maintenance and troubleshooting, because all deployed servers will have the same actions applied to them.
In this tutorial, you’ll deploy Droplets using Terraform, and then immediately after their creation, you’ll bootstrap the Droplets using Ansible. You’ll invoke Ansible directly from Terraform when a resource deploys. You’ll also avoid introducing race conditions using Terraform’s remote-exec
and local-exec
provisioners in your configuration, which will ensure that the Droplet deployment is fully complete before further setup commences.
A DigitalOcean Personal Access Token, which you can create via the DigitalOcean Control Panel. You can find instructions in the DigitalOcean product documents, How to Create a Personal Access Token.
Terraform installed on your local machine and a project set up with the DigitalOcean provider. Complete Step 1 and Step 2 of the How To Use Terraform with DigitalOcean tutorial and be sure to name the project folder terraform-ansible
, instead of loadbalance
.
Ansible installed on your machine. For Ubuntu 20.04, complete the first step of the How to Install and Configure Ansible on Ubuntu 20.04 tutorial. To learn more about Ansible, read this Introduction to Configuration Management with Ansible article.
Note: This tutorial has specifically been tested with Terraform 1.0.2
.
In this step, you’ll define the Droplets on which you’ll later run an Ansible playbook, which will set up the Apache web server.
Assuming you are in the terraform-ansible
directory, which you created as part of the prerequisites, you’ll define a Droplet resource, create three copies of it by specifying count
, and output their IP addresses. You’ll store the definitions in a file named droplets.tf
. Create and open it for editing by running:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-18-04-x64"
name = "web-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.terraform.id
]
}
output "droplet_ip_addresses" {
value = {
for droplet in digitalocean_droplet.web:
droplet.name => droplet.ipv4_address
}
}
Here you define a Droplet resource running Ubuntu 18.04 with 1GB of RAM and one CPU core in the fra1 region. Terraform will pull the SSH key you defined in the prerequisites from your account and add it to the provisioned Droplet by passing the key's unique ID in the ssh_keys list. Terraform will deploy the Droplet three times because the count parameter is set. The output block following it will show the IP addresses of the three Droplets. The loop traverses the list of Droplets and, for each instance, pairs its name with its IP address and appends the pair to the resulting map.
Save and close the file when you’re done.
You have now defined the Droplets that Terraform will deploy. In the next step, you’ll write an Ansible playbook that will execute on each of the three deployed Droplets and will deploy the Apache web server. You’ll later go back to the Terraform code and add in the integration with Ansible.
You’ll now create an Ansible playbook that performs the initial server setup tasks, such as creating a new user and upgrading the installed packages. You’ll instruct Ansible on what to do by writing tasks, which are units of action that are executed on target hosts. Tasks can use built-in functions, or specify custom commands to be run. Besides the tasks for the initial setup, you’ll also install the Apache web server and enable its mod_rewrite
module.
Before writing the playbook, ensure that your public and private SSH keys, which correspond to the one in your DigitalOcean account, are available and accessible on the machine from which you’re running Terraform and Ansible. A typical location for storing them on Linux would be ~/.ssh
(although you can store them in other places).
Note: On Linux, you’ll need to ensure that the private key file has appropriate permissions. You can set them by running:
- chmod 600 your_private_key_location
You already have a variable for the private key defined, so you’ll only need to add one for the public key location.
Open provider.tf
for editing by running:
- nano provider.tf
Add the pub_key variable, so that the file looks like the following:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
variable "do_token" {}
variable "pvt_key" {}
variable "pub_key" {}
provider "digitalocean" {
token = var.do_token
}
data "digitalocean_ssh_key" "terraform" {
name = "terraform"
}
When you’re done, save and close the file.
With the pub_key
variable now defined, you’ll start writing the Ansible playbook. You’ll store it in a file called apache-install.yml
. Create and open it for editing:
- nano apache-install.yml
You’ll be building the playbook gradually. First, you’ll need to define on which hosts the playbook will run, its name, and if the tasks should be run as root. Add the following lines:
- become: yes
hosts: all
name: apache-install
By setting become
to yes
, you instruct Ansible to run commands as the superuser, and by specifying all
for hosts
, you allow Ansible to run the tasks on any given server—even the ones passed in through the command line, as Terraform does.
The first task that you’ll add will create a new, non-root user. Append the following task definition to your playbook:
tasks:
- name: Add the user 'sammy' and add it to 'sudo'
user:
name: sammy
group: sudo
You first define a list of tasks and then add a task to it. It will create a user named sammy and grant them superuser access using sudo
by adding them to the appropriate group.
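For reference, this task is roughly equivalent to creating the user with sudo as their primary group by hand, something like the following (illustrative only; the playbook performs this for you):
- useradd --create-home --gid sudo sammy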
The next task will add your public SSH key to the user, so you'll be able to connect to the server as that user later on:
- name: Add SSH key to 'sammy'
authorized_key:
user: sammy
state: present
key: "{{ lookup('file', pub_key) }}"
This task will ensure that the public SSH key, which is looked up from a local file, is present
on the target. You’ll supply the value for the pub_key
variable from Terraform in the next step.
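If you wanted to test the playbook by hand against an existing server, you could supply the variable with Ansible's -e flag yourself; a sketch, assuming a placeholder IP and key paths:
- ansible-playbook -u root -i 'your_server_ip,' --private-key ~/.ssh/id_rsa -e 'pub_key=~/.ssh/id_rsa.pub' apache-install.yml
Note the trailing comma after the IP address, which tells Ansible to treat the value as an inline inventory list rather than a file name.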
You can now order the installation of Apache and the mod_rewrite
module by appending the following tasks:
- name: Wait for apt to unlock
become: yes
shell: while sudo fuser /var/lib/dpkg/lock >/dev/null 2>&1; do sleep 5; done;
- name: Install apache2
apt:
name: apache2
update_cache: yes
state: latest
- name: Enable mod_rewrite
apache2_module:
name: rewrite
state: present
notify:
- Restart apache2
handlers:
- name: Restart apache2
service:
name: apache2
state: restarted
The first task will wait until any previous package installation using the apt
package manager is complete. The second task will run apt
to install Apache. Then, the third one will ensure that the mod_rewrite
module is present
. After it’s enabled, you need to ensure that you restart Apache, which you can’t configure from the task itself. To resolve that, you call a handler to issue the restart.
At this point, your playbook will look like the following:
- become: yes
hosts: all
name: apache-install
tasks:
- name: Add the user 'sammy' and add it to 'sudo'
user:
name: sammy
group: sudo
- name: Add SSH key to 'sammy'
authorized_key:
user: sammy
state: present
key: "{{ lookup('file', pub_key) }}"
- name: Wait for apt to unlock
become: yes
shell: while sudo fuser /var/lib/dpkg/lock >/dev/null 2>&1; do sleep 5; done;
- name: Install apache2
apt:
name: apache2
update_cache: yes
state: latest
- name: Enable mod_rewrite
apache2_module:
name: rewrite
state: present
notify:
- Restart apache2
handlers:
- name: Restart apache2
service:
name: apache2
state: restarted
When you’re done, check that indentations of all YAML elements are correct and match the ones shown above. This is all you need to define on the Ansible side, so save and close the playbook. You’ll now modify the Droplet deployment code to execute this playbook when the Droplets have finished provisioning.
Now that you have defined the actions Ansible will take on the target servers, you’ll modify the Terraform configuration to run it upon Droplet creation.
Terraform offers two provisioners that execute commands: local-exec
and remote-exec
, which run commands locally or remotely (on the target), respectively. remote-exec
requires connection data, such as type and access keys, while local-exec
does everything on the machine Terraform is executing on, and so does not require connection information. It’s important to note that local-exec
runs immediately after the resource you have defined it for has finished provisioning; therefore, it does not wait for the resource to actually boot up. It runs after the cloud platform acknowledges its presence in the system.
You’ll now add provisioner definitions to your Droplet to run Ansible after deployment. Open droplets.tf
for editing:
- nano droplets.tf
Add the highlighted lines:
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-18-04-x64"
name = "web-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.terraform.id
]
provisioner "remote-exec" {
inline = ["sudo apt update", "sudo apt install python3 -y", "echo Done!"]
connection {
host = self.ipv4_address
type = "ssh"
user = "root"
private_key = file(var.pvt_key)
}
}
provisioner "local-exec" {
command = "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -u root -i '${self.ipv4_address},' --private-key ${var.pvt_key} -e 'pub_key=${var.pub_key}' apache-install.yml"
}
}
output "droplet_ip_addresses" {
value = {
for droplet in digitalocean_droplet.web:
droplet.name => droplet.ipv4_address
}
}
Like Terraform, Ansible runs locally and connects to the target servers via SSH. To run it, you define a local-exec
provisioner in the Droplet definition that runs the ansible-playbook
command. This passes in the username (root), the IP of the current Droplet (retrieved with ${self.ipv4_address}
), the SSH public and private keys, and specifies the playbook file to run (apache-install.yml
). By setting the ANSIBLE_HOST_KEY_CHECKING
environment variable to False
, you skip checking if the server was connected to beforehand.
As was noted, the local-exec
provisioner runs without waiting for the Droplet to become available, so the execution of the playbook may precede the actual availability of the Droplet. To remedy this, you define the remote-exec
provisioner to contain commands to execute on the target server. For remote-exec
to execute, the target server must be available. Since remote-exec
runs before local-exec
, the server will be fully initialized by the time Ansible is invoked. python3
comes preinstalled on Ubuntu 18.04, so you can comment out or remove the command as necessary.
When you’re done making changes, save and close the file.
Then, deploy the Droplets by running the following command. Remember to replace private_key_location
and public_key_location
with the locations of your private and public keys respectively:
- terraform apply -var "do_token=${DO_PAT}" -var "pvt_key=private_key_location" -var "pub_key=public_key_location"
The output will be long. Your Droplets will provision and then a connection will establish with each. Next the remote-exec
provisioner will execute and install python3
:
Output...
digitalocean_droplet.web[1] (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.web[1] (remote-exec): Host: ...
digitalocean_droplet.web[1] (remote-exec): User: root
digitalocean_droplet.web[1] (remote-exec): Password: false
digitalocean_droplet.web[1] (remote-exec): Private key: true
digitalocean_droplet.web[1] (remote-exec): Certificate: false
digitalocean_droplet.web[1] (remote-exec): SSH Agent: false
digitalocean_droplet.web[1] (remote-exec): Checking Host Key: false
digitalocean_droplet.web[1] (remote-exec): Connected!
...
After that, Terraform will run the local-exec
provisioner for each of the Droplets, which executes Ansible. The following output shows this for one of the Droplets:
Output...
digitalocean_droplet.web[2] (local-exec): Executing: ["/bin/sh" "-c" "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -u root -i 'ip_address,' --private-key private_key_location -e 'pub_key=public_key_location' apache-install.yml"]
digitalocean_droplet.web[2] (local-exec): PLAY [apache-install] **********************************************************
digitalocean_droplet.web[2] (local-exec): TASK [Gathering Facts] *********************************************************
digitalocean_droplet.web[2] (local-exec): ok: [ip_address]
digitalocean_droplet.web[2] (local-exec): TASK [Add the user 'sammy' and add it to 'sudo'] *******************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): TASK [Add SSH key to 'sammy'] **************************************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): TASK [Wait for apt to unlock] **************************************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): TASK [Install apache2] *********************************************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): TASK [Enable mod_rewrite] ******************************************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): RUNNING HANDLER [Restart apache2] **********************************************
digitalocean_droplet.web[2] (local-exec): changed: [ip_address]
digitalocean_droplet.web[2] (local-exec): PLAY RECAP *********************************************************************
digitalocean_droplet.web[2] (local-exec): [ip_address] : ok=7 changed=6 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
...
At the end of the output, you’ll receive a list of the three Droplets and their IP addresses:
Outputdroplet_ip_addresses = {
"web-0" = "..."
"web-1" = "..."
"web-2" = "..."
}
You can now navigate to one of the IP addresses in your browser. You will reach the default Apache welcome page, signifying the successful installation of the web server.
This means that Terraform provisioned your servers and that your Ansible playbook executed on them successfully.
To check that the SSH key was correctly added to sammy on the provisioned Droplets, connect to one of them with the following command:
- ssh -i private_key_location sammy@droplet_ip_address
Remember to put in the private key location and the IP address of one of the provisioned Droplets, which you can find in your Terraform output.
The output will look similar to the following:
OutputWelcome to Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-121-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of ...
System load: 0.0 Processes: 88
Usage of /: 6.4% of 24.06GB Users logged in: 0
Memory usage: 20% IP address for eth0: ip_address
Swap usage: 0% IP address for eth1: ip_address
0 packages can be updated.
0 updates are security updates.
New release '20.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
*** System restart required ***
Last login: ...
...
You’ve successfully connected to the target and obtained shell access for the sammy user, which confirms that the SSH key was correctly configured for that user.
You can destroy the deployed Droplets by running the following command, entering yes
when prompted:
- terraform destroy -var "do_token=${DO_PAT}" -var "pvt_key=private_key_location" -var "pub_key=public_key_location"
In this step, you have added in Ansible playbook execution as a local-exec
provisioner to your Droplet definition. To ensure that the server is available for connections, you’ve included the remote-exec
provisioner, which can serve to install the python3
prerequisite, after which Ansible will run.
Terraform and Ansible together form a flexible workflow for spinning up servers with the needed software and hardware configurations. Running Ansible directly as part of the Terraform deployment process allows you to have the servers up and bootstrapped with dependencies for your development work and applications much faster.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
You can also find additional Ansible content resources on our Ansible topic page.
Terraform provides automation to provision your infrastructure in the cloud. To do this, Terraform authenticates with cloud providers (and other providers) to deploy the resources and perform the planned actions. However, the information Terraform needs for authentication is very valuable and sensitive; you should always keep it secret, since it unlocks access to your services. For example, API keys and passwords for database users are sensitive data.
If a malicious third party were to acquire the sensitive information, they would be able to breach the security systems by presenting themselves as a known trusted user. In turn, they would be able to modify, delete, and replace the resources and services that are available under the scope of the obtained keys. To prevent this from happening, it is essential to properly secure your project and safeguard its state file, which stores all the project secrets.
By default, Terraform stores the state file locally in the form of unencrypted JSON, allowing anyone with access to the project files to read the secrets. While a solution to this is to restrict access to the files on disk, another option is to store the state remotely in a backend that encrypts the data automatically, such as DigitalOcean Spaces.
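If you do keep the state on disk, restricting its file permissions is a sensible minimum safeguard. For example, on Linux you could make it readable and writable only by your user:
- chmod 600 terraform.tfstate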
In this tutorial, you’ll hide sensitive data in outputs during execution and store your state in a secure cloud object storage, which encrypts data at rest. You’ll use DigitalOcean Spaces in this tutorial as your cloud object storage. You’ll also learn how to mark variables as sensitive, as well as explore tfmask, which is an open source program written in Go that dynamically censors values in the Terraform execution log output.
Terraform installed on your local machine and a project set up with the DigitalOcean provider. Complete Step 1 and Step 2 of the How To Use Terraform with DigitalOcean tutorial and be sure to name the project folder terraform-sensitive, instead of loadbalance. During Step 2, do not include the pvt_key variable and the SSH key resource.
A DigitalOcean Space with access and secret keys, which you can create via the DigitalOcean Control Panel. You'll use it later in this tutorial to store the Terraform state file.
Note: This tutorial has specifically been tested with Terraform 1.0.2.
Marking Outputs as sensitive
In this step, you’ll hide outputs in code by setting their sensitive
parameter to true
. This is useful when secret values are part of the Terraform output that you’re storing indefinitely, or if you need to share the output logs beyond your team for analysis.
Assuming you are in the terraform-sensitive
directory, which you created as part of the prerequisites, you’ll define a Droplet and an output showing its IP address. You’ll store it in a file named droplets.tf
, so create and open it for editing by running:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "web-1"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
}
This code will deploy a Droplet called web-1
in the fra1
region, running Ubuntu 20.04 with 1GB of RAM and one CPU core. Here you've given the droplet_ip_address
output a value and you’ll receive this in the Terraform log.
To deploy this Droplet, execute the code by running the following command:
- terraform apply -var "do_token=${DO_PAT}"
Terraform will take the following actions:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.web will be created
+ resource "digitalocean_droplet" "web" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ ipv6_address_private = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "web-1"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "fra1"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
...
Enter yes
when prompted. The output will look similar to this:
Outputdigitalocean_droplet.web: Creating...
...
digitalocean_droplet.web: Creation complete after 40s [id=216255733]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_address = your_droplet_ip_address
The IP address appears in the output. If you're sharing this output with others, or if it will be publicly available because of automated deployment processes, it's important to take action to hide this data in the output.
To censor it, you’ll need to set the sensitive
attribute of the droplet_ip_address
output to true
.
Open droplets.tf
for editing:
- nano droplets.tf
Add the highlighted line:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "web-1"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
sensitive = true
}
Save and close the file when you’re done.
Apply the project again by running:
- terraform apply -var "do_token=${DO_PAT}"
The output will be:
Outputdigitalocean_droplet.web: Refreshing state... [id=216255733]
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_address = <sensitive>
You’ve now explicitly censored the IP address—the value of the output. Censoring outputs is useful in situations when the Terraform logs would be in a public space, or when you want them to remain hidden, but not delete them from the code. You’ll also want to censor outputs that contain passwords and API tokens, as they are sensitive information as well.
You’ve now hidden the values of the defined outputs by marking them as sensitive
. You’ll now see how to mark variables as sensitive.
Marking Variables as sensitive
Similar to outputs, variables can also be marked as sensitive. Since you have only one variable defined (do_token
), open provider.tf
for editing:
- nano provider.tf
Modify the do_token
variable to look like this:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
variable "do_token" {
sensitive = true
}
provider "digitalocean" {
token = var.do_token
}
When you’re done, save and close the file. The do_token
variable is now considered sensitive.
To try outputting a sensitive variable, you’ll define a new output in droplets.tf
:
- nano droplets.tf
Add the following lines at the end:
output "dotoken" {
value = var.do_token
}
Save and close the file. Then, try applying the configuration by running:
- terraform apply -var "do_token=${DO_PAT}"
You’ll receive an error message similar to this:
Output╷
│ Error: Output refers to sensitive values
│
│ on droplets.tf line 13:
│ 13: output "dotoken" {
│
│ To reduce the risk of accidentally exporting sensitive data that was intended to be only internal, Terraform requires
│ that any root module output containing sensitive data be explicitly marked as sensitive, to confirm your intent.
│
│ If you do intend to export this data, annotate the output value as sensitive by adding the following argument:
│ sensitive = true
╵
This error means that sensitive variables cannot be shown in non-sensitive outputs, which prevents information leakage. You can, however, force them to be shown by wrapping the output value in nonsensitive
, like so:
...
output "dotoken" {
value = nonsensitive(var.do_token)
}
nonsensitive
resets the sensitivity preference of the variable, allowing it to be shown. This should be used sparingly, and only when the output is a non-reversible derivative of the sensitive variable.
You’ve now seen how to mark variables as sensitive, and how to override that preference. In the next step, you’ll configure Terraform to store your project’s state in the encrypted cloud, instead of locally.
The state file stores all information about your deployed infrastructure, including all its internal relationships and secrets. By default, it’s stored in plaintext, locally on the disk. Storing it remotely in the cloud provides a higher level of security. If the cloud storage service supports encryption at rest, it will store the state file in an encrypted state at all times, so that potential attackers won’t be able to gather information from it. Storing the state file encrypted remotely is different from marking outputs as sensitive
: remote encrypted storage changes how Terraform stores the state data, while marking values as sensitive changes only whether they are displayed.
You’ll now configure your project to store the state file in a DigitalOcean Space. As a result, it will be encrypted at rest and protected with TLS in transit.
By default, the Terraform state file is called terraform.tfstate
and is located in the root of every initialized directory. You can view its contents by running:
- cat terraform.tfstate
The contents of the file will be similar to this:
{
"version": 4,
"terraform_version": "1.0.2",
"serial": 3,
"lineage": "16362bdb-2ff3-8ac7-49cc-260f3261d8eb",
"outputs": {
"droplet_ip_address": {
"value": "...",
"type": "string",
"sensitive": true
}
},
"resources": [
{
"mode": "managed",
"type": "digitalocean_droplet",
"name": "web",
"provider": "provider[\"registry.terraform.io/digitalocean/digitalocean\"]",
"instances": [
{
"schema_version": 1,
"attributes": {
"backups": false,
"created_at": "2021-07-11T06:16:51Z",
"disk": 25,
"id": "254368889",
"image": "ubuntu-20-04-x64",
"ipv4_address": "...",
"ipv4_address_private": "10.135.0.3",
"ipv6": false,
"ipv6_address": "",
"locked": false,
"memory": 1024,
"monitoring": false,
"name": "web-1",
"price_hourly": 0.00744,
"price_monthly": 5,
"private_networking": true,
"region": "fra1",
"resize_disk": true,
"size": "s-1vcpu-1gb",
"ssh_keys": null,
"status": "active",
"tags": [],
"urn": "do:droplet:254368889",
"user_data": null,
"vcpus": 1,
"volume_ids": [],
"vpc_uuid": "fc52519c-dc84-11e8-8b13-3cfdfea9f160"
},
"sensitive_attributes": [],
"private": "..."
}
]
}
]
}
The state file contains all the resources you’ve deployed, as well as all outputs and their computed values. Gaining access to this file is enough to compromise the entire deployed infrastructure. To prevent that from happening, you can store it encrypted in the cloud.
Terraform supports multiple backends, which are storage and retrieval mechanisms for the state. Examples are: local
for local storage, pg
for the Postgres database, and s3
for S3-compatible storage, which you'll use to connect to your Space.
The back-end configuration is specified under the main terraform
block, which is currently in provider.tf
. Open it for editing by running:
- nano provider.tf
Add the following lines:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
backend "s3" {
key = "state/terraform.tfstate"
bucket = "your_space_name"
region = "us-west-1"
endpoint = "https://spaces_endpoint"
skip_region_validation = true
skip_credentials_validation = true
skip_metadata_api_check = true
}
}
variable "do_token" {}
provider "digitalocean" {
token = var.do_token
}
The s3
back-end block first specifies the key
, which is the location of the Terraform state file on the Space. Passing in state/terraform.tfstate
means that you will store it as terraform.tfstate
under the state
directory.
The endpoint
parameter tells Terraform where the Space is located and bucket
defines the exact Space to connect to. The skip_region_validation
and skip_credentials_validation
parameters disable validations that are not applicable to DigitalOcean Spaces. Note that region
must be set to a conforming value (such as us-west-1
), which has no reference to Spaces.
Remember to put in your bucket name and the Spaces endpoint, including the region, which you can find in the Settings tab of your Space. Note that the do_token
variable is no longer marked as sensitive. When you are done customizing the endpoint
, save and close the file.
Next, put the access and secret keys for your Space in environment variables, so you’ll be able to reference them later. Run the following commands, replacing the highlighted placeholders with your key values:
- export SPACE_ACCESS_KEY="your_space_access_key"
- export SPACE_SECRET_KEY="your_space_secret_key"
Then, configure Terraform to use the Space as its backend by running:
- terraform init -backend-config "access_key=$SPACE_ACCESS_KEY" -backend-config "secret_key=$SPACE_SECRET_KEY"
The -backend-config
argument provides a way to set back-end parameters at runtime, which you are using here to set the Space keys. You’ll be asked if you wish to copy the existing state to the cloud, or start anew:
OutputInitializing the backend...
Do you want to copy existing state to the new backend?
Pre-existing state was found while migrating the previous "local" backend to the
newly configured "s3" backend. No existing state was found in the newly
configured "s3" backend. Do you want to copy this state to the new "s3"
backend? Enter "yes" to copy and "no" to start with an empty state.
Enter yes
when prompted. The rest of the output will look similar to the following:
OutputSuccessfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
Initializing provider plugins...
- Reusing previous version of digitalocean/digitalocean from the dependency lock file
- Using previously-installed digitalocean/digitalocean v2.10.1
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Your project will now store its state in your Space. If you receive an error, double-check that you’ve provided the correct keys, endpoint, and bucket name.
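As an alternative to passing the keys with -backend-config arguments on every init, Terraform can also read back-end parameters from a file. A sketch, assuming an illustrative file named space-credentials.conf (keep it out of version control):
access_key = "your_space_access_key"
secret_key = "your_space_secret_key"
You would then initialize by running:
- terraform init -backend-config=space-credentials.conf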
Your project is now storing state in your Space. The local state file has been emptied, which you can check by showing its contents:
- cat terraform.tfstate
There will be no output, as expected.
You can try modifying the Droplet definition and applying it to check that the state is still being correctly managed.
Open droplets.tf
for editing:
- nano droplets.tf
Modify the highlighted lines:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "test-droplet"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
sensitive = false
}
You can remove the dotoken
output from before. Save and close the file, then apply the project by running:
- terraform apply -var "do_token=${DO_PAT}"
The output will look similar to the following:
OutputTerraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# digitalocean_droplet.web will be updated in-place
~ resource "digitalocean_droplet" "web" {
id = "254368889"
~ name = "web-1" -> "test-droplet"
tags = []
# (21 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
...
Enter yes
when prompted, and Terraform will apply the new configuration to the existing Droplet, which means that it’s correctly communicating with the Space its state is stored on:
Output...
digitalocean_droplet.web: Modifying... [id=216419273]
digitalocean_droplet.web: Still modifying... [id=216419273, 10s elapsed]
digitalocean_droplet.web: Modifications complete after 12s [id=216419273]
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
Outputs:
droplet_ip_address = your_droplet_ip_address
You’ve configured the s3
backend for your project so that you’re storing the state encrypted in the cloud in a DigitalOcean Space. In the next step, you’ll use tfmask
, a tool that will dynamically censor all sensitive outputs and information in Terraform logs.
Using tfmask in CI/CD Environments
In this section, you'll download tfmask and use it to dynamically censor sensitive data from the whole output log Terraform generates when executing a command. It will censor the values of variables and parameters whose names match a regular expression that you provide.
Dynamically matching parameter and variable names is possible when they follow a pattern (for example, contain the word password
or secret
). The advantage of using tfmask
over marking the outputs as sensitive is that it also censors matched parts of the resource declarations that Terraform prints out while executing. Hiding these values is imperative when the execution logs may be public, as is often the case in automated CI/CD environments.
Compiled binaries of tfmask
are available at its releases page on GitHub. For Linux, run the following command to download it:
- sudo curl -L https://github.com/cloudposse/tfmask/releases/download/0.7.0/tfmask_linux_amd64 -o /usr/bin/tfmask
Mark it as executable by running:
- sudo chmod +x /usr/bin/tfmask
tfmask
works on the outputs of terraform plan
and terraform apply
by masking the values of all variables whose names are matched by a RegEx expression that you specify. You will use the environment variables TFMASK_VALUES_REGEX
and TFMASK_CHAR
to supply the regex expression, as well as the character replacing the actual values.
You’ll now use tfmask
to censor the name
and ipv4_address
of the Droplet that Terraform would deploy. First, you’ll need to set the mentioned environment variables by running:
- export TFMASK_CHAR="*"
- export TFMASK_VALUES_REGEX="(?i)^.*(ipv4_address|name).*$"
This regular expression will match all strings containing ipv4_address or name, and will not be case sensitive.
To make Terraform plan an action for your Droplet, modify its definition:
- nano droplets.tf
Modify the Droplet’s name:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "web"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
sensitive = false
}
Save and close the file.
Because you’ve changed an attribute of the Droplet, Terraform will show its full definition in its output. Plan the configuration, but pipe it to tfmask
to censor variables according to the regex expression:
- terraform plan -var "do_token=${DO_PAT}" | tfmask
You’ll receive output similar to the following:
Outputdigitalocean_droplet.web: Refreshing state... [id=216419273]
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# digitalocean_droplet.web will be updated in-place
~ resource "digitalocean_droplet" "web" {
id = "254368889"
~ name = "**********************************"
tags = []
# (21 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
...
Note that tfmask
has censored the values for name
, ipv4_address
, and ipv4_address_private
using the character you specified in the TFMASK_CHAR
environment variable, because they match the regex expression.
This way of value censoring in the Terraform logs is very useful for CI/CD, where the logs may be publicly available. The benefit of tfmask
is that you have full control over what variables to censor (using the regex expression). You can also specify keywords that you want to censor, which may not currently exist, but which you anticipate using in the future.
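For example, a pipeline could export a broader pattern up front to cover common secret-bearing names, even ones you haven't introduced yet (the keyword list here is illustrative):
- export TFMASK_VALUES_REGEX="(?i)^.*(password|secret|token|key).*$"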
You can destroy the deployed resources by running the following command and entering yes
when prompted:
- terraform destroy -var "do_token=${DO_PAT}"
In this article, you’ve worked with a couple of ways to hide and secure sensitive data in your Terraform project. The first measure, using sensitive
to hide values from the outputs and variables, is useful when only logs are accessible, but the values themselves can stay present in the state stored on disk.
To remedy that, you can opt to store the state file remotely, which you’ve achieved with DigitalOcean Spaces. This allows you to make use of encryption at rest. You also used tfmask
, a tool that censors values of variables—matched using a regex expression—during terraform plan
and terraform apply
.
You can also check out Hashicorp Vault to store secrets and secret data. It can be integrated with Terraform to inject secrets in resource definitions, so you’ll be able to connect your project with your existing Vault workflow. You may want to check out our tutorial on How To Build a Hashicorp Vault Server Using Packer and Terraform on DigitalOcean.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
One of the main benefits of Infrastructure as Code (IaC) is reusing parts of the defined infrastructure. In Terraform, you can use modules to encapsulate logically connected components into one entity and customize them using input variables you define. By using modules to define your infrastructure at a high level, you can separate development, staging, and production environments by passing in different values to the same modules, which minimizes code duplication and maximizes conciseness.
You are not limited to using only your custom modules. Terraform Registry is integrated into Terraform and lists modules and providers that you can incorporate in your project right away by defining them in the required_providers
section. Referencing public modules can speed up your workflow and reduce code duplication. If you have a useful module and would like to share it with the world, you can look into publishing it on the Registry for other developers to use.
In this tutorial, you’ll explore some of the ways to define and reuse code in Terraform projects. You’ll reference modules from the Terraform Registry, separate development and production environments using modules, learn about templates and how they are used, and specify resource dependencies explicitly using the depends_on
meta argument.
Terraform installed on your local machine and a project set up with the DigitalOcean provider. Complete Step 1 and Step 2 of the How To Use Terraform with DigitalOcean tutorial and be sure to name the project folder terraform-reusability, instead of loadbalance. During Step 2, do not include the pvt_key variable and the SSH key resource.
The droplet-lb module available under modules in terraform-reusability. Follow the How to Build a Custom Module tutorial and work through it until the droplet-lb module is functionally complete. (That is, until the cd ../.. command in the Creating a Module section.)
Note: This tutorial has been tested using Terraform 1.0.2.
In this section, you’ll use modules to separate your target deployment environments. You’ll arrange these according to the structure of a more complex project. You’ll create a project with two modules: one will define the Droplets and Load Balancers, and the other will set up the DNS domain records. Afterward, you’ll write configuration for two different environments (dev
and prod
), which will call the same modules.
Creating the dns-records Module
As part of the prerequisites, you set up the initial project under terraform-reusability
and created the droplet-lb
module in its own subdirectory under modules
. You’ll now set up the second module, called dns-records
, containing variables, outputs, and resource definitions. From the terraform-reusability
directory, create dns-records
by running:
- mkdir modules/dns-records
Navigate to it:
- cd modules/dns-records
This module will contain the definitions for your domain and the DNS records that you’ll later point to the Load Balancers. You’ll first define the variables, which will become inputs that this module will expose. You’ll store them in a file called variables.tf
. Create it for editing:
- nano variables.tf
Add the following variable definitions:
variable "domain_name" {}
variable "ipv4_address" {}
Save and close the file. You’ll now define the domain and the accompanying A
and CNAME
records in a file named records.tf
. Create and open it for editing by running:
- nano records.tf
Add the following resource definitions:
resource "digitalocean_domain" "domain" {
name = var.domain_name
}
resource "digitalocean_record" "domain_A" {
domain = digitalocean_domain.domain.name
type = "A"
name = "@"
value = var.ipv4_address
}
resource "digitalocean_record" "domain_CNAME" {
domain = digitalocean_domain.domain.name
type = "CNAME"
name = "www"
value = "@"
}
First, you add the domain name to your DigitalOcean account. The cloud will automatically add the three DigitalOcean nameservers as NS
records. The domain name you supply to Terraform must not already be present in your DigitalOcean account, or Terraform will show an error during infrastructure creation.
Then, you define an A
record for your domain, routing it (the @
as value
signifies the true domain name, without subdomains) to the IP address supplied as the variable ipv4_address
. The actual IP address will be passed in when you initialize an instance of the module. For the sake of completeness, the CNAME
record that follows specifies that the www
subdomain should also point to the same domain. Save and close the file when you’re done.
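Once you apply a configuration that uses this module, you could verify the records from your machine with dig, assuming it is installed, replacing the domain with the one you supplied:
- dig +short your_domain A
- dig +short www.your_domain CNAME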
Next, you’ll define the outputs for this module. The outputs will show the FQDN (fully qualified domain name) of the created records. Create and open outputs.tf
for editing:
- nano outputs.tf
Add the following lines:
output "A_fqdn" {
value = digitalocean_record.domain_A.fqdn
}
output "CNAME_fqdn" {
value = digitalocean_record.domain_CNAME.fqdn
}
Save and close the file when you’re done.
With the variables, DNS records, and outputs defined, the last thing you’ll need to specify are the provider requirements for this module. You’ll specify that the dns-records
module requires the digitalocean
provider in a file called provider.tf
. Create and open it for editing:
- nano provider.tf
Add the following lines:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
When you’re done, save and close the file. Now that the digitalocean
provider has been defined, the dns-records
module is functionally complete.
The current structure of the terraform-reusability
project will look similar to this:
terraform-reusability/
├─ modules/
│ ├─ dns-records/
│ │ ├─ outputs.tf
│ │ ├─ provider.tf
│ │ ├─ records.tf
│ │ ├─ variables.tf
│ ├─ droplet-lb/
│ │ ├─ droplets.tf
│ │ ├─ lb.tf
│ │ ├─ outputs.tf
│ │ ├─ provider.tf
│ │ ├─ variables.tf
├─ provider.tf
So far, you have two modules in your project: the one you just created (dns-records
) and the one you created as part of the prerequisites (droplet-lb
).
To facilitate different environments, you’ll store the dev
and prod
environment config files under a directory called environments
, which will reside in the root of the project. Both environments will call the same two modules, but with different parameter values. The advantage of this is that when the modules change internally in the future, you'll only need to update the values you are passing in.
First, navigate to the root of the project by running:
- cd ../..
Then, create the dev
and prod
directories under environments
at the same time:
- mkdir -p environments/dev && mkdir environments/prod
The -p
argument instructs mkdir
to create all directories in the given path.
Navigate to the dev
directory, as you’ll first configure that environment:
- cd environments/dev
You’ll store the code in a file named main.tf
, so create it for editing:
- nano main.tf
Add the following lines:
module "droplets" {
source = "../../modules/droplet-lb"
droplet_count = 2
group_name = "dev"
}
module "dns" {
source = "../../modules/dns-records"
domain_name = "your_dev_domain"
ipv4_address = module.droplets.lb_ip
}
Here you call and configure the two modules, droplet-lb
and dns-records
, which will together result in the creation of two Droplets. They’re fronted by a Load Balancer, and the DNS records for the supplied domain are set up to point to that Load Balancer. Remember to replace your_dev_domain
with your desired domain name for the dev
environment, then save and close the file.
Next, you’ll configure the DigitalOcean provider and create a variable for it to be able to accept the personal access token you’ve created as part of the prerequisites. Open a new file, called provider.tf
, for editing:
- nano provider.tf
Add the following lines:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
variable "do_token" {}
provider "digitalocean" {
token = var.do_token
}
In this code, you require the digitalocean
provider to be available and to pass in the do_token
variable to its instance. Save and close the file.
Initialize the configuration by running:
- terraform init
You’ll receive the following output:
OutputInitializing modules...
- dns in ../../modules/dns-records
- droplets in ../../modules/droplet-lb
Initializing the backend...
Initializing provider plugins...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Installing digitalocean/digitalocean v2.10.1...
- Installed digitalocean/digitalocean v2.10.1 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
The configuration for the prod
environment is similar. Navigate to its directory by running:
- cd ../prod
Create and open main.tf
for editing:
- nano main.tf
Add the following lines:
module "droplets" {
source = "../../modules/droplet-lb"
droplet_count = 5
group_name = "prod"
}
module "dns" {
source = "../../modules/dns-records"
domain_name = "your_prod_domain"
ipv4_address = module.droplets.lb_ip
}
The difference between this and your dev
code is that there will be five Droplets deployed. Furthermore, the domain name, which you should replace with your prod
domain name, will be different. Save and close the file when you’re done.
Then, copy over the provider configuration from dev
:
- cp ../dev/provider.tf .
Initialize this configuration as well:
- terraform init
The output of this command will be the same as the previous time you ran it.
You can try planning the configuration to see what resources Terraform would create by running:
- terraform plan -var "do_token=${DO_PAT}"
The output for prod
will be the following:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
Terraform will perform the following actions:
# module.dns.digitalocean_domain.domain will be created
+ resource "digitalocean_domain" "domain" {
+ id = (known after apply)
+ name = "your_prod_domain"
+ urn = (known after apply)
}
# module.dns.digitalocean_record.domain_A will be created
+ resource "digitalocean_record" "domain_A" {
+ domain = "your_prod_domain"
+ fqdn = (known after apply)
+ id = (known after apply)
+ name = "@"
+ ttl = (known after apply)
+ type = "A"
+ value = (known after apply)
}
# module.dns.digitalocean_record.domain_CNAME will be created
+ resource "digitalocean_record" "domain_CNAME" {
+ domain = "your_prod_domain"
+ fqdn = (known after apply)
+ id = (known after apply)
+ name = "www"
+ ttl = (known after apply)
+ type = "CNAME"
+ value = "@"
}
# module.droplets.digitalocean_droplet.droplets[0] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "prod-0"
...
}
# module.droplets.digitalocean_droplet.droplets[1] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "prod-1"
...
}
# module.droplets.digitalocean_droplet.droplets[2] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "prod-2"
...
}
# module.droplets.digitalocean_droplet.droplets[3] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "prod-3"
...
}
# module.droplets.digitalocean_droplet.droplets[4] will be created
+ resource "digitalocean_droplet" "droplets" {
...
+ name = "prod-4"
...
}
# module.droplets.digitalocean_loadbalancer.www-lb will be created
+ resource "digitalocean_loadbalancer" "www-lb" {
...
+ name = "lb-prod"
...
Plan: 9 to add, 0 to change, 0 to destroy.
...
This would deploy five Droplets with a Load Balancer. It would also create the prod
domain you specified with the two DNS records pointing to the Load Balancer. You can try planning the configuration for the dev
environment as well—you’ll note that two Droplets would be planned for deployment.
Note: You can apply this configuration for the dev
and prod
environments with the following command:
- terraform apply -var "do_token=${DO_PAT}"
To destroy it, run the following command and input yes
when prompted:
- terraform destroy -var "do_token=${DO_PAT}"
The following demonstrates how you have structured this project:
terraform-reusability/
├─ environments/
│ ├─ dev/
│ │ ├─ main.tf
│ │ ├─ provider.tf
│ ├─ prod/
│ │ ├─ main.tf
│ │ ├─ provider.tf
├─ modules/
│ ├─ dns-records/
│ │ ├─ outputs.tf
│ │ ├─ provider.tf
│ │ ├─ records.tf
│ │ ├─ variables.tf
│ ├─ droplet-lb/
│ │ ├─ droplets.tf
│ │ ├─ lb.tf
│ │ ├─ outputs.tf
│ │ ├─ provider.tf
│ │ ├─ variables.tf
├─ provider.tf
The addition is the environments
directory, which holds the code for the dev
and prod
environments.
The benefit of this approach is that further changes to modules automatically propagate to all areas of your project. Barring any possible customizations to module inputs, this approach is not repetitive and promotes reusability as much as possible, even across deployment environments. Overall, this reduces clutter and allows you to trace the modifications using a version-control system.
In the final two sections of this tutorial, you’ll review the depends_on
meta argument and the templatefile
function.
While planning actions, Terraform automatically tries to identify existing dependencies and builds them into its dependency graph. The main dependencies it can detect are clear references; for example, when an output value of a module is passed to a parameter on another resource. In this scenario, the module must first complete its deployment to provide the output value.
The dependencies that Terraform can’t detect are hidden—they have side effects and mutual references not inferable from the code. An example of this is when an object depends not on the existence, but on the behavior of another one, and does not access its attributes from code. To overcome this, you can use depends_on
to manually specify the dependencies in an explicit way. Since Terraform 0.13
, you can also use depends_on
on modules to force the listed resources to be fully deployed before deploying the module itself. It’s possible to use the depends_on
meta argument with every resource type.
depends_on accepts a list of references to other resources on which the specified resource depends. Its syntax looks like this:
resource "resource_type" "res" {
depends_on = [...] # List of resources
# Parameters...
}
Remember that you should only use depends_on
as a last-resort option. If used, it should be kept well documented, because the behavior that the resources depend on may not be immediately obvious.
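As a hypothetical illustration (these resources are not part of this project's code), a Droplet that must wait for a project resource to exist, without referencing any of its attributes, could be declared like this:
resource "digitalocean_project" "playground" {
  name = "playground"
}
resource "digitalocean_droplet" "web" {
  image      = "ubuntu-20-04-x64"
  name       = "web-1"
  region     = "fra1"
  size       = "s-1vcpu-1gb"
  depends_on = [digitalocean_project.playground]
}
Terraform would then create the project before the Droplet, even though nothing in the Droplet's arguments refers to it.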
In the previous step of this tutorial, you haven’t specified any explicit dependencies using depends_on
, because the resources you’ve created have no side effects not inferable from the code. Terraform is able to detect the references made from the code you’ve written, and will schedule the resources for deployment accordingly.
In Terraform, templating is substituting results of expressions in appropriate places, such as when setting attribute values on resources or constructing strings. You’ve used it in the previous steps and the tutorial prerequisites to dynamically generate Droplet names and other parameter values.
When substituting values in strings, the values are specified and surrounded by ${}
. Template substitution is often used in loops to facilitate customization of the created resources. It also allows for module customization by substituting inputs in resource attributes.
Terraform offers the templatefile
function, which accepts two arguments: the file from the disk to read and a map of variables paired with their values. The value it returns is the contents of the file rendered with the expression substituted—just as Terraform would normally do when planning or applying the project. Because functions are not part of the dependency graph, the file cannot be dynamically generated from another part of the project.
Imagine that the contents of the template file called droplets.tmpl
is as follows:
%{ for address in addresses ~}
${address}:80
%{ endfor ~}
Longer declarations must be surrounded with %{}
, as is the case with the for
and endfor
declarations, which signify the start and end of the for
loop respectively. The contents and type of the addresses
variable are not known until the function is called and actual values are provided, like so:
templatefile("${path.module}/droplets.tmpl", { addresses = ["192.168.0.1", "192.168.1.1"] })
This templatefile
call will return the following value:
Output192.168.0.1:80
192.168.1.1:80
This function has its use cases, but they are uncommon. For example, you could use it when part of the configuration must exist in a proprietary format, but is dependent on the rest of the values and must be generated dynamically. In the majority of cases, it’s better to specify all configuration parameters directly in Terraform code, where possible.
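When you do need it, the rendered template is usually stored in a local value or passed to a resource attribute. Here is a sketch, in which the backend Droplet resource is hypothetical:
locals {
  # Renders droplets.tmpl with the addresses of the (hypothetical)
  # backend Droplets, producing one "address:80" line per instance
  backend_list = templatefile("${path.module}/droplets.tmpl", {
    addresses = digitalocean_droplet.backend[*].ipv4_address
  })
}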
In this article, you’ve maximized code reuse in an example Terraform project. The main way is to package often-used features and configurations as a customizable module and use it whenever needed. By doing so, you do not duplicate the underlying code (which can be error prone) and enable faster turnaround times, since modifying the module is almost all you need to do to introduce changes.
You’re not limited to your own modules. As you’ve seen, Terraform Registry provides third-party modules and providers that you can incorporate in your project.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
Terraform outputs are used to extract information about the infrastructure resources from the project state. Using other features of the Hashicorp Configuration Language (HCL), which Terraform uses, resource information can be queried and transformed into more complex data structures, such as lists and maps. Outputs are useful for providing information to external software, which can operate on the created infrastructure resources.
In this tutorial, you’ll learn about Terraform output syntax and its parameters by creating a simple infrastructure that deploys Droplets. You’ll also parse the outputs programmatically by converting them to JSON.
For this tutorial, name the project folder terraform-outputs
, instead of loadbalance
. During Step 2, do not include the pvt_key
variable and the SSH key resource.
Note: This tutorial has specifically been tested with Terraform 1.0.2
.
In this section, you’ll declare a Droplet, deploy it to the cloud, and learn about outputs by defining one that will show the Droplet’s IP address.
From the terraform-outputs
directory you created as a prerequisite, create and open the droplets.tf
file for editing:
- nano droplets.tf
Add the following Droplet resource and output definition:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "test-droplet"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
}
You first declare a Droplet resource, called web
. Its actual name in the cloud will be test-droplet
, in the region fra1
, running Ubuntu 20.04.
Then, you declare an output called droplet_ip_address
. In Terraform, outputs are used to export and show internal and computed values and information about the resources. Here, you set the value
parameter, which accepts the data to output, to the IP address of the declared Droplet. At declaration time the address is unknown, but it will become available once the Droplet is deployed. Outputs are shown and accessible after each deployment.
Save and close the file, then deploy the project by running the following command:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes
to apply when prompted. The end of the output will be similar to this:
Output...
digitalocean_droplet.web: Creating...
...
digitalocean_droplet.web: Creation complete after 32s [id=207631771]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_address = ip_address
The highlighted IP address belongs to your newly deployed Droplet. Applying the project deploys the resources to the cloud and shows the outputs at the end, when all resource attributes are available. Without the droplet_ip_address
output, Terraform would show no further information about the Droplet, except that it’s deployed.
Outputs can also be shown using the output
command:
- terraform output
The output will list all outputs
in the project:
Outputdroplet_ip_address = ip_address
You can also query a specific output by name by specifying it as an argument:
- terraform output output_name
For droplet_ip_address
, the output will consist of the IP address only:
Outputip_address
Except for specifying the mandatory value
, outputs have a few optional parameters:
description
: embeds short documentation detailing what the output shows.depends_on
: a meta parameter available at each resource that allows you to explicitly specify resources the output depends on that Terraform is not able to automatically deduce during planning.sensitive
: accepts a boolean value, which prevents the content of the output from being shown after deploying if set to true
.
The sensitive
parameter is useful when the logs of the Terraform deployment will be publicly available, but the output contents should be kept hidden. You’ll now add it to your Droplet resource definition.
Open droplets.tf
for editing and add the highlighted line:
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "test-droplet"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_address" {
value = digitalocean_droplet.web.ipv4_address
sensitive = true
}
Save and close the file when you’re done. Deploy the project again by running:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes
when prompted. You’ll see that the output is redacted:
Output...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_address = <sensitive>
Even if it’s marked as sensitive
, the output and its contents will still be available through other channels, such as viewing the Terraform state or querying the outputs directly.
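The description parameter, by contrast, is purely informational. A documented version of the output could look like this sketch, in which the description text is illustrative:
output "droplet_ip_address" {
  value       = digitalocean_droplet.web.ipv4_address
  description = "The public IPv4 address of the web Droplet"
  sensitive   = true
}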
In the next step, you’ll create a different Droplet and output structure, so destroy the currently deployed ones by running:
- terraform destroy -var "do_token=${DO_PAT}"
The output at the very end will be:
Output...
Destroy complete! Resources: 1 destroyed.
You’ve declared and deployed a Droplet and created an output that shows its IP address. You’ll now learn about using outputs to show more complex structures such as lists and maps.
In this section, you’ll deploy multiple Droplets from the same definition using the count
keyword, and output their IP addresses in various formats.
for
loopYou’ll need to modify the Droplet resource definition, so open it for editing:
- nano droplets.tf
Modify it to look like this:
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-20-04-x64"
name = "test-droplet-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
}
You’ve specified that three Droplets should be created using the count
key and added the current index to the Droplet name, so that you'll be able to discern between them later. You've also removed the existing droplet_ip_address output, which references a single Droplet and would no longer be valid. When you're done, save and close the file.
Apply the code by running:
- terraform apply -var "do_token=${DO_PAT}"
Terraform will plan the creation of three numbered Droplets, called test-droplet-0
, test-droplet-1
, and test-droplet-2
. Enter yes
when prompted to finish the process. You’ll see the following output in the end:
Output...
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
This means that all three Droplets are successfully deployed and that all information about them is stored in the project state.
The easiest way to access their resource attributes is to use outputs, but creating one for each Droplet is not scalable. The solution is to use the for
loop to traverse through the list of Droplets and gather their attributes, or to alternatively use splat expressions (which you’ll learn about later in this step).
You’ll first define an output that will output the IP addresses of the three Droplets, paired with their names. Open droplets.tf
for editing:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-20-04-x64"
name = "test-droplet-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_addresses" {
value = {
for droplet in digitalocean_droplet.web:
droplet.name => droplet.ipv4_address
}
}
The output value of droplet_ip_addresses
is constructed using a for
loop. Because it’s surrounded by braces, the resulting type will be a map. The loop traverses the list of Droplets, and for each instance, pairs its name with its IP address and appends it to the resulting map.
Save and close the file, then apply the project again:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes
when prompted and you’ll receive the output contents at the end:
OutputApply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_addresses = {
"test-droplet-0" = "ip_address"
"test-droplet-1" = "ip_address"
"test-droplet-2" = "ip_address"
}
The droplet_ip_addresses
output details the IP addresses of the three deployed droplets.
Using the Terraform output
command, you can get the contents of the output as JSON using its command argument:
- terraform output -json droplet_ip_addresses
The result will be similar to the following:
Output{"test-droplet-0":"ip_address","test-droplet-1":"ip_address","test-droplet-2":"ip_address"}
JSON parsing is widely used and supported in many programming languages. This way, you can programmatically parse the information about the deployed Droplet resources.
Splat expressions offer a compact way of iterating over all elements of a list, and collecting contents of an attribute from each of them, resulting in a list. A splat expression that would extract the IP addresses of the three deployed droplets would have the following syntax:
digitalocean_droplet.web[*].ipv4_address
The [*]
symbol traverses the list on its left and, for each of its elements, takes the contents of the attribute specified on its right. If the reference on the left is not itself a list, it is converted to one in which it is the sole element.
You can open droplets.tf
for editing and modify the following lines to implement this:
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-20-04-x64"
name = "test-droplet-${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
}
output "droplet_ip_addresses" {
value = digitalocean_droplet.web[*].ipv4_address
}
After saving the file, apply the project by running the following command:
- terraform apply -var "do_token=${DO_PAT}"
You’ll receive output that is now a list, and which contains only the IP addresses of the Droplets:
Output...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
droplet_ip_addresses = [
"ip_address",
"ip_address",
"ip_address",
]
To receive the output as JSON, run the following command:
- terraform output -json droplet_ip_addresses
The output will be a single array:
Output["ip_address","ip_address","ip_address"]
You’ve used outputs together with splat expressions and for
loops to export the IP addresses of the deployed Droplets. You’ve also received the output contents as JSON, and you’ll now use jq
—a tool for dynamically filtering JSON according to given expressions—to parse them.
jq
In this step, you’ll install and learn the basics of jq
, a tool for manipulating JSON documents. You’ll use it to parse the outputs of your Terraform project.
If you’re on Ubuntu, run the following command to install jq
:
- sudo snap install jq
On macOS, you can use Homebrew to install it:
- brew install jq
jq
applies the provided processing expression on given input, which can be piped in. The easiest task in jq
is to pretty print the input:
- terraform output -json droplet_ip_addresses | jq '.'
Passing in the identity operator (.
) means that the whole JSON document parsed from the input should be output without modification:
Output[
"first_ip_address",
"second_ip_address",
"third_ip_address"
]
You can request just the second IP address using the array bracket notation, counting from zero:
- terraform output -json droplet_ip_addresses | jq '.[1]'
The output will be:
Output"second_ip_address"
To make the result of the processing an array, wrap the expression in brackets:
- terraform output -json droplet_ip_addresses | jq '[.[1]]'
You’ll get a pretty printed JSON array:
Output[
"second_ip_address"
]
You can retrieve parts of arrays instead of single elements by specifying a range of indexes inside the brackets:
- terraform output -json droplet_ip_addresses | jq '.[0:2]'
The output will be:
Output[
"first_ip_address",
"second_ip_address"
]
The range 0:2
returns the first two elements—the upper part of the range (2
) is not inclusive, so only elements at positions 0
and 1
are fetched.
You can now destroy the deployed resources by running:
- terraform destroy -var "do_token=${DO_PAT}"
In this step, you have installed jq
and used it to parse and manipulate the output of your Terraform project, which deploys three Droplets.
You have learned about Terraform outputs, using them to show details about the deployed resources and to export data structures for later external processing. You’ve also used outputs to show attributes of a single resource, as well as for showing constructed maps and lists containing resource attributes.
For more detailed information about the features of jq
, visit the official docs.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
Hashicorp Configuration Language (HCL), which Terraform uses, provides many useful structures and capabilities that are present in other programming languages. Using loops in your infrastructure code can greatly reduce code duplication and increase readability, allowing for easier future refactoring and greater flexibility. HCL also provides a few common data structures, such as lists and maps (also called arrays and dictionaries respectively in other languages), as well as conditionals for execution path branching.
Unique to Terraform is the ability to manually specify the resources one depends on. While the execution graph it builds when running your code already contains the detected links (which are correct in most scenarios), you may find yourself in need of forcing a dependency relationship that Terraform was unable to detect.
In this article, we’ll review the data structures HCL provides, its looping features for resources (the count
key, for_each
, and for
), conditionals for handling known and unknown values, and dependency relationships between resources.
For this tutorial, name the project folder terraform-flexibility
, instead of loadbalance
. During Step 2, you do not need to include the pvt_key
variable and the SSH key resource when you configure the provider.
Note: This tutorial has specifically been tested with Terraform 1.0.2
.
Before you learn more about loops and other features of HCL that make your code more flexible, we’ll first go over the available data types and their uses.
The Hashicorp Configuration Language supports primitive and complex data types. Primitive data types are strings, numbers, and boolean values, which are the basic types that cannot be derived from others. Complex types, on the other hand, group multiple values into a single one. The two types of complex values are structural and collection types.
Structural types allow values of different types to be grouped together. The main example is the resource definitions you use to specify what your infrastructure will look like. Compared to the structural types, collection types also group values, but only ones of the same type. The three collection types available in HCL that we are interested in are lists, maps, and sets.
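For example, a structural object type groups values of different types under named attributes. Here is a sketch with illustrative names:
variable "droplet_config" {
  type = object({
    name    = string
    size    = string
    backups = bool
  })
  default = {
    name    = "web"
    size    = "s-1vcpu-1gb"
    backups = false
  }
}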
Lists are similar to arrays in other programming languages. They contain a known number of elements of the same type, which can be accessed using the array notation ([]
) by their whole-number index, starting from 0. Here is an example of a list variable declaration holding names of Droplets you’ll deploy in the next steps:
variable "droplet_names" {
type = list(string)
default = ["first", "second", "third"]
}
For the type
, you specified that it’s a list whose element type is string, and then provided its default
value. In HCL, values enumerated in brackets signify a list.
Maps are collections of key-value pairs, where each value is accessed using its key of type string
. There are two ways to specify maps inside curly brackets: by using colons (:
) or equal signs (=
) for specifying values. In both situations, the value must be enclosed with quotes. When using colons, the key must also be enclosed.
The following map definition containing Droplet names for different environments is written using the equal sign:
variable "droplet_env_names" {
type = map(string)
default = {
development = "dev-droplet"
staging = "staging-droplet"
production = "prod-droplet"
}
}
If the key starts with a number, you must use the colon syntax:
variable "droplet_env_names" {
type = map(string)
default = {
"1-development": "dev-droplet"
"2-staging": "staging-droplet"
"3-production": "prod-droplet"
}
}
Sets do not support element ordering, meaning that traversing a set is not guaranteed to yield the same order each time, and their elements cannot be accessed in a targeted way. They contain unique elements; specifying the same element multiple times results in it being coalesced into a single instance in the set.
Declaring a set is similar to declaring a list, the only difference being the type of the variable:
variable "droplet_names" {
type = set(string)
default = ["first", "second", "third", "fourth"]
}
Now that you’ve learned about the types of data structures HCL offers and reviewed the syntax of lists, maps, and sets, which we’ll use throughout this tutorial, you’ll move on to trying some flexible ways of deploying multiple instances of the same resource in Terraform.
count
KeyIn this section, you’ll create multiple instances of the same resource using the count
key. The count
key is a parameter available on all resources that specifies how many instances to create.
You’ll see how it works by writing a Droplet resource, which you’ll store in a file named droplets.tf
in the project directory you created as part of the prerequisites. Create and open it for editing by running:
- nano droplets.tf
Add the following lines:
resource "digitalocean_droplet" "test_droplet" {
count = 3
image = "ubuntu-20-04-x64"
name = "web"
region = "fra1"
size = "s-1vcpu-1gb"
}
This code defines a Droplet resource called test_droplet
, running Ubuntu 20.04 with 1GB RAM and a CPU core.
Note that the value of count
is set to 3
, which means that Terraform will attempt to create three instances of the same resource. When you are done, save and close the file.
You can plan the project to see what actions Terraform would take by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will be similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet[0] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web"
...
}
# digitalocean_droplet.test_droplet[1] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web"
...
}
# digitalocean_droplet.test_droplet[2] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web"
...
}
Plan: 3 to add, 0 to change, 0 to destroy.
...
The output details that Terraform would create three instances of test_droplet
, all with the same name web
. While possible, it is not preferred, so let’s modify the Droplet definition to make the name of each instance unique. Open droplets.tf
for editing:
- nano droplets.tf
Modify the highlighted line:
resource "digitalocean_droplet" "test_droplet" {
count = 3
image = "ubuntu-20-04-x64"
name = "web.${count.index}"
region = "fra1"
size = "s-1vcpu-1gb"
}
Save and close the file.
The count
object provides the index
parameter, which contains the index of the current iteration, starting from 0. The current index is substituted into the name of the Droplet using string interpolation, which allows you to dynamically build a string by substituting variables. You can plan the project again to see the changes:
- terraform plan -var "do_token=${DO_PAT}"
The output will be similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet[0] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web.0"
...
}
# digitalocean_droplet.test_droplet[1] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web.1"
...
}
# digitalocean_droplet.test_droplet[2] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
name = "web.2"
...
}
Plan: 3 to add, 0 to change, 0 to destroy.
...
This time, the three instances of test_droplet
will have their index in their names, making them easier to track.
You now know how to create multiple instances of a resource using the count
key, as well as fetch and use the index of an instance during provisioning. Next, you’ll learn how to fetch the Droplet’s name from a list.
In situations when multiple instances of the same resource need to have custom names, you can dynamically retrieve them from a list variable you define. During the rest of the tutorial, you’ll see several ways of automating Droplet deployment from a list of names, promoting flexibility and ease of use.
You’ll first need to define a list containing the Droplet names. Create a file called variables.tf
and open it for editing:
- nano variables.tf
Add the following lines:
variable "droplet_names" {
type = list(string)
default = ["first", "second", "third", "fourth"]
}
Save and close the file. This code defines a list called droplet_names
, containing the strings first
, second
, third
, and fourth
.
Open droplets.tf
for editing:
- nano droplets.tf
Modify the highlighted lines:
resource "digitalocean_droplet" "test_droplet" {
count = length(var.droplet_names)
image = "ubuntu-20-04-x64"
name = var.droplet_names[count.index]
region = "fra1"
size = "s-1vcpu-1gb"
}
To improve flexibility, instead of manually specifying a constant number of elements, you pass in the length of the droplet_names
list to the count
parameter, which will always return the number of elements in the list. For the name, you fetch the element of the list positioned at count.index
, using the array bracket notation. Save and close the file when you’re done.
Try planning the project again. You’ll receive output similar to this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet[0] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "first"
...
}
# digitalocean_droplet.test_droplet[1] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "second"
...
}
# digitalocean_droplet.test_droplet[2] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "third"
...
}
# digitalocean_droplet.test_droplet[3] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "fourth"
...
Plan: 4 to add, 0 to change, 0 to destroy.
...
As a result of these modifications, four Droplets would be deployed, successively named after the elements of the droplet_names
list.
You’ve learned about count
, its features and syntax, and have used it together with a list to modify the resource instances. You’ll now see its disadvantages, and how to overcome them.
count
Now that you know how count is used, let’s examine its disadvantages when modifying the list it’s used with.
Let’s try deploying the Droplets to the cloud:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes
when prompted. The end of your output will be similar to this:
OutputApply complete! Resources: 4 added, 0 changed, 0 destroyed.
Now let’s create one more Droplet instance by enlarging the droplet_names
list. Open variables.tf
for editing:
- nano variables.tf
Add a new element to the beginning of the list:
variable "droplet_names" {
type = list(string)
default = ["zero", "first", "second", "third", "fourth"]
}
When you’re done, save and close the file.
Plan the project:
- terraform plan -var "do_token=${DO_PAT}"
You’ll receive output like this:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
~ update in-place
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet[0] will be updated in-place
~ resource "digitalocean_droplet" "test_droplet" {
...
~ name = "first" -> "zero"
...
}
# digitalocean_droplet.test_droplet[1] will be updated in-place
~ resource "digitalocean_droplet" "test_droplet" {
...
~ name = "second" -> "first"
...
}
# digitalocean_droplet.test_droplet[2] will be updated in-place
~ resource "digitalocean_droplet" "test_droplet" {
...
~ name = "third" -> "second"
...
}
# digitalocean_droplet.test_droplet[3] will be updated in-place
~ resource "digitalocean_droplet" "test_droplet" {
...
~ name = "fourth" -> "third"
...
}
# digitalocean_droplet.test_droplet[4] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "fourth"
...
}
Plan: 1 to add, 4 to change, 0 to destroy.
...
The output shows that Terraform would rename the first four Droplets and create a fifth one called fourth
, because it considers the instances as an ordered list and identifies the elements (Droplets) by their index number in the list. This is how Terraform initially considers the four Droplets:
| Index Number | 0     | 1      | 2     | 3      |
| ------------ | ----- | ------ | ----- | ------ |
| Droplet Name | first | second | third | fourth |
When the new Droplet zero
is added to the beginning, its internal list representation looks like this:
| Index Number | 0    | 1     | 2      | 3     | 4      |
| ------------ | ---- | ----- | ------ | ----- | ------ |
| Droplet Name | zero | first | second | third | fourth |
The four initial Droplets are now shifted one place to the right. Terraform then compares the two states represented in tables: at position 0
, the Droplet was called first, and because it’s different in the second table, it plans an update action. This continues until position 4
, which has no comparable element in the first table, so a Droplet provisioning action is planned instead.
This means that adding a new element to the list anywhere but to the very end would result in resources being modified when they do not need to be. Similar update actions would be planned if an element of the droplet_names
list was removed.
Incomplete resource tracking is the main downfall of using count
for deploying a dynamic number of differing instances of the same resource. For a constant number of constant instances, count
is a simple solution that works well. In situations like this, though, when some attributes are being pulled in from a variable, the for_each
loop, which you’ll learn about later in this tutorial, is a much better choice.
self
)Another downside of count
is that referencing an arbitrary instance of a resource by its index is not possible in some cases.
The main example is destroy-time provisioners, which run when the resource is planned to be destroyed. The reason is that the requested instance may not exist (it’s already destroyed) or would create a mutual dependency cycle. In such situations, instead of referring to the object through the list of instances, you can access only the current resource through the self
keyword.
To demonstrate its usage, you’ll now add a destroy-time local provisioner to the test_droplet
definition, which will show a message when run. Open droplets.tf
for editing:
- nano droplets.tf
Add the following highlighted lines:
resource "digitalocean_droplet" "test_droplet" {
count = length(var.droplet_names)
image = "ubuntu-20-04-x64"
name = var.droplet_names[count.index]
region = "fra1"
size = "s-1vcpu-1gb"
provisioner "local-exec" {
when = destroy
command = "echo 'Droplet ${self.name} is being destroyed!'"
}
}
Save and close the file.
The local-exec
provisioner runs a command on the local machine Terraform is running on. Because the when
parameter is set to destroy
, it will run only when the resource is going to be destroyed. The command it runs echoes a string to stdout
, which substitutes the name of the current resource using self.name
.
Because you’ll be creating the Droplets in a different way in the next section, destroy the currently deployed ones by running the following command:
- terraform destroy -var "do_token=${DO_PAT}"
Enter yes
when prompted. You’ll see the local-exec
provisioner being run four times:
Output...
digitalocean_droplet.test_droplet[0] (local-exec): Executing: ["/bin/sh" "-c" "echo 'Droplet first is being destroyed!'"]
digitalocean_droplet.test_droplet[1] (local-exec): Executing: ["/bin/sh" "-c" "echo 'Droplet second is being destroyed!'"]
digitalocean_droplet.test_droplet[1] (local-exec): Droplet second is being destroyed!
digitalocean_droplet.test_droplet[2] (local-exec): Executing: ["/bin/sh" "-c" "echo 'Droplet third is being destroyed!'"]
digitalocean_droplet.test_droplet[2] (local-exec): Droplet third is being destroyed!
digitalocean_droplet.test_droplet[3] (local-exec): Executing: ["/bin/sh" "-c" "echo 'Droplet fourth is being destroyed!'"]
digitalocean_droplet.test_droplet[3] (local-exec): Droplet fourth is being destroyed!
digitalocean_droplet.test_droplet[0] (local-exec): Droplet first is being destroyed!
...
In this step, you learned the disadvantages of count
. You’ll now learn about the for_each
loop construct, which overcomes them and works on a wider array of variable types.
for_each
In this section, you’ll consider the for_each
loop, its syntax, and how it helps flexibility when defining resources with multiple instances.
for_each
is a parameter available on each resource, but unlike count
, which requires a number of instances to create, for_each
accepts a map or a set. Each element of the provided collection is traversed once and an instance is created for it. for_each
makes the key and value available under the each
keyword as attributes (the pair’s key and value as each.key
and each.value
, respectively). When a set is provided, the key and value will be the same.
Because it provides the current element in the each
object, you won’t have to manually access the desired element as you did with lists. In the case of sets, that’s not even possible, as they have no observable internal ordering. Lists can also be passed in, but they must first be converted into a set using the toset
function.
The main advantage of using for_each
, aside from being able to enumerate all three collection data types, is that only the affected elements will be modified, created, or deleted. If you change the order of the elements in the input, no actions will be planned, and if you add, remove, or modify an element from the input, appropriate actions will be planned only for that element.
Let’s convert the Droplet resource from count
to for_each
and see how it works in practice. Open droplets.tf
for editing by running:
- nano droplets.tf
Modify the highlighted lines:
resource "digitalocean_droplet" "test_droplet" {
for_each = toset(var.droplet_names)
image = "ubuntu-20-04-x64"
name = each.value
region = "fra1"
size = "s-1vcpu-1gb"
}
You can remove the local-exec
provisioner. When you’re done, save and close the file.
The first line replaces count
and invokes for_each
, passing in the droplet_names
list in the form of a set using the toset
function, which automatically converts the given input. For the Droplet name, you specify each.value
, which holds the value of the current element from the set of Droplet names.
Plan the project by running:
- terraform plan -var "do_token=${DO_PAT}"
The output will detail steps Terraform would take:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet["first"] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "first"
...
}
# digitalocean_droplet.test_droplet["fourth"] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "fourth"
...
}
# digitalocean_droplet.test_droplet["second"] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "second"
...
}
# digitalocean_droplet.test_droplet["third"] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "third"
...
}
# digitalocean_droplet.test_droplet["zero"] will be created
+ resource "digitalocean_droplet" "test_droplet" {
...
+ name = "zero"
...
}
Plan: 5 to add, 0 to change, 0 to destroy.
...
In contrast to using count
, Terraform now considers each instance individually, and not as elements of an ordered list. Each instance is linked to an element of the given set, as signified by the shown string element in the brackets next to each resource that will be created.
Apply the plan to the cloud by running:
- terraform apply -var "do_token=${DO_PAT}"
Enter yes
when prompted. When it finishes, you’ll remove one element from the droplet_names
list to demonstrate that other instances won’t be affected. Open variables.tf
for editing:
- nano variables.tf
Modify the list to look like this:
variable "droplet_names" {
type = list(string)
default = ["first", "second", "third", "fourth"]
}
Save and close the file.
Plan the project again, and you’ll receive the following output:
Output...
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
- destroy
Terraform will perform the following actions:
# digitalocean_droplet.test_droplet["zero"] will be destroyed
- resource "digitalocean_droplet" "test_droplet" {
...
- name = "zero" -> null
...
}
Plan: 0 to add, 0 to change, 1 to destroy.
...
This time, Terraform would destroy only the removed instance (zero
), and would not touch any of the other instances, which is the correct behavior.
In this step, you’ve learned about for_each
, how to use it, and its advantages over count
. Next, you’ll learn about the for
loop, its syntax and usage, and when it can be used to automate certain tasks.
for
The for
loop works on collections, and creates a new collection by applying a transformation to each element of the input. The exact type of the output will depend on whether the loop is surrounded by brackets ([]
) or braces ({}
), which give a list or a map, respectively. As such, it is suitable for querying resources and forming structured outputs for later processing.
The general syntax of the for
loop is:
for element in collection:
transform(element)
if condition
Similar to other programming languages, you first name the traversal variable (element
) and specify the collection
to enumerate. The body of the loop is the transformational step, and the optional if
clause can be used for filtering the input collection.
You’ll now work through a few examples using outputs. You’ll store them in a file named outputs.tf
. Create it for editing by running the following command:
- nano outputs.tf
Add the following lines to output pairs of deployed Droplet names and their IP addresses:
output "ip_addresses" {
value = {
for instance in digitalocean_droplet.test_droplet:
instance.name => instance.ipv4_address
}
}
This code specifies an output called ip_addresses
, and uses a for
loop that iterates over the instances of the test_droplet
resource you’ve been customizing in the previous steps. Because the loop is surrounded by curly braces, its output will be a map. The transformational step for maps is similar to lambda functions in other programming languages; here it creates a key-value pair by combining the instance name as the key with its IPv4 address as the value.
Save and close the file, then refresh the Terraform state to account for the new output by running:
- terraform refresh -var "do_token=${DO_PAT}"
The Terraform refresh
command updates the local state with the actual infrastructure state in the cloud.
Then, check the contents of the outputs:
- terraform output
Outputip_addresses = {
"first" = "ip_address"
"fourth" = "ip_address"
"second" = "ip_address"
"third" = "ip_address"
}
Terraform has shown the contents of the ip_addresses
output, which is a map constructed by the for
loop. (The order of the entries may be different for you.) The loop will work seamlessly for every number of entries—meaning that you can add a new element to the droplet_names
list and the new Droplet, which would be created without any further manual input, would also show up in this output automatically.
By surrounding the for
loop in square brackets, you can make the output a list. For example, you could output only Droplet IP addresses, which is useful for external software that may be parsing the data. The code would look like this:
output "ip_addresses" {
value = [
for instance in digitalocean_droplet.test_droplet:
instance.ipv4_address
]
}
Here, the transformational step selects the IP address attribute. It would give the following output:
Outputip_addresses = [
"ip_address",
"ip_address",
"ip_address",
"ip_address",
]
As was noted before, you can also filter the input collection using the if
clause. Here is how you would write the loop to filter by the fra1
region:
output "ip_addresses" {
value = [
for instance in digitalocean_droplet.test_droplet:
instance.ipv4_address
if instance.region == "fra1"
]
}
In HCL, the ==
operator checks the equality of the values of the two sides—here it checks if instance.region
is equal to fra1
. If it is, the check passes and the instance
is transformed and added to the output, otherwise it is skipped. The output of this code would be the same as the prior example, because all Droplet instances are in the fra1
region, according to the test_droplet
resource definition. The if
conditional is also useful when you want to filter the input collection for other values in your project, like the Droplet size or distribution.
Because you’ll be creating resources differently in the next section, destroy the currently deployed ones by running the following command:
- terraform destroy -var "do_token=${DO_PAT}"
Enter yes
when prompted to finish the process.
We’ve gone over the for
loop, its syntax, and examples of usage in outputs. You’ll now learn about conditionals and how they can be used together with count
.
In one of the previous sections, you’ve seen the count
key and how it works. You’ll now learn about ternary conditional operators, which you can use elsewhere in your Terraform code, and how they can be used with count
.
The syntax of the ternary operator is:
condition ? value_if_true : value_if_false
condition
is an expression that computes to a boolean (true or false). If the condition is true, then the expression evaluates to value_if_true
. On the other hand, if the condition is false, the result will be value_if_false
.
The main use of ternary operators is to enable or disable single resource creation according to the contents of a variable. This can be achieved by passing in the result of the comparison (either 1
or 0
) to the count
key on the desired resource.
In the event that you’re using the ternary operator to fetch a single element from a list or a set, you can use the one
function. If the given collection is empty, it returns null
. Otherwise, it returns the single element in the collection, or throws an error if there are multiple.
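For example, once a resource uses count to conditionally create a single instance, one pairs well with a splat expression to output its address. A sketch:
output "droplet_ip_address" {
  # null when count is 0, otherwise the single Droplet's address
  value = one(digitalocean_droplet.test_droplet[*].ipv4_address)
}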
Let’s add a variable called create_droplet
, which will control if a Droplet will be created. First, open variables.tf
for editing:
- nano variables.tf
Add the highlighted lines:
variable "droplet_names" {
type = list(string)
default = ["first", "second", "third", "fourth"]
}
variable "create_droplet" {
type = bool
default = true
}
This code defines the create_droplet
variable of type bool
. Save and close the file.
Then, to modify the Droplet declaration, open droplets.tf
for editing by running:
- nano droplets.tf
Modify your file like the following:
resource "digitalocean_droplet" "test_droplet" {
count = var.create_droplet ? 1 : 0
image = "ubuntu-20-04-x64"
name = "test_droplet"
region = "fra1"
size = "s-1vcpu-1gb"
}
For count
, you use a ternary operator to return either 1
if the create_droplet
variable is true, or 0
if false, which will result in no Droplets being provisioned. Save and close the file when you’re done.
Create the project execution plan with the variable set to false by running:
- terraform plan -var "do_token=${DO_PAT}" -var "create_droplet=false"
You’ll receive the following output:
OutputChanges to Outputs:
+ ip_addresses = {}
You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.
Because create_droplet
was passed in the value of false
, the count
of instances is 0
, and no Droplets will be created, so there will be no IP addresses to output.
You’ve reviewed how to use the ternary conditional operator together with the count
key to enable a higher level of flexibility in choosing whether to deploy desired resources. Next you’ll learn about explicitly setting resource dependencies for your resources.
While creating the execution plan for your project, Terraform detects dependency chains between resources and implicitly orders them so that they will be built in the appropriate order. In the majority of cases, it is able to detect relationships by scanning all expressions in resources and building a graph.
However, when one resource, in order to be provisioned, requires access control settings to already be deployed at the cloud provider, there is no clear sign to Terraform that they are related. In turn, Terraform will not know they are dependent on each other behaviorally. In such cases, the dependency must be manually specified using the depends_on
argument.
The depends_on
key is available on each resource and is used to specify hidden dependency links between specific resources. Hidden dependency relationships form when a resource depends on another one’s behavior without using any of its data in its declaration, since a data reference is what would otherwise prompt Terraform to connect them.
Here is an example of how depends_on
is specified in code:
resource "digitalocean_droplet" "droplet" {
image = "ubuntu-20-04-x64"
name = "web"
region = "fra1"
size = "s-1vcpu-1gb"
depends_on = [
# Resources...
]
}
It accepts a list of references to other resources, and it does not accept arbitrary expressions.
depends_on
should be used sparingly, and only when all other options are exhausted. Its use signifies that what you are trying to declare is stepping outside the boundaries of Terraform’s automated dependency detection system; it may signify that the resource is explicitly depending on more resources than it needs to.
You’ve now learned about explicitly setting additional dependencies for a resource using the depends_on
key, and when it should be used.
In this article, we’ve gone over the features of HCL that improve the flexibility and scalability of your code, such as count
for specifying the number of resource instances to deploy and for_each
as an advanced way of looping over collection data types and customizing instances. When used correctly, they greatly reduce code duplication and the operational overhead of managing the deployed infrastructure.
You’ve also learned about conditionals and ternary operators, and how they can be utilized to control whether a resource will get deployed. While Terraform’s automated dependency analysis system is quite capable, there may be cases where you need to manually specify resource dependencies using the depends_on
key.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
1x DigitalOcean Managed Database (1vCPU-2GB) + 1 failover node
2x Nginx web servers (1vCPU-2GB)
1x GlusterFS with 2 nodes (1vCPU-1GB) with replica
1x DigitalOcean Load Balancer
1x Droplet (1vCPU-1GB) running Redis to handle PHP sessions
1x Elasticsearch server (an addition to improve searches)
DigitalOcean Firewall (I’m not sure how to configure it for this infrastructure, though.)
Things I might use: Spaces Object Storage, Spaces CDN
We do not expect to receive a lot of traffic at the beginning, although this could grow over time.
I would like to know if I am overdoing it with this initial infrastructure, or if there is a simpler, but equally scalable starting point.
Thanks
Structuring Terraform projects appropriately according to their use cases and perceived complexity is essential to ensure their maintainability and extensibility in day-to-day operations. A systematic approach to properly organizing code files is necessary to ensure that the project remains scalable during deployment and usable to you and your team.
In this tutorial, you’ll learn about structuring Terraform projects according to their general purpose and complexity. Then, you’ll create a project with a simple structure using the more common features of Terraform: variables, locals, data sources, and provisioners. In the end, your project will deploy an Ubuntu 20.04 server (Droplet) on DigitalOcean, install an Apache web server, and point your domain to the web server.
A DigitalOcean Personal Access Token, which you can create via the DigitalOcean control panel. You can find instructions in the DigitalOcean product documents, How to Create a Personal Access Token.
A password-less SSH key added to your DigitalOcean account, which you can create by following How To Use SSH Keys with DigitalOcean Droplets.
Terraform installed on your local machine. For instructions according to your operating system, see Step 1 of the How To Use Terraform with DigitalOcean tutorial.
Python 3 installed on your local machine. You can complete Step 1 of How To Install and Set Up a Local Programming Environment for Python 3 for your OS.
A fully registered domain name added to your DigitalOcean account. For instructions on how to do that, visit the official docs.
Note: This tutorial has specifically been tested with Terraform 1.0.2
.
In this section, you’ll learn what Terraform considers a project, how you can structure the infrastructure code, and when to choose which approach. You’ll also learn about Terraform workspaces, what they do, and how Terraform is storing state.
A resource is an entity of a cloud service (such as a DigitalOcean Droplet) declared in Terraform code that is created according to specified and inferred properties. Multiple resources form infrastructure with their mutual connections.
Terraform uses a specialized programming language for defining infrastructure, called Hashicorp Configuration Language (HCL). HCL code is typically stored in files ending with the extension tf
. A Terraform project is any directory that contains tf
files and which has been initialized using the init
command, which sets up Terraform caches and default local state.
Terraform state is the mechanism via which it keeps track of resources that are actually deployed in the cloud. State is stored in backends (locally on disk or remotely on a file storage cloud service or specialized state management software) for optimal redundancy and reliability. You can read more about different backends in the Terraform documentation.
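As an illustration, a remote backend is declared inside the top-level terraform block. The following sketch targets an S3-compatible service such as DigitalOcean Spaces; the endpoint, bucket, and key are placeholders, and the access credentials are expected in environment variables:
terraform {
  backend "s3" {
    endpoint = "https://fra1.digitaloceanspaces.com" # Spaces is S3-compatible
    bucket   = "example-terraform-state"             # placeholder bucket name
    key      = "project/terraform.tfstate"
    region   = "us-east-1" # required by the backend, ignored by Spaces

    skip_credentials_validation = true
    skip_metadata_api_check     = true
  }
}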
Project workspaces allow you to have multiple states in the same backend, tied to the same configuration. This allows you to deploy multiple distinct instances of the same infrastructure. Each project starts with a workspace named default
—this will be used if you do not explicitly create or switch to another one.
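For example, you can create and switch to a workspace named staging by running:
- terraform workspace new staging
To return to the default workspace later, you would run:
- terraform workspace select default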
Modules in Terraform (akin to libraries in other programming languages) are parametrized code containers enclosing multiple resource declarations. They allow you to abstract away a common part of your infrastructure and reuse it later with different inputs.
A Terraform project can also include external code files for use with dynamic data inputs, which can parse the JSON output of a CLI command and offer it for use in resource declarations. In this tutorial, you’ll do this with a Python script.
Now that you know what a Terraform project consists of, let’s review two general approaches to Terraform project structuring.
A simple structure is suitable for small and testing projects, with a few resources of varying types and variables. It has a few configuration files, usually one per resource type (or more helper ones together with a main), and no custom modules, because most of the resources are unique and there aren’t enough to be generalized and reused. Following this, most of the code is stored in the same directory, next to each other. These projects often have a few variables (such as an API key for accessing the cloud) and may use dynamic data inputs and other Terraform and HCL features, though not prominently.
As an example of the file structure of this approach, this is what the project you’ll build in this tutorial will look like in the end:
.
└── tf/
├── versions.tf
├── variables.tf
├── provider.tf
├── droplets.tf
├── dns.tf
├── data-sources.tf
└── external/
└── name-generator.py
As this project will deploy an Apache web server Droplet and set up DNS records, the definitions of project variables, the DigitalOcean Terraform provider, the Droplet, and DNS records will be stored in their respective files. The minimum required Terraform and DigitalOcean provider versions will be specified in versions.tf
, while the Python script that will generate a name for the Droplet (and be used as a dynamic data source in data-sources.tf
) will be stored in the external
folder, to separate it from HCL code.
Contrary to the simple structure, this approach is suitable for large projects, with clearly defined subdirectory structures containing multiple modules of varying levels of complexity, aside from the usual code. These modules can depend on each other. Coupled with version control systems, these projects can make extensive use of workspaces. This approach is suitable for larger projects managing multiple apps, while reusing code as much as possible.
Development, staging, quality assurance, and production infrastructure instances can also be housed under the same project in different directories by relying on common modules, thus eliminating duplicate code and making the project the central source of truth. Here is the file structure of an example project with a more complex structure, containing multiple deployment apps, Terraform modules, and target cloud environments:
.
└── tf/
├── modules/
│ ├── network/
│ │ ├── main.tf
│ │ ├── dns.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ └── spaces/
│ ├── main.tf
│ ├── outputs.tf
│ └── variables.tf
└── applications/
├── backend-app/
│ ├── env/
│ │ ├── dev.tfvars
│ │ ├── staging.tfvars
│ │ ├── qa.tfvars
│ │ └── production.tfvars
│ └── main.tf
└── frontend-app/
├── env/
│ ├── dev.tfvars
│ ├── staging.tfvars
│ ├── qa.tfvars
│ └── production.tfvars
└── main.tf
This approach is explored further in the series How to Manage Infrastructure with Terraform.
You now know what a Terraform project is, how to best structure it according to perceived complexity, and what role Terraform workspaces serve. In the next steps, you’ll create a project with a simple structure that will provision a Droplet with an Apache web server installed and DNS records set up for your domain. You’ll first initialize your project with the DigitalOcean provider and variables, and then proceed to define the Droplet, a dynamic data source to provide its name, and a DNS record for deployment.
In this section, you’ll add the DigitalOcean Terraform provider to your project, define the project variables, and declare a DigitalOcean provider instance, so that Terraform will be able to connect to your account.
Start off by creating a directory for your Terraform project with the following command:
- mkdir ~/apache-droplet-terraform
Navigate to it:
- cd ~/apache-droplet-terraform
Since this project will follow the simple structuring approach, you’ll store the provider, variables, Droplet, and DNS record code in separate files, per the file structure from the previous section. First, you’ll need to add the DigitalOcean Terraform provider to your project as a required provider.
Create a file named versions.tf
and open it for editing by running:
- nano versions.tf
Add the following lines:
terraform {
required_providers {
digitalocean = {
source = "digitalocean/digitalocean"
version = "~> 2.0"
}
}
}
In this terraform
block, you list the required providers (DigitalOcean, version 2.x
). When you are done, save and close the file.
Then, define the variables your project will expose in the variables.tf
file, following the approach of storing different resource types in separate code files:
- nano variables.tf
Add the following variables:
variable "do_token" {}
variable "domain_name" {}
Save and close the file.
The do_token
variable will hold your DigitalOcean Personal Access Token and domain_name
will specify your desired domain name. The deployed Droplet will automatically have your SSH key installed; you’ll fetch the key from your account by name in a later step.
Next, let’s define the DigitalOcean provider instance for this project. You’ll store it in a file named provider.tf
. Create and open it for editing by running:
- nano provider.tf
Add the provider:
provider "digitalocean" {
token = var.do_token
}
Save and exit when you’re done. You’ve defined the digitalocean
provider, which corresponds to the required provider you specified earlier in versions.tf
, and set its token to the value of the variable, which will be supplied during runtime.
In this step, you have created a directory for your project, requested the DigitalOcean provider to be available, declared project variables, and set up the connection to a DigitalOcean provider instance to use an authentication token that will be provided later. You’ll now write a script that will generate dynamic data for your project definitions.
Before continuing on to defining the Droplet, you’ll create a Python script that will generate the Droplet’s name dynamically and declare a data source resource to parse it. The name will be generated by concatenating a constant string (web
) with the current time of the local machine, expressed in the UNIX epoch format. A naming script can be useful when multiple Droplets are generated according to a naming scheme, in order to easily differentiate between them.
You’ll store the script in a file named name-generator.py
, in a directory named external
. First, create the directory by running:
- mkdir external
The external
directory resides in the root of your project and will store non-HCL code files, like the Python script you’ll write.
Create name-generator.py
under external
and open it for editing:
- nano external/name-generator.py
Add the following code:
import json, time
fixed_name = "web"
result = {
"name": f"{fixed_name}-{int(time.time())}",
}
print(json.dumps(result))
This Python script imports the json
and time
modules, declares a dictionary named result
, and sets the value of the name
key to an interpolated string, which combines the fixed_name
with the current UNIX time of the machine it runs on. Then, the result
is converted into JSON and outputted on stdout
. The output will be different each time the script is run:
Output{"name": "web-1597747959"}
When you’re done, save and close the file.
Note: Large and complex structured projects require more thought put into how external data sources are created and used, especially in terms of portability and error handling. Terraform expects the executed program to write a human-readable error message to stderr
and gracefully exit with a non-zero status, which is something not shown in this step because of the simplicity of the task. Additionally, it expects the program to have no side effects, so that it can be re-run as many times as needed.
For more info on what Terraform expects, visit the official docs on data sources.
Now that the script is ready, you can define the data source, which will pull the data from the script. You’ll store the data source in a file named data-sources.tf
in the root of your project as per the simple structuring approach.
Create it for editing by running:
- nano data-sources.tf
Add the following definition:
data "external" "droplet_name" {
program = ["python3", "${path.module}/external/name-generator.py"]
}
Save and close the file.
This data source is called droplet_name and uses Python 3 to execute the name-generator.py script, which resides in the external directory you just created. Terraform automatically parses the script's output and provides the deserialized data under the data source's result attribute for use within other resource definitions.
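If you'd like to see the generated name for yourself, you could temporarily add an output to this file. This is an optional sketch and not part of the tutorial's required configuration:
output "droplet_name" {
  value = data.external.droplet_name.result.name
}
Terraform would then print the generated name at the end of terraform apply.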
With the data source now declared, you can define the Droplet that Apache will run on.
In this step, you’ll write the definition of the Droplet resource and store it in a code file dedicated to Droplets, as per the simple structuring approach. Its name will come from the dynamic data source you have just created, and will be different each time it’s deployed.
Create and open the droplets.tf
file for editing:
- nano droplets.tf
Add the following Droplet resource definition:
data "digitalocean_ssh_key" "ssh_key" {
name = "your_ssh_key_name"
}
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = data.external.droplet_name.result.name
region = "fra1"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.ssh_key.id
]
}
You first declare a DigitalOcean SSH key data source called ssh_key, which fetches a key from your account by its name. Make sure to replace the highlighted code with the name of your SSH key.
Then, you declare a Droplet resource, called web
. Its actual name in the cloud will be different, because it’s being requested from the droplet_name
external data source. To bootstrap the Droplet with an SSH key each time it's deployed, the ID of the ssh_key is passed into the ssh_keys parameter, so that DigitalOcean knows which key to apply.
For now, this is all you need to configure in droplets.tf, so save and close the file when you're done.
You'll now write the configuration for the DNS record that will point your domain to the Droplet you just declared.
The last step in the process is to configure that DNS record.
You’ll store the DNS config in a file named dns.tf
, because it’s a separate resource type from the others you have created in the previous steps. Create and open it for editing:
- nano dns.tf
Add the following lines:
resource "digitalocean_record" "www" {
domain = var.domain_name
type = "A"
name = "@"
value = digitalocean_droplet.web.ipv4_address
}
This code declares a DigitalOcean DNS record at your domain name (passed in using the variable), of type A. The record has a name of @, a placeholder that routes to the domain itself, and its value is the Droplet's IP address. You can replace the name value with something else, which will result in a subdomain being created, as sketched below.
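For example, a hypothetical second record like the following would publish the same Droplet under the www subdomain (the www_subdomain resource name is illustrative and not used elsewhere in this tutorial):
resource "digitalocean_record" "www_subdomain" {
  domain = var.domain_name
  type   = "A"
  name   = "www"
  value  = digitalocean_droplet.web.ipv4_address
}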
When you’re done, save and close the file.
Now that you’ve configured the Droplet, the name generator data source, and a DNS record, you’ll move on to deploying the project in the cloud.
In this section, you’ll initialize your Terraform project, deploy it to the cloud, and check that everything was provisioned correctly.
Now that the project infrastructure is defined completely, all that is left to do before deploying it is to initialize the Terraform project. Do so by running the following command:
- terraform init
You’ll receive the following output:
OutputInitializing the backend...
Initializing provider plugins...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Finding latest version of hashicorp/external...
- Installing digitalocean/digitalocean v2.10.1...
- Installed digitalocean/digitalocean v2.10.1 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
- Installing hashicorp/external v2.1.0...
- Installed hashicorp/external v2.1.0 (signed by HashiCorp)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
You’ll now be able to deploy your Droplet with a dynamically generated name and an accompanying domain to your DigitalOcean account.
Start by defining your domain name and your personal access token as environment variables, so you won't have to copy the values each time you run Terraform. Run the following commands, replacing the highlighted values:
- export DO_PAT="your_do_api_token"
- export DO_DOMAIN_NAME="your_domain"
You can find your API token in your DigitalOcean Control Panel.
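As an aside, Terraform also reads values for its variables from environment variables prefixed with TF_VAR_. If you prefer that approach, exports like the following would let you omit the -var flags used below:
- export TF_VAR_do_token="your_do_api_token"
- export TF_VAR_domain_name="your_domain"
This tutorial keeps the explicit -var flags so each command shows exactly which values it receives.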
Run the plan
command with the variable values passed in to see what steps Terraform would take to deploy your project:
- terraform plan -var "do_token=${DO_PAT}" -var "domain_name=${DO_DOMAIN_NAME}"
The output will be similar to the following:
OutputTerraform used the selected providers to generate the following execution plan. Resource
actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.web will be created
+ resource "digitalocean_droplet" "web" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "web-1625908814"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "fra1"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ ssh_keys = [
+ "...",
]
+ status = (known after apply)
+ urn = (known after apply)
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
# digitalocean_record.www will be created
+ resource "digitalocean_record" "www" {
+ domain = "your_domain"
+ fqdn = (known after apply)
+ id = (known after apply)
+ name = "@"
+ ttl = (known after apply)
+ type = "A"
+ value = (known after apply)
}
Plan: 2 to add, 0 to change, 0 to destroy.
...
The lines starting with a green + signify that Terraform will create each of the resources listed after them. This is exactly what should happen, so you can apply the configuration:
- terraform apply -var "do_token=${DO_PAT}" -var "domain_name=${DO_DOMAIN_NAME}"
The output will be the same as before, except that this time you’ll be asked to confirm:
OutputPlan: 2 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Enter yes
, and Terraform will provision your Droplet and the DNS record:
Outputdigitalocean_droplet.web: Creating...
...
digitalocean_droplet.web: Creation complete after 33s [id=204432105]
digitalocean_record.www: Creating...
digitalocean_record.www: Creation complete after 1s [id=110657456]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Terraform has now recorded the deployed resources in its state. To confirm that the DNS records and the Droplet were connected successfully, you can extract the IP address of the Droplet from the local state and check if it matches public DNS records for your domain. Run the following command to get the IP address:
- terraform show | grep "ipv4"
You’ll receive your Droplet’s IP address:
Outputipv4_address = "your_Droplet_IP"
...
You can check the public A records by running:
- nslookup -type=a your_domain | grep "Address" | tail -1
The output will show the IP address to which the A record points:
OutputAddress: your_Droplet_IP
They are the same, as they should be, meaning that the Droplet and DNS record were provisioned successfully.
For the changes in the next step to take place, destroy the deployed resources by running:
- terraform destroy -var "do_token=${DO_PAT}" -var "domain_name=${DO_DOMAIN_NAME}"
When prompted, enter yes
to continue.
In this step, you have created your infrastructure and applied it to your DigitalOcean account. You’ll now modify it to automatically install the Apache web server on the provisioned Droplet using Terraform provisioners.
Now you’ll set up the installation of the Apache web server on your deployed Droplet by using the remote-exec
provisioner to execute custom commands.
Terraform provisioners can be used to execute specific actions on created remote resources (the remote-exec provisioner) or on the local machine the code is executing on (the local-exec provisioner). If a provisioner fails, the node will be marked as tainted in the current state, which means that it will be deleted and recreated during the next run.
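To make the distinction concrete, here is a minimal sketch of a local-exec provisioner. The echo command and the droplet_ips.txt file are illustrative assumptions, and this block is not part of this tutorial's configuration:
resource "digitalocean_droplet" "web" {
  # ... Droplet arguments as before ...

  provisioner "local-exec" {
    # Runs on the machine executing Terraform, not on the Droplet
    command = "echo ${self.ipv4_address} >> droplet_ips.txt"
  }
}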
To connect to a provisioned Droplet, Terraform needs the private part of the SSH key set up on the Droplet. The best way to pass in the location of the private key is by using variables, so open variables.tf for editing:
- nano variables.tf
Add the highlighted line:
variable "do_token" {}
variable "domain_name" {}
variable "private_key" {}
You have now added a new variable, called private_key
, to your project. Save and close the file.
Next, you’ll add the connection data and remote provisioner declarations to your Droplet configuration. Open droplets.tf
for editing by running:
- nano droplets.tf
Extend the existing code with the highlighted lines:
data "digitalocean_ssh_key" "ssh_key" {
name = "your_ssh_key_name"
}
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = data.external.droplet_name.result.name
region = "fra1"
size = "s-1vcpu-1gb"
ssh_keys = [
data.digitalocean_ssh_key.ssh_key.id
]
connection {
host = self.ipv4_address
user = "root"
type = "ssh"
private_key = file(var.private_key)
timeout = "2m"
}
provisioner "remote-exec" {
inline = [
"export PATH=$PATH:/usr/bin",
# Install Apache
"apt update",
"apt -y install apache2"
]
}
}
The connection block specifies how Terraform should connect to the target Droplet. The provisioner block contains the array of commands, within the inline parameter, that it will execute after provisioning: updating the package manager cache and installing Apache. Save and exit when you're done.
You can create a temporary environment variable for the private key path as well:
- export DO_PRIVATE_KEY="private_key_location"
Note: The private key, and any other file that you wish to load from within Terraform, must be placed within the project. You can see the How To Configure SSH Key-Based Authentication on a Linux Server tutorial for more info regarding SSH key setup on Ubuntu 20.04 or other distributions.
Try applying the configuration again:
- terraform apply -var "do_token=${DO_PAT}" -var "domain_name=${DO_DOMAIN_NAME}" -var "private_key=${DO_PRIVATE_KEY}"
Enter yes
when prompted. You’ll receive output similar to before, but followed with long output from the remote-exec
provisioner:
Outputdigitalocean_droplet.web: Creating...
digitalocean_droplet.web: Still creating... [10s elapsed]
digitalocean_droplet.web: Still creating... [20s elapsed]
digitalocean_droplet.web: Still creating... [30s elapsed]
digitalocean_droplet.web: Provisioning with 'remote-exec'...
digitalocean_droplet.web (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.web (remote-exec): Host: ...
digitalocean_droplet.web (remote-exec): User: root
digitalocean_droplet.web (remote-exec): Password: false
digitalocean_droplet.web (remote-exec): Private key: true
digitalocean_droplet.web (remote-exec): Certificate: false
digitalocean_droplet.web (remote-exec): SSH Agent: false
digitalocean_droplet.web (remote-exec): Checking Host Key: false
digitalocean_droplet.web (remote-exec): Connected!
...
digitalocean_droplet.web: Creation complete after 1m5s [id=204442200]
digitalocean_record.www: Creating...
digitalocean_record.www: Creation complete after 1s [id=110666268]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
You can now navigate to your domain in a web browser. You will see the default Apache welcome page.
This means that Apache was installed successfully, and that Terraform provisioned everything correctly.
To destroy the deployed resources, run the following command and enter yes
when prompted:
- terraform destroy -var "do_token=${DO_PAT}" -var "domain_name=${DO_DOMAIN_NAME}" -var "private_key=${DO_PRIVATE_KEY}"
You have now completed a small Terraform project with a simple structure that deploys the Apache web server on a Droplet and sets up DNS records for the desired domain.
You have learned about two general approaches for structuring your Terraform projects, according to their complexity. Following the simple structuring approach, and using the remote-exec
provisioner to execute commands, you then deployed a Droplet running Apache with DNS records for your domain.
For reference, here is the file structure of the project you created in this tutorial:
.
└── tf/
├── versions.tf
├── variables.tf
├── provider.tf
├── droplets.tf
├── dns.tf
├── data-sources.tf
└── external/
└── name-generator.py
The resources you defined (the Droplet, the DNS record, the dynamic data source, the DigitalOcean provider, and the variables) are each stored in a separate file, according to the simple project structure outlined in the first section of this tutorial.
For more information about Terraform provisioners and their parameters, visit the official documentation.
This tutorial is part of the How To Manage Infrastructure with Terraform series. The series covers a number of Terraform topics, from installing Terraform for the first time to managing complex projects.
IaaS cloud providers generally take on low-level infrastructure management responsibilities, like security, data partitioning, and backups. Unlike the Platform as a Service (PaaS) and Software as a Service (SaaS) categories of cloud computing, users have control over what infrastructure components they actually use, as well as the software and tools they use with that infrastructure, like operating systems or development tools.
IaaS is a popular option for businesses that wish to leverage the advantages of the cloud and have system administrators who can oversee the installation, configuration, and management of the infrastructure they wish to use. IaaS is also used by developers, researchers, and others who wish to customize the underlying infrastructure of their computing environment.
Cloud computing provides on-demand computing resources, which are decoupled from physical hardware and the necessary underlying configuration. Autonomous software systems provision these computing resources in the cloud to achieve the automation that cloud computing offers. Because of such automation, it's possible to control and manipulate the available resources programmatically by interfacing with the cloud providers. This way, infrastructure changes (such as resource scaling) can be implemented more quickly and reliably and operated mostly without manual interaction, but still with the ability to oversee the whole process and revert changes if something does not go according to plan.
Infrastructure as Code (IaC) is the approach of automating infrastructure deployment and changes by defining the desired resource states and their mutual relationships in code. The code is written in the specialized, human-readable languages of IaC tools. The actual resources in the cloud are created (or modified) when you execute the code, which prompts the tool to interface with the cloud provider or deployment system on your behalf and apply the necessary changes, without using the cloud provider's web interface. The code can be modified whenever needed; upon execution, the IaC tool finds the differences between the desired infrastructure in code and the actual infrastructure in the cloud, then takes steps to make the actual state equal to the desired one.
For IaC to work in practice, created resources must not be manually modified afterward (this is known as immutable infrastructure), as manual changes create a mismatch between the expected infrastructure in code and the actual state in the cloud. In addition, manually modified resources could be recreated or deleted during future code executions, and all such customization would be lost. The solution is to incorporate the modifications into the infrastructure code.
In this conceptual article, we’ll explore the IaC approach, its benefits, and examples of real-world implementations. We’ll also introduce Terraform, an open source IaC provisioning tool. We’ll review Terraform’s role in this approach and how it compares to other IaC tools.
With IaC, you can quickly create as many instances of your entire infrastructure as you need, in multiple provider regions, from a single source of truth: your declarative code. This has many advantages, ensuring that you create resources consistently and without error while reducing management and manual setup time.
The main benefits of IaC are:
Within an IaC workflow, you can repeatedly spin up infrastructure in a standardized fashion, which makes software development and testing quicker because development, staging, quality-assurance, and production environments are kept separate. You can repeat the process of writing code and testing it live by deploying the infrastructure as many times as needed. Once your written infrastructure fulfills all requirements, you can deploy it in the desired cloud environments; when new requirements arise, you can reiterate the process.
IaC, being based on code, should always be coupled with a version control system (VCS), such as Git. Storing your infrastructure declarations in a VCS makes them easily retrievable, with changes visible to everyone on your team, and provides snapshots at historical points, so you can always roll back to an earlier version if new modifications create errors. Advanced VCS setups can be configured to automatically trigger the IaC tool to update the infrastructure in the cloud when an approved change is added.
Now that you know what the IaC approach is and what benefits it brings, you'll learn about state, the resource tracking mechanism employed in IaC, and then about the role of Terraform and other IaC tools.
In an IaC environment, the term 'state' refers to the condition of the infrastructure resources in a deployment. At any given moment, there are at least three states: the actual one in the cloud, the ideal state expressed in code, and the cached state that the IaC tool maintains. The cached state describes the state of the cloud as it was when the code was last executed. Terraform allows you to deploy the same code multiple times, forming a separate state for each deployment.
The actual state in the cloud (of the managed resources) should always be the same as the cached state of the tool. When executing the code, the tool will compare the ideal state with the cached one and apply the detected differences to the cloud. If the cached and actual states do not match, it’s highly likely that the execution will fail or that resources will be incorrectly provisioned.
Terraform is an open source IaC resource provisioning tool, written in Go and developed by HashiCorp. It supports multiple cloud providers, including DigitalOcean. Infrastructure definitions are written in the HashiCorp Configuration Language (HCL), and source code files written in it have the .tf file extension.
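For a taste of the syntax, a minimal HCL resource definition might look like the following sketch (the names and values are illustrative):
resource "digitalocean_droplet" "example" {
  image  = "ubuntu-20-04-x64"
  name   = "example-droplet"
  region = "fra1"
  size   = "s-1vcpu-1gb"
}
Each block names a resource type and a local identifier, followed by arguments describing the desired state of that resource.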
Terraform works by reading the code describing your infrastructure and generating a graph containing all resources with their mutual relationships. It then compares it to the cached state of the resources in the cloud, and prepares an execution plan that details what will be applied to the cloud, and in what order, to reach the desired state.
The two main types of underlying components in Terraform are providers and provisioners. Providers are responsible for interacting with a given cloud provider, creating, managing, and deleting resources, while provisioners are used to execute specific actions on created remote resources or the local machine the code is executing on.
Terraform supports managing basic cloud provider components, such as compute instances, load balancers, storage, and DNS records, though more providers and provisioners can be added, owing to its extensible nature.
In IaC, Terraform's role is to ensure that the state of resources in the cloud is equal to the state expressed in code. It does not monitor the deployed resources, and its main focus is not on further bootstrapping of the provisioned compute instances with software and tasks. In the next section, you'll learn how it compares to other tools and how they complement each other in a typical workflow.
The IaC approach is widespread in modern deployment, configuration management, virtualization, and orchestration software. Docker and Kubernetes, the leading tools for container creation and orchestration, both use YAML to declare the desired end result. HashiCorp Packer, a tool for creating snapshots of deployments, uses JSON to declare the template and variables from which a snapshot of the system will be built.
Ansible, Chef, and Puppet, the three most popular configuration management tools, all use the IaC approach to define the desired state of the servers they manage.
Ansible bootstraps provided servers according to a given playbook, a YAML file instructing Ansible what operations to perform on the existing target resources. Examples of such operations include running and starting services, installing packages using the system-provided package manager, or executing custom bash commands. To learn more about writing Ansible playbooks, read Configuration Management 101: Writing Ansible Playbooks.
Chef and Puppet both require central servers with agents installed on each of the managed servers. Unlike Ansible, Chef uses Ruby, and Puppet uses its own declarative language for describing resources.
Terraform is not mutually exclusive with other IaC tools and DevOps systems. Its strength lies in provisioning hardware resources, rather than in installing software and performing initial server setup.
Unlike configuration management tools such as Ansible and Chef, Terraform is not suitable for installing software on the target resources or setting up tasks. Instead, Terraform offers providers for interacting with the resources supported by each cloud platform.
Terraform can work from a single machine and, unlike some other tools, does not require central servers with client agents installed on the provisioned resources. It does not continually check the actual state of resources and automatically reapply the configuration, because its main focus is on provisioning them. A typical workflow is to provision the infrastructure resources using Terraform and then bootstrap them using a configuration management tool, if needed, as sketched below.
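As a sketch of that workflow, a local-exec provisioner could hand a freshly created server over to Ansible. This assumes Ansible is installed on the machine running Terraform and that a playbook.yml exists; neither is covered here:
resource "digitalocean_droplet" "web" {
  # ... Droplet arguments ...

  provisioner "local-exec" {
    # Pass the new Droplet's IP to Ansible as a one-host inline inventory
    command = "ansible-playbook -i '${self.ipv4_address},' playbook.yml"
  }
}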
For Chef, Terraform has a built-in provisioner that sets up its client agent on provisioned remote resources. With it you can automatically have all your provisioned servers added to the main server, from where you can additionally configure them using cookbooks, Chef’s infrastructure declarations. You can learn more about writing them in Configuration Management 101: Writing Chef Recipes.
This article covered the paradigms of the IaC approach, its advantages over traditional manual system administration, the basics of Terraform as an IaC resource provisioning tool, and how it compares to other popular infrastructure automation tools.
If you’re looking to incorporate Infrastructure as Code into your workflow, check out our Terraform series to learn the fundamentals of using this tool in your development and deployment process.
One way to start with Terraform is to read How To Structure Your Terraform Project to understand how to ensure your infrastructure stays scalable and extensible.