Why Use Terraform for Cloud Provisioning?
Using Terraform for cloud provisioning means using code (HCL) to create and manage cloud resources in a repeatable, version-controlled way. In this chapter, the focus is specifically on using Terraform with real cloud providers (AWS, Azure, GCP), not on generic Terraform features or HCL syntax that were covered earlier.
Key benefits specific to cloud provisioning:
- Consistency across environments: same Terraform code can create dev, staging, and production stacks.
- Multi-cloud: manage AWS, Azure, GCP (and more) from one tool.
- Drift detection: terraform plan shows changes compared to what's actually in the cloud.
- Safe changes: Terraform computes dependency graphs and creates/destroys resources in the right order.
Basic Cloud Provisioning Workflow
The high-level workflow doesn’t change much between clouds:
- Configure the provider (credentials, region, etc.).
- Declare resources (VMs/instances, networks, storage, etc.).
- Initialize the working directory: terraform init
- Preview changes: terraform plan
- Apply the configuration: terraform apply
- Destroy resources when they're no longer needed: terraform destroy
The details—provider blocks, resource types, and arguments—differ per cloud.
Authenticating Terraform to the Cloud
Each cloud has several authentication methods. The most important rule: prefer official, tool-friendly methods over hardcoding credentials in HCL.
Typical patterns:
- Environment variables (most portable, good for CI):
  - AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, optionally AWS_SESSION_TOKEN, AWS_PROFILE
  - Azure: ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_SUBSCRIPTION_ID, ARM_TENANT_ID
  - GCP: GOOGLE_APPLICATION_CREDENTIALS pointing to a JSON key file
- Local CLI credentials:
  - AWS: aws configure → Terraform uses the shared credentials file.
  - Azure: az login → Terraform can use the CLI token.
  - GCP: gcloud auth application-default login
- Instance/VM roles:
  - AWS: IAM roles for EC2
  - Azure: Managed Identities
  - GCP: a service account attached to the VM
For real projects, combine:
- Short-lived roles/tokens where possible.
- Remote state backends with proper access control (e.g., S3+IAM, GCS+IAM, Azure Storage+RBAC).
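As a concrete sketch, provider blocks can stay free of secrets and let credentials come from the environment; the region and project ID below are placeholders:

provider "aws" {
  region = "us-east-1"
  # Credentials come from AWS_* environment variables, the shared
  # credentials file, or an attached IAM role.
}

provider "azurerm" {
  features {}
  # Credentials come from ARM_* environment variables, az login,
  # or a managed identity.
}

provider "google" {
  project = "my-gcp-project-id" # placeholder project ID
  region  = "us-central1"
  # Credentials come from GOOGLE_APPLICATION_CREDENTIALS or
  # application-default credentials.
}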
Minimal Cloud Examples
The goal of these examples is to show how Terraform translates into concrete cloud resources for a very small “hello world” infrastructure. They omit many options on purpose.
AWS: Provisioning a Simple EC2 Instance
Prerequisites:
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set, or a configured AWS profile.
- An SSH key pair already created in AWS (or one you create via Terraform).
Example layout:
# main.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
required_version = ">= 1.5.0"
}
provider "aws" {
region = "us-east-1"
}
# 1. Create a security group allowing SSH
resource "aws_security_group" "web_sg" {
name = "example-web-sg"
description = "Allow SSH inbound"
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # For demos only; not safe for production
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# 2. Create an EC2 instance
resource "aws_instance" "web" {
ami = "ami-0c02fb55956c7d316" # Example Amazon Linux 2 AMI; region-specific
instance_type = "t3.micro"
vpc_security_group_ids = [aws_security_group.web_sg.id]
associate_public_ip_address = true
key_name = "my-keypair" # Must exist in AWS
tags = {
Name = "example-web"
}
}
# 3. Output the instance public IP
output "instance_public_ip" {
value = aws_instance.web.public_ip
}

Workflow:
- terraform init
- terraform plan
- terraform apply
- Use the output IP to SSH: ssh -i /path/to/key.pem ec2-user@<ip>
- When done: terraform destroy
This example shows:
- Provider configuration (provider "aws").
- A network resource (aws_security_group).
- A compute resource (aws_instance).
- Outputs used to reveal connection details.
Azure: Provisioning a Linux Virtual Machine
Prerequisites:
- An Azure subscription.
- az login completed, or service principal environment variables set.
Typical minimal setup (real Azure VM configs are more verbose):
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.0"
}
}
}
provider "azurerm" {
features {}
}
# 1. Resource group (logical container)
resource "azurerm_resource_group" "rg" {
name = "rg-terraform-example"
location = "eastus"
}
# 2. Virtual network and subnet
resource "azurerm_virtual_network" "vnet" {
name = "vnet-example"
address_space = ["10.0.0.0/16"]
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
}
resource "azurerm_subnet" "subnet" {
name = "subnet-example"
resource_group_name = azurerm_resource_group.rg.name
virtual_network_name = azurerm_virtual_network.vnet.name
address_prefixes = ["10.0.1.0/24"]
}
# 3. Public IP and NIC
resource "azurerm_public_ip" "public_ip" {
name = "pip-example"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
allocation_method = "Dynamic"
}
resource "azurerm_network_interface" "nic" {
name = "nic-example"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
ip_configuration {
name = "internal"
subnet_id = azurerm_subnet.subnet.id
private_ip_address_allocation = "Dynamic"
public_ip_address_id = azurerm_public_ip.public_ip.id
}
}
# 4. Linux VM
resource "azurerm_linux_virtual_machine" "vm" {
name = "vm-example"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
size = "Standard_B1s"
admin_username = "azureuser"
network_interface_ids = [azurerm_network_interface.nic.id]
admin_ssh_key {
username = "azureuser"
public_key = file("~/.ssh/id_rsa.pub")
}
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "Canonical"
offer = "0001-com-ubuntu-server-focal"
sku = "20_04-lts"
version = "latest"
}
}
output "vm_public_ip" {
value = azurerm_public_ip.public_ip.ip_address
}

This illustrates:
- Azure's emphasis on resource groups and networking primitives.
- Chaining resources together via id references.
- Using file("~/.ssh/id_rsa.pub") to inject your SSH key.
GCP: Provisioning a Compute Engine Instance
Prerequisites:
- A GCP project.
- A service account JSON key, with GOOGLE_APPLICATION_CREDENTIALS set.
- gcloud and project/region/zone configured (optional but helpful).
Example:
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
provider "google" {
project = "my-gcp-project-id"
region = "us-central1"
zone = "us-central1-a"
}
# Simple Compute Engine instance
resource "google_compute_instance" "vm" {
name = "tf-example-vm"
machine_type = "e2-micro"
zone = "us-central1-a"
boot_disk {
initialize_params {
image = "debian-cloud/debian-11"
}
}
network_interface {
network = "default"
access_config {} # Needed for external IP
}
metadata = {
ssh-keys = "gcpuser:${file("~/.ssh/id_rsa.pub")}"
}
labels = {
environment = "demo"
}
}
output "vm_external_ip" {
value = google_compute_instance.vm.network_interface[0].access_config[0].nat_ip
}

Notable points:
- The provider "google" block ties Terraform to a specific project/region/zone.
- access_config {} on a network interface is how you request a public IP.
- metadata.ssh-keys is a simple way to enable SSH access via your key.
Managing State for Cloud Environments
Cloud provisioning quickly outgrows local terraform.tfstate files. For meaningful cloud use, you almost always want a remote backend:
Common patterns:
- AWS: S3 + DynamoDB for locking
- Azure: Azure Storage + container
- GCP: GCS bucket
Example: AWS remote state backend:
terraform {
backend "s3" {
bucket = "my-tf-state-bucket"
key = "prod/network/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}

This enables:
- Shared state between multiple users/CI.
- State locking to prevent concurrent apply operations.
Configuration of backends is usually done once per project and then reused.
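The Azure and GCP equivalents follow the same shape. The names below (resource group, storage account, bucket) are placeholders, and a configuration uses only one backend block:

# Azure Storage backend (alternative to S3; placeholder names)
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "tfstateexample"
    container_name       = "tfstate"
    key                  = "prod/network/terraform.tfstate"
  }
}

# GCS backend (alternative; placeholder bucket)
terraform {
  backend "gcs" {
    bucket = "my-tf-state-bucket"
    prefix = "prod/network"
  }
}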
Structuring Terraform for Multiple Environments
When provisioning cloud infrastructure, you typically need dev, staging, and production. For beginners, two simple patterns suffice:
Pattern 1: Directory Per Environment
Example structure:
terraform/
  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
    compute/
      main.tf
      variables.tf
      outputs.tf
  envs/
    dev/
      main.tf
      backend.tf
      dev.tfvars
    prod/
      main.tf
      backend.tf
      prod.tfvars

- modules/ contains reusable building blocks (e.g., "a VPC with subnets", "a VM plus security group"); see the module call sketch below.
- Each environment has its own backend config and variable values.
- You run Terraform from within each environment directory.
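A minimal sketch of how an environment directory consumes a module. The inputs cidr_block and environment are hypothetical and must match whatever variables your module actually declares:

# envs/dev/main.tf
module "network" {
  source = "../../modules/network"

  # Hypothetical inputs; align them with the module's variables.tf.
  cidr_block  = "10.10.0.0/16"
  environment = "dev"
}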
Pattern 2: Workspaces (For Simpler Cases)
Workspaces allow reusing the same configuration for multiple instances (e.g., default, dev, prod) with workspace-aware naming.
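For example, the built-in terraform.workspace value can feed resource names or tags; a minimal AWS sketch:

# The current workspace name ("default", "dev", "prod", ...) becomes part of the tags.
resource "aws_instance" "web" {
  ami           = "ami-0c02fb55956c7d316" # example AMI from earlier; region-specific
  instance_type = "t3.micro"

  tags = {
    Name        = "web-${terraform.workspace}"
    Environment = terraform.workspace
  }
}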
Caveats:
- State separation is implicit: all workspaces share one backend configuration, so it is easy to run a command against the wrong environment.
- Good for simple demos, but many teams eventually prefer explicit directory separation or separate repos.
Provisioning Common Cloud Building Blocks
Beyond a single VM, real-world Terraform usage means composing typical cloud components. At a beginner level, you should be comfortable with:
Networking
Expect these primitives:
- AWS: aws_vpc, aws_subnet, aws_internet_gateway, aws_route_table, aws_security_group
- Azure: azurerm_virtual_network, azurerm_subnet, azurerm_network_security_group, azurerm_network_security_rule
- GCP: google_compute_network, google_compute_subnetwork, google_compute_firewall
Pattern:
- Create a virtual network (VPC/vNet/Network).
- Create one or more subnets.
- Create firewall/security rules to allow SSH/HTTP/etc.
- Attach compute resources to subnets and security groups.
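A minimal AWS sketch of this pattern (CIDRs and names are illustrative; Azure and GCP follow the same steps with the resources listed above):

# 1. Virtual network
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# 2. Subnet inside the VPC
resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

# 3. Internet gateway and routing for outbound access
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

# 4. Security rule allowing SSH, attached later to compute resources
resource "aws_security_group" "ssh" {
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # demos only
  }
}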
Compute + Storage
Typical resources:
- AWS: aws_instance, aws_ebs_volume, aws_volume_attachment
- Azure: azurerm_linux_virtual_machine, azurerm_managed_disk
- GCP: google_compute_instance, google_compute_disk
Look out for:
- The cloud’s image lookup method (AMIs, image references, image families).
- How disks are declared (boot vs data disks).
- How to attach additional storage.
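For example, on AWS an extra data disk is a separate volume plus an attachment. This sketch assumes the aws_instance.web resource from the earlier EC2 example:

# Extra data disk in the same availability zone as the instance
resource "aws_ebs_volume" "data" {
  availability_zone = aws_instance.web.availability_zone
  size              = 20 # GiB
}

# Attach the volume; the OS still has to format and mount it
resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.data.id
  instance_id = aws_instance.web.id
}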
Load Balancers and Managed Services (High-Level View)
Even if you don’t fully implement them yet, know that Terraform can manage:
- Load balancers:
  - AWS: aws_lb (ALB/NLB), aws_lb_target_group, aws_lb_listener
  - Azure: azurerm_lb, azurerm_application_gateway
  - GCP: various google_compute_* load-balancer resources
- Managed databases:
  - AWS RDS: aws_db_instance, aws_rds_cluster
  - Azure Database: azurerm_mysql_flexible_server, etc.
  - GCP Cloud SQL: google_sql_database_instance
These follow the same pattern: provider block → resource declarations → outputs.
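As one illustration of that pattern, a minimal AWS RDS instance might look like the sketch below. The identifier and sizes are placeholders, and production settings (backups, networking, parameter groups) need far more care:

variable "db_password" {
  description = "Master password for the example database"
  type        = string
  sensitive   = true
}

resource "aws_db_instance" "db" {
  identifier          = "example-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "dbadmin"
  password            = var.db_password
  skip_final_snapshot = true # demo convenience; keep final snapshots in production
}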
Using Variables and Outputs for Cloud Reuse
For cloud provisioning, variables and outputs are essential for making configurations reusable and composable.
Minimal practical rules:
- Use variables for:
- Regions / locations.
- Instance types / sizes.
- Environment names (dev, prod).
- CIDR blocks.
- Use outputs to:
- Export public IPs, DNS names, or IDs of resources.
- Feed values into other modules or other tools (e.g., Ansible).
Example (generic pattern):
# variables.tf
variable "region" {
description = "Cloud region"
type = string
default = "us-east-1"
}
variable "instance_type" {
description = "Instance size"
type = string
default = "t3.micro"
}
# main.tf (AWS example fragment)
provider "aws" {
region = var.region
}
resource "aws_instance" "web" {
instance_type = var.instance_type
# ...
}
# outputs.tf
output "web_public_ip" {
value = aws_instance.web.public_ip
}

Cloud provisioning almost always ends up parameterized this way, so you can adjust cost, performance, and regions without rewriting resources.
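Per-environment values then typically live in .tfvars files; the file names and values here are illustrative:

# dev.tfvars
region        = "us-west-2"
instance_type = "t3.micro"

# prod.tfvars
region        = "us-east-1"
instance_type = "t3.large"

You then run terraform plan -var-file=dev.tfvars (or prod.tfvars) against the same configuration.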
Integrating Terraform with DevOps Workflows
Since this chapter sits in the DevOps and Cloud part of the course, it's worth highlighting how cloud provisioning with Terraform fits into a broader workflow:
- Version control:
  - Store your Terraform configuration in Git.
  - Use branches/merge requests to review infrastructure changes.
- CI/CD pipelines:
  - Run terraform fmt and terraform validate on every change.
  - Run terraform plan and capture the plan as an artifact.
  - Optionally require human approval before terraform apply.
- Separation of responsibilities:
  - The infrastructure team maintains base networking, IAM, and shared services.
  - Application teams may have their own Terraform modules or workspaces for app-specific infrastructure.
- State security:
  - Restrict direct write access to state backends.
  - Use service accounts/roles for CI pipelines.
Practical Tips and Common Pitfalls
When you start using Terraform for real cloud provisioning, a few recurring issues appear:
- Accidental deletions:
  - Terraform will destroy resources removed from configuration.
  - Use lifecycle { prevent_destroy = true } on critical resources (e.g., production databases); see the sketch after this list.
- Changing resource names:
  - Renaming resources in code can cause them to be destroyed and recreated.
  - Use terraform state mv if you must rename and keep the resource.
- Manual changes in the console:
  - Console changes cause drift; terraform plan reports the differences, and the next apply brings the infrastructure back in line with the code.
  - Prefer keeping all important configuration in Terraform.
- API limits and quotas:
  - Providers will fail if you hit resource quotas (e.g., too many IPs).
  - Start small and clean up with terraform destroy to avoid unnecessary usage and cost.
- Credential management:
  - Never commit keys into Git.
  - Use environment variables, local config files, or secret managers.
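Two small sketches for the first two pitfalls: prevent_destroy guards a critical resource against accidental deletion, and a moved block (available since Terraform 1.1) is a declarative alternative to terraform state mv when renaming. The resource and names are placeholders:

# Guard a critical resource so terraform destroy (or removing it from code) fails loudly
resource "aws_s3_bucket" "critical" {
  bucket = "my-critical-data-bucket" # placeholder name

  lifecycle {
    prevent_destroy = true
  }
}

# Record a rename so Terraform updates state instead of destroying and recreating
moved {
  from = aws_instance.web
  to   = aws_instance.frontend
}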
Where to Go Next
After understanding basic cloud provisioning with Terraform, the natural next steps are:
- Refactor simple configs into modules shared across environments.
- Introduce remote state backends and state locking in real teams.
- Add policy controls (e.g., Sentinel or external scanners) to put guardrails around your cloud usage.
- Combine Terraform with configuration management tools (e.g., Ansible) for post-provisioning OS configuration on your cloud VMs.