Understanding Cloud Provisioning with Terraform
Cloud provisioning with Terraform means using Terraform to create, update, and delete real resources in a cloud provider such as AWS, Azure, or Google Cloud Platform. This chapter focuses on how Terraform interacts with cloud APIs, how to structure your configuration for cloud infrastructure, and how to apply safe workflows for provisioning and changing resources.
You already know the basic concepts of Terraform, such as providers, resources, and modules, from earlier chapters. Here you apply those concepts specifically to cloud infrastructure.
How Terraform Talks to the Cloud
Terraform does not run inside the cloud provider. It runs on your machine or in a CI system and talks to the cloud provider’s API. When you write a resource block for a virtual machine, network, or database, Terraform turns that configuration into API calls.
For example, with AWS, Terraform uses the AWS provider and calls AWS APIs such as RunInstances or CreateVpc. With Azure, it calls the corresponding Azure Resource Manager APIs, and with GCP, the Google Cloud APIs. In every case the pattern is the same: Terraform reads your configuration, computes the desired state, compares it with the current state, and then sends API requests to move the real infrastructure toward that desired state.
The key point is that your Terraform code describes what you want, not how to call the cloud API step by step. Terraform manages the order of operations based on dependencies between resources.
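As a minimal sketch, assuming the AWS provider and a hypothetical AMI ID, a single declarative block is all Terraform needs to drive those API calls:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "eu-west-1"
}

# Declares what you want: one EC2 instance with these properties.
# Terraform decides which API calls to make and in what order.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # hypothetical AMI ID
  instance_type = "t3.micro"

  tags = {
    Name = "example-web"
  }
}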
Authenticating Terraform to a Cloud Provider
Before Terraform can create anything in the cloud, you must give it credentials. These credentials are not stored in the Terraform configuration itself. Instead you normally provide them as environment variables or external configuration files that the provider uses.
With AWS, the provider looks for environment variables such as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or it uses the standard AWS credentials file. With Azure, you can authenticate using a service principal and environment variables such as ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID. With GCP, you commonly use a JSON service account key file and point Terraform to it using an environment variable such as GOOGLE_APPLICATION_CREDENTIALS.
Never hard-code long-lived cloud credentials directly in Terraform .tf files or commit them to version control. Always use environment variables, short-lived tokens, or a secrets manager to protect credentials.
When you run terraform init, Terraform downloads the required provider plugins. During terraform plan and terraform apply, each provider uses the credentials currently available in your environment to authenticate with the cloud API.
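As a sketch, assuming AWS with credentials supplied through the environment, the provider block itself can stay free of secrets:

# No credentials here on purpose. The AWS provider reads
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment,
# or falls back to the shared credentials file.
provider "aws" {
  region = "eu-west-1"
}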
Typical Cloud Resources in Terraform
Cloud provisioning typically involves several types of resources that work together. You can think of these as layers. At the bottom you have networks, above that security policies, and above those compute and managed services.
In AWS you might provision a VPC for networking, subnets for public and private zones, security groups or network ACLs for access rules, and then EC2 instances or higher-level services such as RDS databases and load balancers. In Azure you work with resource groups, virtual networks, subnets, network security groups, virtual machines, and managed services such as Azure SQL. In GCP you interact with VPC networks, subnets, firewall rules, compute instances, and managed services such as Cloud SQL.
Terraform represents each of these as a separate resource block. Instead of clicking through a web console and entering data into forms, you write declarative configuration that records the desired settings. This configuration becomes the single source of truth for your cloud infrastructure.
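A compressed sketch of those layers in AWS terms, with hypothetical CIDR ranges and names:

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # hypothetical address space
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_security_group" "web" {
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# References such as aws_subnet.public.id create the dependencies
# Terraform uses to order its API calls.
resource "aws_instance" "app" {
  ami                    = "ami-0123456789abcdef0" # hypothetical
  instance_type          = "t3.micro"
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.web.id]
}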
Dealing with Regions, Availability Zones, and Projects
Every cloud provider has a concept of region. Many also have availability zones inside each region. In addition, providers group resources into accounts, subscriptions, or projects.
In Terraform, you normally set the region or location in the provider configuration. For AWS, you can specify the region argument to the provider, or rely on an environment variable such as AWS_REGION. For Azure, you choose a location for each resource. For GCP, you set a region or zone per resource as needed.
At a higher level, Terraform configurations can be organized per account, subscription, or project. For example, you might have one Terraform workspace or directory that manages the staging environment and another that manages production. Each environment points to a different AWS account, Azure subscription, or GCP project through its provider configuration and credentials.
Keeping regions and environments explicit in variables helps avoid accidental creation of resources in the wrong place. You might define variables such as var.region or var.project and pass them in for each environment.
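A sketch of that pattern, with variable names of our own choosing:

variable "region" {
  type        = string
  description = "Cloud region to deploy into"
}

provider "aws" {
  region = var.region
}

Each environment then supplies its own value, for example with terraform apply -var-file="production.tfvars".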
Safe Workflows with Plan and Apply
Cloud provisioning is powerful and also potentially destructive. A wrong configuration can delete or replace critical resources. Terraform helps reduce this risk with its plan and apply workflow.
You generate a plan with the terraform plan command. This produces a detailed preview of which resources will be created, changed, or destroyed. The plan only shows what Terraform intends to do. It does not make changes yet. You review it carefully, confirm that the changes are expected, and only then run terraform apply to execute the plan.
In automation systems you often run terraform plan first and save the resulting plan to a file. A human can review and approve it, and a later step then runs terraform apply with that specific plan file. This pattern gives you an additional safety check for sensitive environments such as production.
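Sketched as a command sequence for such a pipeline:

# Step 1: produce and save a plan for review.
terraform plan -out=tfplan

# Step 2, after approval: apply exactly that saved plan.
# Terraform rejects the plan if the state has changed in the meantime.
terraform apply tfplan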
Always review the terraform plan output before you run terraform apply on real cloud environments. Pay particular attention to any resources scheduled for destruction or replacement.
Working with State in a Cloud Context
Terraform keeps a state file that records what it has created. When you run plan or apply, Terraform compares the configuration with this state, and with the real state in the cloud, to decide what to change.
For cloud infrastructure you rarely keep state local for long. Instead you store it remotely, for example in an S3 bucket for AWS, in Azure Blob Storage, or in a GCS bucket. This is configured through a backend block in your Terraform configuration. A remote backend lets multiple users or CI systems share the same view of the infrastructure state and reduces the risk of inconsistent changes.
The backend can also provide state locking so that only one apply runs at a time. This prevents two people from concurrently modifying the same infrastructure with conflicting changes.
When you work with multiple environments or cloud accounts you usually use separate state files. Each environment might have its own backend configuration so that development, testing, and production do not share state and cannot accidentally overwrite each other.
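A sketch of a remote backend on AWS, with hypothetical bucket and table names; the Azure Blob Storage and GCS backends follow the same pattern with their own arguments:

terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical bucket
    key            = "production/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-terraform-locks" # hypothetical lock table
  }
}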
Managing Changes and Drift in the Cloud
Infrastructure changes over time. You might add more instances, adjust firewall rules, or upgrade a database. Sometimes people change resources manually in the cloud console instead of through Terraform. This creates drift between the configuration and the real environment.
When Terraform runs plan, it queries the cloud provider to obtain the real state. If someone has changed a resource manually, Terraform sees that the real properties do not match the configuration. Depending on the difference, the plan might propose an in-place update or a full replacement of the resource.
This makes Terraform useful not only for provisioning but also for detecting configuration drift. You can treat manual changes in the console as exceptions that need to be corrected. Some teams treat any non-Terraform change as a defect and rely on Terraform to bring the infrastructure back into line with the configuration.
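If you only want to inspect drift without proposing configuration changes, recent Terraform versions support a refresh-only plan:

# Shows how the real infrastructure differs from the recorded state,
# without planning any changes to the infrastructure itself.
terraform plan -refresh-only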
In some situations, Terraform cannot fully detect or control certain aspects of resources. Providers mark some attributes as read-only or computed. For stable automation you should design your infrastructure so that Terraform remains the primary way to change it, and avoid manual changes wherever possible.
Multi-Cloud and Hybrid Scenarios
Terraform can manage multiple providers in a single configuration. This allows multi-cloud scenarios where some resources live in AWS, some in Azure, and others in GCP, or hybrid setups where Terraform handles both cloud and on-premises resources.
In practice, you can declare several provider blocks and then reference them from specific resources. Your configuration can then wire together elements from different clouds, for example by creating DNS records in one provider that point to load balancers in another.
While this is technically possible, consider the operational complexity: each provider has its own credentials, regions, and networking model. Many organizations instead maintain separate Terraform configurations or modules per cloud, with a higher-level process to coordinate them.
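A sketch of that wiring, with hypothetical names throughout; here a DNS record in AWS Route 53 points at an address reserved in GCP:

provider "aws" {
  region = "eu-west-1"
}

provider "google" {
  project = "example-project" # hypothetical project ID
  region  = "europe-west1"
}

# A static external IP address in GCP.
resource "google_compute_address" "lb" {
  name = "example-lb-address"
}

# A DNS record in AWS pointing at the GCP address.
resource "aws_route53_record" "lb" {
  zone_id = "Z0000000EXAMPLE" # hypothetical hosted zone ID
  name    = "app.example.com"
  type    = "A"
  ttl     = 300
  records = [google_compute_address.lb.address]
}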
Using Modules for Reusable Cloud Patterns
Cloud provisioning frequently involves repeating the same patterns. You might create a network with public and private subnets, an application load balancer, and a group of instances behind it, again and again. Instead of rewriting this configuration each time, you can package it as a module.
A module abstracts a cloud pattern into a reusable unit with inputs and outputs. Inputs might include network CIDR ranges, instance types, or allowed IP ranges. Outputs might include the IDs of subnets or the DNS name of a load balancer. You then call this module from different environments, supplying different variable values for each case.
This approach helps standardize infrastructure and makes it easier to provision new environments. For example, a development environment can use the same module as production but pass smaller instance sizes and fewer instances, while keeping the same structure.
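A sketch of two such module calls, assuming a hypothetical local module at ./modules/web-stack with the inputs shown; in practice each call would live in its own environment configuration rather than side by side:

module "web_stack_dev" {
  source = "./modules/web-stack" # hypothetical module path

  vpc_cidr       = "10.10.0.0/16"
  instance_type  = "t3.micro"
  instance_count = 1
}

module "web_stack_prod" {
  source = "./modules/web-stack"

  vpc_cidr       = "10.20.0.0/16"
  instance_type  = "m5.large"
  instance_count = 4
}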
In larger organizations, infrastructure teams maintain a catalog of approved modules for common patterns. Application teams then assemble these modules to build their services in the cloud.
Handling Sensitive Values and Outputs
Cloud resources often require sensitive data, such as database passwords, API keys, or private keys. Your configuration and workflow should never print or store such data in a way that exposes it by default. Terraform supports sensitive variables and sensitive outputs, which are redacted in CLI output.
However, the state file can still contain these values. This is one more reason to protect the backend that stores state. Use access controls and encryption on the backend storage. Combine Terraform with secret management systems so that long-lived secrets do not stay in plain text in configuration or state.
When provisioning in the cloud, many teams generate credentials outside of Terraform and inject only references to them, for example through an ID in a secrets manager. Terraform then uses that ID to wire resources to the secret, but never sees its value directly.
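A sketch of both ideas, assuming AWS Secrets Manager and a hypothetical secret name:

# A sensitive input: redacted in plan and apply output, but it can
# still end up in the state file, so the backend must be protected.
variable "db_password" {
  type      = string
  sensitive = true
}

# The reference pattern: look up an existing secret by name and wire
# its ARN to resources, so Terraform never handles the value itself.
data "aws_secretsmanager_secret" "db_password" {
  name = "prod/db/password" # hypothetical secret name
}

output "db_password_secret_arn" {
  value = data.aws_secretsmanager_secret.db_password.arn
}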
Provisioning Strategies for Different Environments
You rarely provision everything in the same way for all environments. Development and test environments might be ephemeral and replaced frequently. Production environments are long-lived and must be changed carefully.
With Terraform you can handle this by splitting environments into different directories or workspaces, using the same modules but with different variables. You may also define different backends so that each environment has independent state.
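One common layout, sketched with hypothetical names; each environment directory has its own backend and variable values but reuses the same modules:

environments/
  dev/
    main.tf            # calls the shared modules
    backend.tf         # dev state backend
    terraform.tfvars   # small sizes, dev region
  prod/
    main.tf
    backend.tf         # separate production state backend
    terraform.tfvars   # production sizes and counts
modules/
  web-stack/
    main.tf
    variables.tf
    outputs.tf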
For ephemeral environments you can use terraform destroy to remove resources when they are no longer needed. In a continuous integration pipeline, you might spin up an entire test stack for each run, execute integration tests, and then destroy the stack afterward. For production environments you usually avoid frequent full destroys and instead rely on incremental updates that preserve stateful components like databases.
Cloud provisioning with Terraform becomes more powerful as you combine these strategies. You can have repeatable, fully documented infrastructure for every environment, controlled from versioned code, while still respecting the different lifecycles and risk profiles of each environment.