Understanding Linux on Google Cloud Platform
Running Linux on Google Cloud Platform (GCP) largely means working with Linux-based virtual machines and managed services that themselves run on Linux behind the scenes. This chapter focuses on how Linux is used specifically in GCP, and what is different compared to other clouds.
GCP’s Linux Building Blocks
Google Compute Engine (GCE) Linux VMs
The core place you explicitly run Linux on GCP is Google Compute Engine:
- VM instances: Virtual machines running Linux distributions.
- Machine families: `e2`, `n2`, `c2`, `t2d`, etc., affect performance and cost, but the Linux usage is the same.
- Images: Prebuilt OS images; for Linux you will commonly see:
- Debian (Google’s default in many examples)
- Ubuntu
- CentOS / Rocky Linux / RHEL
- SUSE / openSUSE
- Container-Optimized OS (Google’s minimal OS for containers)
- Custom images (your own Linux builds)
Linux on GCE is typically accessed via SSH, often using your local terminal or the browser-based SSH client in the console.
Linux in GCP Managed Services
Many managed services on GCP run on Linux under the hood, even if you do not manage the OS directly:
- Google Kubernetes Engine (GKE): Linux nodes (unless you choose Windows node pools).
- Cloud Run / Cloud Functions: Run containers and functions based on Linux environments.
- App Engine Flexible: Linux containers.
- Cloud SQL, Memorystore, and others: Linux OS is abstracted but you interact as if you were talking to a Linux-hosted service.
This chapter focuses on when you do manage Linux directly (mainly Compute Engine), plus some GCP-specific tools and patterns around it.
Creating and Managing Linux VMs on GCE
Choosing a Linux Image
When creating a VM:
- In the console, under Boot disk, you pick:
- OS family (e.g., Debian, Ubuntu, Rocky Linux, etc.)
- Version (e.g., Debian 12, Ubuntu 22.04)
- Disk type (Standard persistent, Balanced, or SSD)
- From the `gcloud` CLI, you use flags like `--image-family=debian-12` and `--image-project=debian-cloud`.
Most official images come with GCP guest agents and basic integration tools (metadata access, logging, etc.).
Basic VM Creation with `gcloud`
From your local shell or Cloud Shell:
gcloud compute instances create my-linux-vm \
--zone=us-central1-a \
--machine-type=e2-micro \
--image-family=debian-12 \
--image-project=debian-cloud \
--boot-disk-size=20GB
Key Linux-specific considerations:
- Disk size and type will directly affect I/O performance of your Linux filesystem.
- The base image determines default packages, `systemd` services, and available tools.
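Before creating a VM, you can check which concrete image a family currently resolves to; a quick sketch using the same family and project as the flags above:

```shell
# Resolve the debian-12 image family to its current image
gcloud compute images describe-from-family debian-12 \
  --project=debian-cloud \
  --format="value(name,creationTimestamp)"
```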
Connecting via SSH
GCP strongly encourages key-based SSH. Typical methods:
- Cloud Console:
- Click your VM → the `SSH` button opens a browser-based terminal.
- Cloud Shell:
- Runs a Linux container in your browser, and can SSH without local key setup.
- Local `gcloud`:
gcloud compute ssh my-linux-vm --zone=us-central1-a
- Direct `ssh`:
Once keys and firewall rules are configured:
ssh USER@EXTERNAL_IP
GCP manages SSH keys via instance and project metadata; when you connect via `gcloud compute ssh`, it will:
- Generate a key pair (if you don’t have one).
- Add your public key to the instance/project metadata.
- Use it to authenticate.
On the Linux VM, these keys show up in the user's `~/.ssh/authorized_keys`.
Using Startup Scripts (Linux Metadata)
Compute Engine can run custom scripts on Linux during boot using metadata:
- Instance metadata key: `startup-script`
- Custom script example:
# Simple Debian/Ubuntu example
#!/bin/bash
apt-get update -y
apt-get install -y nginx
systemctl enable nginx
systemctl start nginx
Attach via:
gcloud compute instances create web-vm \
--metadata-from-file startup-script=./startup.sh
This is a lightweight way to bootstrap Linux instances (install packages, configure services) without full-blown configuration management.
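When a startup script does not behave as expected, the guest environment logs its output; on recent Debian/Ubuntu images you can usually inspect it like this (the unit name is the one shipped by the guest environment and may vary by distro):

```shell
# View startup-script output via the systemd journal (assumed unit name)
sudo journalctl -u google-startup-scripts.service --no-pager

# Startup script output is also typically mirrored to the system log
sudo grep -i startup-script /var/log/syslog | tail -n 20
```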
GCP-Specific Linux Agents and Tools
Guest Environment and Metadata Access
Linux VMs on GCP often include the Google guest environment, providing:
- Access to the metadata server at `http://169.254.169.254/computeMetadata/v1/`
- Tools like `google_guest_agent` (service names vary slightly by distro)
Example: Query hostname from inside Linux:
curl -H "Metadata-Flavor: Google" \
http://169.254.169.254/computeMetadata/v1/instance/hostname
Typical metadata uses:
- Discover instance name, zone, project from Linux.
- Fetch custom metadata values you define (e.g., `app_env=prod`).
- Support dynamic configuration scripts.
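Custom metadata values appear under `instance/attributes/`; assuming you set `app_env=prod` on the instance, a script on the VM could read it back like this:

```shell
# Read a custom metadata attribute from inside the VM
APP_ENV=$(curl -s -H "Metadata-Flavor: Google" \
  "http://169.254.169.254/computeMetadata/v1/instance/attributes/app_env")
echo "Running in environment: $APP_ENV"
```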
Logging and Monitoring Agents
To send Linux logs and metrics to GCP:
- Ops Agent (the current recommended agent) on Linux:
- Collects system logs (`/var/log/syslog`, `/var/log/messages`, etc.).
- Collects metrics: CPU, memory, disk, network.
- Can be configured via YAML to collect application logs.
Install example (Debian/Ubuntu):
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install
Then configure log/metric pipelines in the config file at `/etc/google-cloud-ops-agent/config.yaml`.
This effectively connects your Linux system internals to Cloud Logging and Cloud Monitoring.
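After installing and configuring the agent, a quick health check (service name as shipped by the official package):

```shell
# Confirm the Ops Agent is installed and running
sudo systemctl status google-cloud-ops-agent --no-pager
```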
Networking and Firewalls for Linux VMs
VPC and Internal Addresses
Each Linux VM gets:
- An internal IP in a VPC subnet.
- Optionally, an external IP for internet access.
From inside Linux, the usual networking commands (`ip`, `ss`, `ping`) behave as on any other system, but the network topology is controlled by VPC settings.
Google Cloud Firewalls vs Linux Firewalls
On GCP, there are two layers:
- VPC firewall rules:
- Managed at the project/VPC level.
- Applied based on tags, network, and IP ranges.
- Example: allow TCP 22 (SSH) from your IP only.
- Linux host firewall (e.g., `iptables`, `nftables`, `ufw`, `firewalld`):
- Runs inside the VM.
- Controls traffic after it reaches the VM.
Common patterns:
- Use VPC firewalls as the main gate (e.g., allow inbound 22, 80, 443).
- Optionally harden further with Linux firewall rules for defense-in-depth.
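A sketch of such a VPC-level gate, allowing SSH only from an example office range (the CIDR and tag here are placeholders):

```shell
# Allow SSH only from one source range, to instances tagged ssh-allowed
gcloud compute firewall-rules create allow-ssh-office \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:22 \
  --source-ranges=203.0.113.0/24 \
  --target-tags=ssh-allowed
```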
External Access and SSH Security
For Linux on GCP, consider:
- Restricting SSH access:
- Limit `0.0.0.0/0` rules; use your IP or VPN IP ranges.
- Disable password SSH auth in `/etc/ssh/sshd_config` (key-based only).
- Using IAP (Identity-Aware Proxy) for SSH:
- Access VMs via IAP tunneling, avoiding a public IP entirely.
- Works via `gcloud compute ssh --tunnel-through-iap`.
With IAP, your Linux VM may have only an internal IP; the SSH flow goes through GCP and IAM-controlled access rather than the open internet.
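For IAP-based SSH to work, a firewall rule must admit Google's IAP TCP forwarding range (35.235.240.0/20) on port 22; a minimal sketch:

```shell
# Allow IAP's forwarding range to reach SSH on VMs in the network
gcloud compute firewall-rules create allow-iap-ssh \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:22 \
  --source-ranges=35.235.240.0/20

# Connect to a VM that has no external IP
gcloud compute ssh my-linux-vm --zone=us-central1-a --tunnel-through-iap
```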
Storage on Linux VMs in GCP
Persistent Disks
Linux VMs typically use Persistent Disks (PD) as block devices:
- Standard, Balanced, and SSD options.
- Appear as `/dev/sdX` or `/dev/nvmeXnY` inside Linux.
- You format and mount them using normal Linux tools.
Example: Attaching and mounting an extra disk on Linux:
- Create and attach a disk (`gcloud` or console).
- On the VM:
lsblk # find the new device, e.g., /dev/sdb
sudo mkfs.ext4 /dev/sdb
sudo mkdir /data
sudo mount /dev/sdb /data
- For automatic mounting, find the UUID with `sudo blkid /dev/sdb` and add a line to `/etc/fstab`:
UUID=xxxx-xxxx /data ext4 defaults 0 2
Snapshots and Images from Linux Disks
GCP can snapshot disks backing a Linux VM:
- Snapshots are crash-consistent by default.
- For critical systems (databases), coordinate from Linux:
- Flush filesystem buffers (`sync`).
- Use a filesystem freeze (`fsfreeze`) where applicable.
- Or put services into a consistent state before the snapshot.
Disk snapshots and images are managed from GCP, but the consistency and application state are controlled from within the Linux OS.
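Putting those steps together, an application-consistent snapshot of a data disk could be coordinated like this (disk, zone, and mount point are placeholders; keep the freeze window as short as possible):

```shell
# Flush buffers and freeze the filesystem before snapshotting
sudo sync
sudo fsfreeze -f /data

# Take the snapshot while writes are quiesced
gcloud compute disks snapshot my-data-disk \
  --zone=us-central1-a \
  --snapshot-names=my-data-snap

# Unfreeze so applications can resume writing
sudo fsfreeze -u /data
```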
Using Containers and Kubernetes (Linux-Focused) on GCP
While there are separate chapters for containers and Kubernetes, here is how Linux plays into GCP’s container offerings.
Linux Container Hosts on GCP
Use Container-Optimized OS (COS) or Ubuntu as minimal, hardened Linux for:
- Single-container workloads on Compute Engine.
- Running Docker/Podman manually.
- Custom Kubernetes clusters.
Typical pattern:
- Create a VM from a COS or Ubuntu image.
- Install Docker or Podman (not needed on COS, which ships with a container runtime).
- Deploy your containerized applications using Linux tools (systemd units, Docker Compose, etc.).
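On a plain Ubuntu VM, that pattern can be as simple as the following (using the `docker.io` package from Ubuntu's own archive; on COS the install step is unnecessary):

```shell
# Install Docker from Ubuntu's repositories and start a container
sudo apt-get update
sudo apt-get install -y docker.io
sudo systemctl enable --now docker
sudo docker run -d --name web -p 80:80 nginx
```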
GKE Node OS
GKE node pools use Linux images such as:
- COS (default, optimized for containers)
- Ubuntu-based nodes (for more flexibility/customization)
You do not usually manage these nodes like ordinary Linux servers, but:
- You might `ssh` into nodes for debugging.
- Linux concepts (cgroups, namespaces, iptables) still apply under the hood.
Identity and Access Management for Linux on GCP
IAM vs Linux Users
There are two identity layers:
- GCP IAM:
- Controls who can create/modify/delete VMs, connect with IAP, etc.
- Users are Google accounts or service accounts.
- Linux user accounts:
- Defined in `/etc/passwd` and `/etc/group`.
- Control what a user can do once logged into the VM (`sudo`, file ownership, etc.).
Typical workflow:
- Grant a developer IAM roles like:
- `roles/compute.instanceAdmin.v1`
- `roles/compute.osLogin`
- Use OS Login to map IAM users to Linux accounts.
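Granting such a role is a single IAM binding; the project ID and user here are placeholders:

```shell
# Give a developer OS Login access at the project level
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="user:dev@example.com" \
  --role="roles/compute.osLogin"
```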
OS Login Integration
OS Login lets IAM control Linux login access:
- Instead of managing SSH keys in instance metadata, keys are stored with the user’s Google account.
- When enabled, Linux usernames are derived from and tied to IAM identities.
From an admin perspective:
- Enable OS Login via metadata:
- Project metadata: `enable-oslogin=TRUE`
- Manage login privileges via IAM roles:
- `roles/compute.osLogin`
- `roles/compute.osAdminLogin` (includes `sudo` access)
On the Linux VM, `sshd` uses PAM and NSS modules to integrate with OS Login, so IAM decisions determine whether a user can log in and what groups/privileges they get.
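Enabling OS Login project-wide and then connecting looks like this (using the metadata key described above):

```shell
# Turn on OS Login for all VMs in the project
gcloud compute project-info add-metadata \
  --metadata enable-oslogin=TRUE

# SSH as usual; the Linux username is derived from your Google identity
gcloud compute ssh my-linux-vm --zone=us-central1-a
```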
Automating Linux Workloads on GCP
Using `gcloud` and Cloud Shell
For automation, you often:
- Use `gcloud` commands from:
- A CI pipeline
- A developer workstation
- Cloud Shell (a Linux environment managed by GCP)
- Combine `gcloud` with shell scripting to:
- Spin up / tear down Linux VMs.
- Run startup scripts.
- Configure firewall rules.
Example script snippet:
#!/usr/bin/env bash
set -e
PROJECT_ID=my-gcp-project
ZONE=us-central1-a
NAME=web-$(date +%s)
gcloud config set project "$PROJECT_ID"
gcloud compute instances create "$NAME" \
--zone="$ZONE" \
--machine-type=e2-small \
--image-family=debian-12 \
--image-project=debian-cloud \
--metadata-from-file startup-script=./web-startup.sh
Linux scripting skills directly translate to orchestrating GCP resources.
Service Accounts from Linux
For Linux-based applications that need to call GCP APIs:
- Attach a service account to the VM.
- Use the metadata server to obtain an access token:
TOKEN=$(curl -H "Metadata-Flavor: Google" \
"http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token" \
| jq -r .access_token)
curl -H "Authorization: Bearer $TOKEN" \
"https://www.googleapis.com/storage/v1/b?project=PROJECT_ID"
(The bucket list endpoint requires a `project` query parameter; replace `PROJECT_ID` with your project ID.)
Or use Google Cloud client libraries on Linux, which automatically fetch tokens when running on a VM with a service account.
Best Practices for Linux on GCP
- Prefer OS Login over manually managed SSH keys.
- Limit public IP exposure; use IAP or VPN where possible.
- Use Ops Agent for logging and monitoring Linux instances centrally.
- Tag instances and use labels to organize Linux VMs by environment, role, or team.
- Use instance templates and managed instance groups to scale stateless Linux workloads instead of manually managing many VMs.
- Combine Linux configuration management tools (Ansible, Puppet, Chef, etc.) with GCP features (metadata, instance groups) for reproducible infrastructure.
By understanding how Linux integrates with Compute Engine, IAM, networking, and storage on GCP, you can design systems that use familiar Linux tools while benefiting from GCP’s automation, scalability, and managed services.