Understanding Cloud Computing in a Linux Context
Cloud computing is mostly built on Linux. As a Linux user or admin, the cloud changes how you run systems, not what Linux is. This chapter focuses on what is specific about using Linux in cloud environments, regardless of provider.
Key ways cloud Linux differs from traditional servers:
- You usually don't install the OS manually; you choose an image.
- Servers are ephemeral: you create, destroy, and recreate them often.
- Almost everything is controlled via APIs and automation.
- Networking, storage, and identity are provided by the cloud platform, not by you directly.
We’ll keep things provider‑neutral here; later chapters will cover AWS/Azure/GCP specifically.
Linux in the Cloud: Core Concepts
Compute Instances (Virtual Machines)
In the cloud, Linux usually runs as a virtual machine (VM):
- Providers offer Linux images (Ubuntu, Debian, Rocky, etc.).
- You choose:
  - Instance type (vCPU, RAM)
  - Attached storage (size, type)
  - Network settings
- You connect primarily using SSH, often with key-based authentication.
Conceptually, this is similar to running Linux on KVM/VirtualBox, but:
- Instances are created through the provider's API, whether you use the web console, a CLI, or a tool such as Terraform.
- Hardware is abstracted and shared; you don’t see the hypervisor.
- You pay for runtime, not hardware purchase.
Cloud vs Traditional Linux Servers
Common differences:
- Identity: instances are often identified by metadata (instance ID, tags) instead of fixed hostnames and IPs.
- Configuration: big emphasis on cloud-init, configuration management (Ansible, Puppet), and images, not manual tweaking.
- Network: IPs can be dynamic, public IPs can be attached/detached, and firewalls are often security groups managed by the provider.
- Storage: root disks are typically virtual block devices, plus optional network-attached volumes.
Practically, you still log in and see a familiar Linux: systemd, /etc, /var/log, etc. The difference is how it gets there and how you manage it.
Linux Images and Provisioning
Linux Images
A cloud image is a prebuilt disk image containing:
- A specific distribution and version
- Default packages and settings
- Cloud integration tools:
  - cloud-init (very common)
  - Agent daemons (for monitoring, metrics, etc., depending on provider)
You typically choose among:
- Distribution images (Ubuntu Server, AlmaLinux, Debian, etc.)
- Vendor-customized images (preinstalled agents, hardened configs)
- Community or marketplace images (application stacks, appliances)
- Custom images you build yourself (via Packer, image builder tools)
Key Linux aspects to be aware of:
- Default user (e.g. ubuntu, ec2-user, cloud-user), which you SSH into.
- Disk layout (single root, separate /var, LVM, etc.).
- Init system (almost always systemd now, but images can differ).
Cloud-Init Basics for Linux
cloud-init is a standard tool preinstalled on many cloud images:
- Runs at boot: most of its work happens once, on the first boot of an instance, while some steps can run on every boot.
- Reads instance metadata and user data from the cloud.
- Applies:
  - Hostname and networking
  - SSH keys and users
  - Package installation and upgrades
  - Custom scripts and configuration
You usually supply a cloud-init user-data file in YAML.
Example (very common pattern):
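#cloud-config
# Refresh the package index, then install nginx.
package_update: true
packages:
  - nginx
# Create a sudo-capable user who logs in with an SSH key.
users:
  - name: devuser
    groups: sudo
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh_authorized_keys:
      - ssh-rsa AAAA... your_public_key_here
# Commands run late in the first boot.
runcmd:
  - systemctl enable --now nginx
Important Linux-side points: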
- cloud-init logs are written to /var/log/cloud-init.log and /var/log/cloud-init-output.log.
- After boot, you can inspect what cloud-init did under /var/lib/cloud/.
- You can re-run some cloud-init stages using its CLI (varies by distribution).
cloud-init is central to boot-time provisioning of cloud Linux instances.
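When user data is generated by a script rather than written by hand, it can help to build it from a data structure. The following is a minimal sketch, not a definitive approach: it assumes the YAML parser used by cloud-init accepts a JSON body (JSON is generally valid YAML), and the function name render_user_data and the sample key are hypothetical.

```python
import json


def render_user_data(username, public_key, packages):
    """Return cloud-init user data as text.

    The body is JSON, which YAML parsers generally accept; this avoids a
    third-party YAML dependency in this sketch.
    """
    config = {
        "package_update": True,
        "packages": list(packages),
        "users": [
            {
                "name": username,
                "shell": "/bin/bash",
                "ssh_authorized_keys": [public_key],
            }
        ],
    }
    # The payload is only treated as cloud-config if its first line
    # is "#cloud-config".
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


# Placeholder key: substitute a real public key before use.
user_data = render_user_data("devuser", "ssh-ed25519 AAAA... devuser@example", ["nginx"])
print(user_data)
```

The resulting text is what you would pass as user data when launching the instance; how it is passed depends on the provider.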
Identity and Access: SSH Keys and Metadata
SSH Key-Based Access
Cloud Linux instances normally disable password SSH login by default:
- You provide a public key when creating the instance.
- The cloud platform injects it into ~defaultuser/.ssh/authorized_keys (e.g. /home/ubuntu/.ssh/authorized_keys).
- You connect using your private key:
ssh -i ~/.ssh/id_rsa ubuntu@PUBLIC_IP
On the instance, you manage SSH like any other Linux system:
- Server config: /etc/ssh/sshd_config
- Authorized keys: ~user/.ssh/authorized_keys
- System users and groups as usual.
Cloud-specific considerations:
- Avoid embedding private keys in images or scripts.
- Use per-user key pairs, not shared organization-wide keys.
- Consider combining SSH keys with cloud IAM (e.g., ephemeral credentials, SSM-like solutions) for better control.
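To check that an instance really carries per-user keys rather than one shared key, you can list what is in an authorized_keys file. This is a rough sketch under simplifying assumptions: it splits lines on whitespace, so option fields containing quoted spaces are not handled, and the function name and sample content are illustrative only.

```python
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def list_authorized_keys(text):
    """Return (key_type, comment) for each key line in authorized_keys text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # A line may begin with an options field (e.g. "no-pty"), so the key
        # type is taken to be the first field with a known prefix.
        for i, field in enumerate(fields):
            if field.startswith(KEY_TYPE_PREFIXES):
                # Layout is: type, base64 key, then an optional comment.
                comment = " ".join(fields[i + 2:])
                entries.append((field, comment))
                break
    return entries


sample = """\
# managed by provisioning
ssh-ed25519 AAAAC3Nz... alice@laptop
no-pty ssh-rsa AAAAB3Nz... deploy key
"""
for key_type, comment in list_authorized_keys(sample):
    print(key_type, comment)
```

The comment field is free text, so it is only a hint about who owns a key, not proof.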
Instance Metadata Services
Most clouds expose an Instance Metadata Service (IMDS) to Linux instances:
- A special HTTP endpoint reachable only from the instance (e.g., at a fixed IP).
- Provides:
  - Instance ID, hostname, region, AZ
  - Attached networking info
  - Temporary credentials (for roles/service accounts)
- Accessed like:
curl http://169.254.169.254/...
Use cases on Linux:
- Automatically configure hostnames, log tags, or prompts (PS1) using instance ID or tags.
- Fetch credentials without hardcoding secrets in config files.
- Detect environment (prod vs dev) dynamically.
Security notes:
- Treat metadata responses as sensitive data: they can contain credentials.
- Do not let web applications expose metadata, for example by fetching arbitrary URLs on behalf of a client.
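A script that reads metadata should cope with running outside the cloud, where the endpoint does not answer. The sketch below uses only the standard library. The paths under the metadata address are provider-specific and deliberately left as a parameter; the function names are hypothetical.

```python
import urllib.error
import urllib.request

METADATA_BASE = "http://169.254.169.254"


def metadata_url(path):
    """Join the fixed metadata address with a provider-specific path."""
    return METADATA_BASE + "/" + path.lstrip("/")


def fetch_metadata(path, headers=None, timeout=2.0):
    """Return the response body as text, or None if the request fails."""
    request = urllib.request.Request(metadata_url(path), headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        # Outside the cloud (or when access is blocked) the address does
        # not answer, so report "no metadata" instead of raising.
        return None
```

Some providers require a specific request header or a session token before they answer, which is what the headers parameter is for. The short timeout keeps the script from hanging on machines that have no metadata service.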
Networking for Linux in Cloud Environments
Cloud networking is built around virtual networks and security groups. From Linux’s point of view, most of this is just another network interface, but there are differences.
Network Interfaces and IP Addressing
On a cloud Linux instance, you’ll see one or more interfaces:
- Typically named like ens3, eth0, ens160, etc.
- Configured via:
  - systemd-networkd, NetworkManager, or distro-specific tools.
  - DHCP or static configuration provided by the cloud.
Public vs private IP:
- You often get:
  - Private IP: used within the virtual network.
  - Public IP: mapped to the instance via NAT or direct association.
- Linux itself usually knows only about the addresses assigned to its interfaces; when a public IPv4 address is mapped via NAT, the provider performs the translation outside the instance.
Inside the instance:
- Check interfaces with ip addr and ip route.
- DNS configuration lives in /etc/resolv.conf (may be managed by systemd-resolved).
Cloud-specific situations:
- IPs can change when you stop/start or recreate instances.
- Extra network interfaces can be attached or removed dynamically.
- You might rely on internal DNS names (provided by the cloud) instead of hardcoded IPs.
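Because addresses can change, scripts are better off reading them at run time than hardcoding them. As a sketch, the function below parses the JSON that ip -json addr prints and reports each interface's addresses. The sample data is made up, and on a NAT-mapped instance you would expect to find only the private address here.

```python
import ipaddress
import json


def interface_addresses(ip_json_text):
    """Map interface name -> list of (address, is_private) tuples.

    Expects the JSON array printed by `ip -json addr`.
    """
    result = {}
    for iface in json.loads(ip_json_text):
        addresses = []
        for info in iface.get("addr_info", []):
            if "local" not in info:
                continue  # entry without a local address
            addr = ipaddress.ip_address(info["local"])
            addresses.append((str(addr), addr.is_private))
        result[iface["ifname"]] = addresses
    return result


# Made-up sample in the shape of `ip -json addr` output.
sample = json.dumps([
    {"ifname": "ens3",
     "addr_info": [{"family": "inet", "local": "10.0.1.25", "prefixlen": 24}]},
])
print(interface_addresses(sample))
```

On a real instance you could feed it the output of subprocess.run(["ip", "-json", "addr"], capture_output=True, text=True).stdout.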
Security Groups and Local Firewalls
Cloud providers use security groups or similar concepts as a virtual firewall per instance or interface:
- They filter traffic before it reaches your Linux kernel.
- Rules typically specify:
  - Allowed protocols and ports (e.g., TCP 22, 80, 443)
  - Allowed sources (CIDR ranges, other security groups)
On Linux itself:
- You can still (and often should) use:
  - iptables / nftables
  - Higher-level tools like ufw or firewalld (covered elsewhere)
Good practice:
- Use security groups as the first layer of defence.
- Use host firewalls for more granular or app-specific controls.
- Avoid relying solely on Linux firewall rules where possible: security groups are easier to centralize and audit.
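To make the rule model concrete, here is a small simulation of how a typical security group evaluates inbound traffic. It is a simplified sketch that assumes allow-only rules with a default deny and CIDR sources only. The class and function names are hypothetical and do not correspond to any provider's API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    protocol: str      # "tcp" or "udp"
    port_from: int     # first allowed port (inclusive)
    port_to: int       # last allowed port (inclusive)
    source_cidr: str   # e.g. "0.0.0.0/0"


def is_allowed(rules, protocol, port, source_ip):
    """Return True if any rule matches; otherwise traffic is denied."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.port_from <= port <= rule.port_to
                and address in ipaddress.ip_network(rule.source_cidr)):
            return True
    return False  # default deny: nothing matched


rules = [
    IngressRule("tcp", 22, 22, "198.51.100.0/24"),   # SSH from an office range
    IngressRule("tcp", 443, 443, "0.0.0.0/0"),       # HTTPS from anywhere
]
print(is_allowed(rules, "tcp", 22, "203.0.113.9"))
```

A host firewall can then add finer rules on top, since it only ever sees traffic that already passed this filter.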
Load Balancers and Linux Services
Cloud load balancers distribute traffic to multiple Linux instances:
- Instances typically see incoming connections as coming from the load balancer’s IPs rather than from the original client.
- Applications may need to:
  - Trust headers like X-Forwarded-For or X-Forwarded-Proto.
  - Bind to a private IP and port, not a public IP.
Linux perspective:
- The service (e.g., Nginx, Apache) listens on 0.0.0.0:PORT, or on 127.0.0.1:PORT when a local reverse proxy sits in front of it.
- Scaling involves adding/removing Linux instances behind the load balancer, not changing the app config.
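Trusting X-Forwarded-For blindly lets any client spoof its address, so the header should only be honoured when the connection comes from a known proxy. The following sketch shows one common way to pick the client address. The function name and the trusted range are assumptions for illustration, and real frameworks usually provide their own setting for this.

```python
import ipaddress


def client_ip(remote_addr, x_forwarded_for, trusted_networks):
    """Return the most plausible client address for a proxied request."""
    trusted = [ipaddress.ip_network(n) for n in trusted_networks]

    def is_trusted(addr):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False  # not an IP address at all
        return any(ip in net for net in trusted)

    # Only honour the header when the TCP peer is one of our proxies.
    if not is_trusted(remote_addr) or not x_forwarded_for:
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk right to left: entries appended by our own proxies are trusted,
    # and the first untrusted entry is the address they actually saw.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else remote_addr


# A client-supplied "6.6.6.6" at the front of the header is ignored.
print(client_ip("10.0.0.5", "6.6.6.6, 203.0.113.9", ["10.0.0.0/16"]))
```

Here 10.0.0.0/16 stands in for the private range the load balancer connects from.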
Storage and Filesystems in the Cloud
Cloud storage affects how you think about disks and persistence.
Root Disks and Attached Volumes
Typical layout:
- Root disk: virtual block device containing /.
- Additional volumes: attached as xvdf, sdb, etc., and mountable like normal disks.
In Linux:
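- List devices: lsblk, fdisk -l
- Create a filesystem: mkfs.ext4 /dev/xvdf
- Mount: mount /dev/xvdf /data
- Persist mounts in /etc/fstab using device names, UUIDs, or labels. UUIDs or labels are the safer choice, because device names can change when volumes are reattached.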
Cloud-specific aspects:
- Volumes can be detached and reattached to another instance (for recovery or migration).
- Volumes have performance characteristics (IOPS, throughput) that impact DBs and heavy workloads.
- Root disks of ephemeral instances may be deleted on termination by default.
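Since volumes can be detached and device names are not guaranteed to stay the same, an /etc/fstab entry for a data volume is usually keyed on the filesystem UUID. The helper below is a sketch that only builds the line. The UUID shown is a made-up placeholder (blkid prints the real one), and the nofail option is included on the assumption that you want the instance to boot even when the volume is missing.

```python
def fstab_entry(uuid, mount_point, fs_type="ext4", options=("defaults", "nofail")):
    """Build one /etc/fstab line that refers to the filesystem by UUID."""
    if not mount_point.startswith("/"):
        raise ValueError("mount point must be an absolute path")
    fields = [
        f"UUID={uuid}",
        mount_point,
        fs_type,
        ",".join(options),
        "0",  # dump field: no dump backups
        "2",  # fsck pass: checked after the root filesystem
    ]
    return " ".join(fields)


# Placeholder UUID for illustration only.
line = fstab_entry("0a1b2c3d-1111-2222-3333-444455556666", "/data")
print(line)
```

The function does not touch /etc/fstab itself; review the line and test it with mount -a before relying on it.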
Ephemeral Storage vs Persistent Volumes
Many instance types offer two classes of storage:
- Persistent volumes (like network-attached disks):
  - Survive instance stop/termination if configured.
  - Good for databases, app state, logs you care about.
- Ephemeral or instance store:
  - Locally attached, very fast.
  - Data is lost when the instance stops or terminates.
  - Good for caches, temporary data, scratch space.
Linux usage patterns:
- Mount ephemeral storage under /tmp, /var/tmp, /scratch, or similar.
- Ensure critical data lives on persistent volumes.
- Do not rely on / for long-term state if the root disk is ephemeral or auto-deleted.
Object Storage vs Filesystems
Clouds provide object storage (e.g., buckets) which is not a traditional filesystem:
- Accessed over HTTP/HTTPS via APIs.
- You do not mount it like a normal block device (though some tools simulate this).
- Objects live in a flat namespace (key/value), not directories in the usual sense.
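The flat namespace is easier to picture with a small simulation. Object stores commonly let you list keys by prefix and delimiter, which makes keys containing "/" look like directories even though no directories exist. The sketch below emulates that idea in plain Python. The function name and the sample keys are invented, and it is not any provider's actual API.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Emulate a directory-style listing over a flat set of object keys.

    Returns (objects, common_prefixes): the keys directly under the prefix,
    and the next-level pseudo-directories.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # More path remains after the delimiter: report a pseudo-directory.
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "backups/2024/db.sql.gz",
    "backups/2025/db.sql.gz",
    "backups/README.txt",
    "logs/nginx/access.log",
]
print(list_prefix(keys, "backups/"))
```

Nothing here creates or removes a directory: deleting the last key under a prefix makes the pseudo-directory disappear.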
On Linux, you interact with object storage using:
- Command-line tools (often provider-specific).
- Libraries in your app.
- FUSE-based tools to mount buckets (with caveats: performance, consistency).
Common patterns:
- Store backups, logs, and large static files in object storage, not on instance disks.
- Use Linux cron jobs or systemd timers to sync data (e.g., via rclone or provider CLIs).
Automation and Immutable Infrastructure
Cloud platforms and Linux pair naturally when it comes to automation and immutable infrastructure.
Treating Instances as Disposable
Instead of manually maintaining long-lived servers, you:
- Create instances from images and configuration code.
- Apply configuration at boot with cloud-init and configuration management.
- Replace instances entirely when:
  - Upgrading OS versions or key software.
  - Changing base configuration.
On Linux:
- Avoid snowflake servers with many hand-applied changes.
- Store configuration in:
  - Git (for playbooks, templates, scripts).
  - Image build pipelines (e.g., Packer, CI/CD).
Bootstrapping with Cloud-Init and Configuration Management
Typical lifecycle:
- Instance starts from a base Linux image.
- cloud-init:
  - Configures network/hostname.
  - Adds SSH keys.
  - Runs minimal configuration or agent installation.
- A configuration management tool (Ansible, Puppet, etc.):
  - Installs services.
  - Applies application config.
  - Ensures systemd units are running.
From Linux’s perspective, these are just:
- Packages installed via normal package managers.
- Config files under /etc/.
- systemd units managed with systemctl.
The difference is how they are triggered (automatically at boot, not manually).
Observability: Logging and Monitoring in the Cloud
System Logs and Cloud Logging
Linux in the cloud still logs to:
- The systemd journal (systemd-journald), read with journalctl
- /var/log/* (traditional logs)
Cloud-native practices:
- Forward logs to centralized logging:
  - Agent daemons that read from journal/log files and ship to provider logging.
- Avoid relying solely on logs on the instance, because:
  - Instances are ephemeral.
  - Disks may be lost on termination.
Typical Linux work:
- Install and configure the logging agent as a service.
- Choose between:
  - Reading from journald (journalctl --output=json).
  - Tailing specific log files (/var/log/nginx/access.log, /var/log/syslog, etc.).
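Reading from journald as JSON gives a shipper structured fields instead of free-form lines. As a sketch of what an agent does with that stream, the function below takes lines in the format printed by journalctl --output=json (one JSON object per line) and keeps only the entries at error severity or worse. The function name and sample entries are illustrative.

```python
import json


def error_messages(journal_lines, max_priority=3):
    """Yield (unit, message) for sufficiently severe journal entries.

    Syslog priorities run from 0 (emerg) to 7 (debug), so a lower number
    means a more severe entry; 3 is "err".
    """
    for line in journal_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        message = entry.get("MESSAGE")
        priority = entry.get("PRIORITY")
        # Field values are exported as strings; messages that are not
        # plain text (or entries without a priority) are skipped here.
        if not isinstance(message, str) or priority is None:
            continue
        if int(priority) <= max_priority:
            yield entry.get("_SYSTEMD_UNIT", "unknown"), message


# Made-up entries in the journal's JSON shape.
sample = [
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "started"}',
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "bind() failed"}',
]
for unit, message in error_messages(sample):
    print(unit, message)
```

To try it against a live journal, pipe journalctl --output=json into a script that passes sys.stdin as journal_lines.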
Metrics and Health Checks
Instances expose:
- System metrics: CPU, memory, disk usage, network traffic.
- Application metrics: HTTP response times, error counts, etc.
In Linux:
- Use native tools for basic checks: top, htop, vmstat, iostat.
- Install cloud or third-party metrics agents.
- Implement simple health endpoints (e.g., /healthz) in your services.
The cloud platform uses these signals for:
- Auto-scaling (scale based on CPU, requests, custom metrics).
- Load balancer health checks (mark instance healthy/unhealthy).
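A health endpoint can be very small. The sketch below uses only the Python standard library: it answers 200 on /healthz when every check passes and 503 otherwise, which is the kind of signal load balancer health checks typically look for. The check in current_checks is a hypothetical placeholder, and port 8080 is an arbitrary choice.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(checks):
    """Return (status_code, body) from a mapping of check name -> bool."""
    healthy = all(checks.values())
    body = json.dumps({"status": "ok" if healthy else "fail", "checks": checks})
    return (200 if healthy else 503), body


def current_checks():
    # Hypothetical placeholder: replace with real dependency checks.
    return {"app": True}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        status, body = health_response(current_checks())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Serve /healthz until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling serve() starts the listener. In a real service the endpoint would normally live inside the application itself, so that it fails when the application does.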
Typical Cloud Linux Workflows
Bringing it all together, a practical sequence:
- Choose base image
  - E.g., Ubuntu LTS server image.
- Prepare user data
  - A small cloud-init config to:
    - Install base tools (htop, git, fail2ban, etc.).
    - Create a default user and set up SSH keys.
- Launch instance
  - With a proper security group (SSH + app ports).
  - With enough disk and correct volume types.
- Bootstrap configuration
  - cloud-init runs on first boot, then your config management.
- Connect via SSH
  - Use the configured user + key.
- Configure logs/metrics
  - Install provider agent or rsyslog/fluentd/promtail, etc.
- Harden and automate
  - Avoid manual drift; use automation for every repeatable change.
Everything you know about Linux still applies; cloud just gives you more programmatic control and expects you to treat servers as disposable resources.
Key Takeaways for Linux in the Cloud
- Most cloud infrastructure is Linux-based; understanding Linux gives you an advantage.
- Images + cloud-init + configuration management are the core building blocks.
- Networking and storage behave like normal Linux, with additional cloud abstractions layered on top.
- Embrace automation and immutability: don’t treat cloud Linux instances like unique pets.
- Centralize logs and metrics because instances can disappear at any time.
Subsequent chapters will show how these concepts map onto AWS, Azure, and GCP specifically when running Linux.