Understanding How AWS Uses Linux
AWS is heavily built around Linux. Most compute and many managed services run some flavor of Linux under the hood. For you as a user, “Linux on AWS” mainly means:
- Running Linux virtual machines (EC2)
- Running container workloads on Linux (ECS, EKS, Fargate)
- Using AWS-provided Linux-based images (Amazon Linux, Bottlerocket, etc.)
- Managing and automating those systems using AWS tools
This chapter focuses on what is specific about running and managing Linux in AWS, not on generic Linux administration or generic cloud concepts (covered elsewhere in the course).
Common Linux Options on AWS
Amazon Linux
Amazon Linux is AWS’s own distro, optimized for AWS:
- Available as Amazon Linux 2 (traditional) and Amazon Linux 2023 (AL2023)
- Systemd-based, RPM/yum/dnf package management
- Tight integration with AWS tooling (CloudInit, SSM Agent, awscli often pre-installed)
- Performance and security defaults tuned for EC2
Typical use cases:
- General-purpose servers (web apps, APIs, CI runners)
- “Baseline” image for automation and autoscaling groups
- Environments where you want long-term support and AWS-optimized kernels
Key specifics:
- Package manager: Amazon Linux 2 uses `yum`; Amazon Linux 2023 uses `dnf`
- Repositories are maintained by AWS and are not exact RHEL/CentOS clones
- Security defaults (e.g., SELinux configuration) may differ from other enterprise distros
Other Popular Linux Distributions on AWS
AWS offers many official AMIs (Amazon Machine Images):
- Ubuntu (LTS and interim releases)
- RHEL
- SUSE Linux Enterprise
- Debian
- Rocky/Alma (RHEL-compatible)
- Specialized images (NVIDIA GPU-optimized, Deep Learning AMIs, etc.)
You select the distro at instance launch time by choosing the appropriate AMI.
Things that are AWS-specific:
- Most official images come with:
- CloudInit configured for AWS metadata
- The EC2 instance metadata service (IMDS) enabled
- The appropriate virtualization drivers (ENI, NVMe, etc.)
- AWS Marketplace AMIs may include:
- Pre-installed software (databases, security agents, monitoring)
- Licensing bundled into the EC2 hourly price
Launching Linux EC2 Instances (What’s Unique)
Although you can launch a Linux instance entirely from the AWS console GUI, several aspects are specific to Linux on AWS:
AMIs (Amazon Machine Images)
An AMI is effectively a bootable template containing:
- A root filesystem (with a Linux distro installed)
- Bootloader and init system configured for AWS
- Permissions that control who can launch it
When you select an AMI:
- Choose the right architecture: `x86_64` vs `arm64`
- Check root volume type and size (e.g., gp3 EBS, 8 GB default)
- Confirm it supports your instance family (GPU, Graviton, etc.)
For automation (CLI or IaC), AMI IDs look like `ami-0abcdef1234567890`.
Different regions have different AMI IDs even for the “same” image.
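Because AMI IDs differ per region, hard-coding them is fragile. One way to resolve a current Amazon Linux 2023 AMI ID is AWS's public SSM parameter; the sketch below wraps that lookup in a hypothetical helper function (the parameter path shown is the published one for the default AL2023 x86_64 kernel image):

```shell
# Hypothetical helper: resolve the current AL2023 AMI ID for the configured
# region via AWS's public SSM parameter, instead of hard-coding an AMI ID.
latest_al2023_ami() {
  aws ssm get-parameter \
    --name /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
    --query 'Parameter.Value' --output text
}

# Usage (requires credentials and a configured region):
# latest_al2023_ami
```

The same parameter always points at the latest image, so automation picks up new AMI releases without code changes.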
Instance Types and Linux
Linux supports the broadest selection of EC2 instance types. Two Linux-relevant aspects:
- Architecture:
`t3`, `m5`, `c5`, etc. are x86_64; `t4g`, `m7g`, `c7g` are AWS Graviton (arm64)
- Workload-optimized families: `c*` for CPU-heavy, `m*` for general purpose, `r*` for memory-heavy, `p*`/`g*` for GPU workloads
Linux kernels in the AMIs are built to support the corresponding virtualization environment (Nitro, ENA network driver, NVMe for EBS).
SSH Key Pairs
Linux instances are normally accessed with SSH keys:
- You create a key pair in AWS (or upload your public key)
- At launch, you associate that key pair with the instance
- CloudInit injects the public key into the default user's `~/.ssh/authorized_keys`
Typical default usernames by distro:
- Amazon Linux: `ec2-user`
- Ubuntu: `ubuntu`
- Debian: `admin` or `debian` (varies by image)
- RHEL: `ec2-user`; CentOS / Rocky images often use `centos` / `rocky` (check image documentation)
- SUSE: `ec2-user` or `root` (check image documentation)
SSH connection example:

```shell
ssh -i /path/to/key.pem ec2-user@ec2-203-0-113-10.compute-1.amazonaws.com
```

Because this is AWS-specific, you must:
- Ensure the security group allows inbound SSH (tcp/22) from your IP
- Use the correct public DNS or IP from the instance description
- Keep the `.pem` file secure and read-only (`chmod 400 key.pem`)
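To avoid retyping the key path and username, you can record the connection details in your local `~/.ssh/config` (the alias and hostname below are placeholders):

```
Host my-ec2
    HostName ec2-203-0-113-10.compute-1.amazonaws.com
    User ec2-user
    IdentityFile /path/to/key.pem
```

After that, `ssh my-ec2` connects without extra flags.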
EC2 Linux Networking and Security (AWS-Specific Angles)
Security Groups vs Local Firewalls
In AWS, security groups act as a virtual firewall at the instance’s network interface level.
Typical pattern for Linux on AWS:
- Use security groups for coarse-grained access control:
- SSH from your IP range
- HTTP/HTTPS from the internet or a load balancer
- Optionally use a local firewall (e.g., `ufw`, `firewalld`, `iptables`) for:
- Extra internal segmentation
- Outbound restrictions
- Host-based policies
Remember:
- Even if Linux is listening on `0.0.0.0:22`, the instance is unreachable unless the security group allows port 22.
- For troubleshooting, check both:
- Security groups / NACLs
- Local firewall rules
Elastic IPs and DNS
Linux instances often need stable IPs:
- Elastic IP (EIP): a static public IPv4 that you can attach/detach from instances
- Inside the instance:
- You don’t manually configure the EIP; AWS routes it
- The Linux network interface sees only the instance’s private IP
Preferred practice:
- Use Route 53 (or another DNS) to map DNS names to EIPs or load balancers
- Configure applications on Linux to bind to the private IP or `0.0.0.0` and rely on AWS for external routing
Instance Metadata and Credentials
Every EC2 instance can access its metadata via a special HTTP endpoint:
- IP:
169.254.169.254 - Commonly used path:
/latest/meta-data/ - Accessible only from inside the instance
Example (Linux, via curl):
curl http://169.254.169.254/latest/meta-data/instance-idThis gives the instance ID, but more importantly, the metadata service is how:
- CloudInit retrieves user data
- AWS SDKs and CLI obtain temporary credentials from IAM roles attached to the instance
You rarely query the credentials manually; instead, you:
- Attach an IAM role with permissions (e.g., the `AmazonS3ReadOnlyAccess` managed policy)
- Use the AWS CLI or SDK on the Linux instance:

```shell
aws s3 ls s3://my-bucket
```

The CLI fetches temporary credentials from the metadata service automatically; no hard-coded keys are needed.
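Note that newer AMIs (Amazon Linux 2023 in particular) default to IMDSv2, which rejects plain GET requests: every call must present a short-lived session token. A minimal sketch of a token-aware lookup follows; the `imds_get` helper name is made up for illustration, but the endpoint and header names are the documented IMDSv2 ones:

```shell
#!/bin/bash
# Hedged sketch: query instance metadata with IMDSv2, which newer AMIs
# (e.g., AL2023) enforce by default. imds_get is a hypothetical helper.
imds_get() {
  local base="http://169.254.169.254/latest"
  local token
  # 1. Request a session token (valid here for up to 6 hours)
  token=$(curl -s -X PUT "$base/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
  # 2. Present the token on the actual metadata request
  curl -s -H "X-aws-ec2-metadata-token: $token" "$base/$1"
}

# Usage (only works from inside an EC2 instance):
# imds_get meta-data/instance-id
```

If a plain `curl` against the metadata service returns a 401, the instance is enforcing IMDSv2 and needs this token flow.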
Bootstrapping and Configuration: CloudInit & User Data
User Data
When launching a Linux instance, you can supply “user data”:
- A shell script or CloudInit configuration
- Executed/processed on the first boot by CloudInit or a similar agent
Example of simple user data (bash script):

```shell
#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
echo "Hello from AWS Linux" > /var/www/html/index.html
```

This is AWS-specific in that:
- User data is stored in the instance metadata and injected at launch
- It allows you to avoid manual configuration over SSH
- It’s often combined with autoscaling groups to bootstrap new Linux instances automatically
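A script like the one above can be supplied at launch from the CLI via `--user-data`. The sketch below wraps the call in a hypothetical helper; the instance type, key name, and security group ID are placeholders you would replace:

```shell
# Hedged sketch: launch an instance with a local bootstrap script as user data.
# All IDs and names here are illustrative placeholders.
launch_with_userdata() {
  aws ec2 run-instances \
    --image-id "$1" \
    --instance-type t3.micro \
    --key-name my-key \
    --security-group-ids sg-0123456789abcdef0 \
    --user-data file://bootstrap.sh
}

# Usage: launch_with_userdata ami-0abcdef1234567890
```

The `file://` prefix makes the CLI read and encode the script for you, so the same bootstrap file can be reused across launches and IaC templates.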
CloudInit Basics (AWS Perspective)
Most official Linux AMIs on AWS come with CloudInit configured.
CloudInit can:
- Create users
- Install packages
- Write files and templates
- Run commands at specific boot stages
YAML-style CloudInit example:

```yaml
#cloud-config
packages:
  - nginx
users:
  - name: deploy
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh_authorized_keys:
      - ssh-rsa AAAA...
runcmd:
  - systemctl enable --now nginx
```

Notes specific to AWS:
- CloudInit pulls instance metadata and user data from the EC2 metadata service
- You can check progress with `cloud-init status` and debug via logs (e.g., `/var/log/cloud-init.log`, `/var/log/cloud-init-output.log`)
- CloudInit behavior is AMI-specific; always check the distro docs for details
Storage: EBS and Instance Store from a Linux Perspective
EBS Volumes
EBS (Elastic Block Store) volumes show up as block devices in Linux, usually as:
- `/dev/nvme0n1`, `/dev/nvme1n1` (Nitro instances)
- Or legacy `/dev/xvda`, `/dev/xvdb` (older generations or specific AMIs)

AWS specifics:
- Root volume:
- Created from the AMI snapshot
- Often has an `ext4` or `xfs` filesystem pre-configured
- Additional volumes:
- Must be partitioned and formatted from inside Linux (if blank)
- Are attached/detached via AWS, but mounted/unmounted inside Linux
Example: formatting and mounting an extra EBS volume in Linux:
```shell
# Check devices
lsblk
# Suppose /dev/nvme1n1 is the new EBS volume
sudo mkfs.ext4 /dev/nvme1n1
sudo mkdir -p /data
sudo mount /dev/nvme1n1 /data
# To persist across reboots, add to /etc/fstab
# (UUIDs from blkid are more robust than NVMe device names, which can change):
echo '/dev/nvme1n1 /data ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab
```

Instance Store
Some instance types provide “instance store” (ephemeral storage):
- Very fast, often NVMe-backed
- Data is lost when the instance stops or is terminated
- Appears as block devices similar to EBS (e.g., `/dev/nvme2n1`)
Linux usage:
- Format and mount like any other disk
- Use for:
- Caches
- Temporary build artifacts
- High-speed scratch space
- Never for irreplaceable data
Snapshots and AMI Creation
From a Linux perspective:
- EBS snapshots are taken at the block level; the filesystem should be consistent
- For a clean snapshot of a busy filesystem:
- Use `fsfreeze` (if available) or stop services / flush data
- Then trigger the snapshot via the AWS console/CLI

Snapshot example from the CLI (outside the instance):

```shell
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Data volume backup"
```

You can also create a new AMI from a configured Linux instance:
- Clean up logs/temp files
- Shut down or quiesce critical services if consistency is important
- Use “Create Image” from that instance to get a reusable AMI (for autoscaling, cloning, etc.)
Managing and Automating Linux on AWS
AWS CLI and SDK on Linux Instances
Linux hosts in AWS often run the AWS CLI:
Installation example on Amazon Linux 2:

```shell
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
```

Paired with an instance role, this lets the instance manage AWS resources, e.g.:

```shell
# List S3 buckets
aws s3 ls
# Query the instance's own availability zone
aws ec2 describe-instances --instance-ids "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" \
  --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text
```

In DevOps workflows, Linux servers on AWS commonly:
- Run CI/CD agents
- Provision resources with Terraform, Ansible, or CloudFormation
- Rotate secrets, manage backups, push logs, etc.
AWS Systems Manager (SSM) and the SSM Agent
Systems Manager allows you to manage Linux instances without SSH:
- Requires SSM Agent installed on the instance
- Preinstalled on Amazon Linux and many official AMIs
- Requires IAM role with SSM permissions attached to the instance
Key Linux-focused features:
- Session Manager:
- Browser-based or CLI terminal directly into the instance
- No need to open port 22
- Uses the agent over HTTPS
- Run Command:
- Run shell commands on one or many Linux instances
- Example: patching, configuration changes, log collection
- Patch Manager:
- Orchestrates OS updates across fleets of Linux instances
Example of starting a shell via SSM (from your local machine):

```shell
aws ssm start-session --target i-0123456789abcdef0
```

This is especially useful for hardened environments with no direct SSH access.
Logging and Monitoring with CloudWatch
Linux logs and metrics can be sent to AWS CloudWatch:
- The CloudWatch Agent (or the legacy `awslogs` agent) runs on the instance
- Collects:
- System logs (`/var/log/messages`, `/var/log/syslog`)
- Application logs (e.g., `/var/log/nginx/access.log`)
- System metrics (CPU, memory, disk, etc.)
Configuration typically done via:
- JSON or TOML config file for the agent
- An SSM parameter storing agent configuration (pulled by instances)
This enables:
- Centralized log viewing in CloudWatch Logs
- Alerts (CloudWatch Alarms) based on metrics/regex patterns in logs
- Integration with other AWS services (Lambda, S3, etc.) for processing
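As a sketch, a minimal CloudWatch Agent configuration collecting one log file and a couple of host metrics might look like this (the log group name is a placeholder; consult the agent documentation for the full schema):

```json
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/nginx/access.log",
            "log_group_name": "web/nginx-access",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  },
  "metrics": {
    "metrics_collected": {
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  }
}
```

Storing this JSON in an SSM parameter lets every instance in a fleet pull the same agent configuration at boot.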
Linux Containers on AWS (High-Level AWS-Specific Points)
A full containers chapter exists elsewhere; here’s what is AWS-specific:
- ECS on EC2:
- Linux instances in an Auto Scaling Group run the ECS agent
- ECS schedules Docker containers onto those Linux hosts
- ECS on Fargate:
- AWS manages the underlying Linux; you don’t administer the host OS
- EKS (Kubernetes):
- Worker nodes are EC2 instances running Linux, or Fargate-backed
- Commonly use Amazon EKS-optimized AMIs (custom Linux tuned for Kubernetes)
- Bottlerocket:
- AWS’s container-optimized Linux OS
- Minimal, immutable, configured mainly via user data / APIs
- Used as a host OS for ECS/EKS nodes
Linux-specific concerns:
- You manage:
- Kernel versions and container runtime on EC2-based nodes
- Node security (SSH, patching) for EC2 worker nodes
- AWS manages:
- Underlying Linux OS on Fargate
- Control plane on EKS (managed service)
Security Considerations for Linux on AWS
Beyond generic Linux security, AWS-specific practices include:
- Prefer IAM roles over static access keys:
- For both EC2 and containers
- Avoid storing credentials in files or environment variables
- Restrict SSH:
- Limit security group to trusted IPs or use SSM Session Manager instead
- Consider disabling password auth entirely (`PasswordAuthentication no` in `sshd_config`)
- Regular patching:
- Use SSM Patch Manager or automation (cron + `yum update -y` / `apt upgrade -y`) plus change control
- Encrypt at rest:
- Use EBS volume encryption
- Transparent to Linux (Linux sees an unencrypted block device)
- Use least-privilege security groups:
- Only open the ports your Linux services actually need (80/443, 22, etc.)
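The SSH-hardening bullets above translate into a small `sshd_config` excerpt (restart `sshd` after editing, and verify from a second session before disconnecting):

```
# /etc/ssh/sshd_config (excerpt)
PasswordAuthentication no
PermitRootLogin no
```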
Putting It Together: Typical Linux on AWS Workflow
A common real-world workflow for Linux on AWS might look like:
- Choose a Linux AMI (e.g., Amazon Linux 2023).
- Define:
- Security group (SSH allowed only from office IP)
- IAM role (e.g., S3 access, CloudWatch logging, SSM)
- Create a key pair (or rely on SSM + no SSH).
- Launch EC2 instance with:
- User data script to install app dependencies
- Appropriate storage (root + extra EBS volume)
- Log in via SSH or Session Manager and:
- Verify services
- Check `/var/log/cloud-init.log` for bootstrap issues
- Configure the application
- Create an AMI from this configured Linux instance for:
- Autoscaling group
- Quick environment recreation
- Attach CloudWatch Agent / logs and SSM for ongoing management.
- Use IaC tools (Terraform, CloudFormation, Ansible) to codify the above so future Linux environments are reproducible.
This chapter’s goal is to make those AWS-specific Linux behaviors and tools understandable so you can confidently operate Linux systems in AWS environments.