Understanding How AWS Uses Linux
AWS is heavily built around Linux. Most compute and many managed services run some flavor of Linux under the hood. For you as a user, “Linux on AWS” mainly means:
- Running Linux virtual machines (EC2)
- Running container workloads on Linux (ECS, EKS, Fargate)
- Using AWS-provided Linux-based images (Amazon Linux, Bottlerocket, etc.)
- Managing and automating those systems using AWS tools
This chapter focuses on what is specific about running and managing Linux in AWS, not on generic Linux administration or generic cloud concepts (covered elsewhere in the course).
Common Linux Options on AWS
Amazon Linux
Amazon Linux is AWS’s own distro, optimized for AWS:
- Available as Amazon Linux 2 (traditional) and Amazon Linux 2023 (AL2023)
- Systemd-based, RPM/yum/dnf package management
- Tight integration with AWS tooling (CloudInit, SSM Agent, awscli often pre-installed)
- Performance and security defaults tuned for EC2
Typical use cases:
- General-purpose servers (web apps, APIs, CI runners)
- “Baseline” image for automation and autoscaling groups
- Environments where you want long-term support and AWS-optimized kernels
Key specifics:
- Package manager: Amazon Linux 2 uses `yum`; Amazon Linux 2023 uses `dnf`
- Repositories are maintained by AWS and are not exact RHEL/CentOS clones
- Security defaults (e.g., SELinux configuration) may differ from other enterprise distros
Other Popular Linux Distributions on AWS
AWS offers many official AMIs (Amazon Machine Images):
- Ubuntu (LTS and interim releases)
- RHEL
- SUSE Linux Enterprise
- Debian
- Rocky/Alma (RHEL-compatible)
- Specialized images (NVIDIA GPU-optimized, Deep Learning AMIs, etc.)
You select the distro at instance launch time by choosing the appropriate AMI.
Things that are AWS-specific:
- Most official images come with:
- CloudInit configured for AWS metadata
- The EC2 instance metadata service (IMDS) enabled
- The appropriate virtualization drivers (ENI, NVMe, etc.)
- AWS Marketplace AMIs may include:
- Pre-installed software (databases, security agents, monitoring)
- Licensing bundled into the EC2 hourly price
Launching Linux EC2 Instances (What’s Unique)
Although you can launch a Linux instance entirely from the AWS console GUI, several aspects are specific to Linux on AWS:
AMIs (Amazon Machine Images)
An AMI is effectively a bootable template containing:
- A root filesystem (with a Linux distro installed)
- Bootloader and init system configured for AWS
- Permissions that control who can launch it
When you select an AMI:
- Choose the right architecture: `x86_64` vs `arm64`
- Check root volume type and size (e.g., gp3 EBS, 8 GB default)
- Confirm it supports your instance family (GPU, Graviton, etc.)
For automation (CLI or IaC), AMI IDs look like `ami-0abcdef1234567890`.
Different regions have different AMI IDs even for the “same” image.
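Because AMI IDs differ per region, hard-coding them is fragile. One way to resolve a current Amazon Linux 2023 AMI ID is AWS's public SSM parameter; the sketch below wraps that lookup in a hypothetical helper function (the parameter path shown is the published one for the default AL2023 x86_64 kernel image):

```shell
# Hypothetical helper: resolve the current AL2023 AMI ID for the configured
# region via AWS's public SSM parameter, instead of hard-coding an AMI ID.
latest_al2023_ami() {
  aws ssm get-parameter \
    --name /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
    --query 'Parameter.Value' --output text
}

# Usage (requires credentials and a configured region):
# latest_al2023_ami
```

The same parameter always points at the latest image, so automation picks up new AMI releases without code changes.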
Instance Types and Linux
Linux supports the broadest selection of EC2 instance types. Two Linux-relevant aspects:
- Architecture:
`t3`, `m5`, `c5`, etc. are x86_64; `t4g`, `m7g`, `c7g` are AWS Graviton (arm64)
- Workload-optimized families: `c*` for CPU-heavy, `m*` for general purpose, `r*` for memory-heavy, `p*`/`g*` for GPU workloads
Linux kernels in the AMIs are built to support the corresponding virtualization environment (Nitro, ENA network driver, NVMe for EBS).
SSH Key Pairs
Linux instances are normally accessed with SSH keys:
- You create a key pair in AWS (or upload your public key)
- At launch, you associate that key pair with the instance
- CloudInit injects the public key into the default user's `~/.ssh/authorized_keys`
Typical default usernames by distro:
- Amazon Linux: `ec2-user`
- Ubuntu: `ubuntu`
- Debian: `admin` or `debian` (varies by image)
- RHEL: `ec2-user`; CentOS / Rocky images often use `centos` / `rocky` (check image documentation)
- SUSE: `ec2-user` or `root` (check image documentation)
SSH connection example:

```shell
ssh -i /path/to/key.pem ec2-user@ec2-203-0-113-10.compute-1.amazonaws.com
```

Because this is AWS-specific, you must:
- Ensure the security group allows inbound SSH (tcp/22) from your IP
- Use the correct public DNS or IP from the instance description
- Keep the `.pem` file secure and read-only (`chmod 400 key.pem`)
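To avoid retyping the key path and username, you can record the connection details in your local `~/.ssh/config` (the alias and hostname below are placeholders):

```
Host my-ec2
    HostName ec2-203-0-113-10.compute-1.amazonaws.com
    User ec2-user
    IdentityFile /path/to/key.pem
```

After that, `ssh my-ec2` connects without extra flags.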
EC2 Linux Networking and Security (AWS-Specific Angles)
Security Groups vs Local Firewalls
In AWS, security groups act as a virtual firewall at the instance’s network interface level.
Typical pattern for Linux on AWS:
- Use security groups for coarse-grained access control:
- SSH from your IP range
- HTTP/HTTPS from the internet or a load balancer
- Optionally use a local firewall (e.g., `ufw`, `firewalld`, `iptables`) for:
- Extra internal segmentation
- Outbound restrictions
- Host-based policies
Remember:
- Even if Linux is listening on `0.0.0.0:22`, the instance is unreachable unless the security group allows port 22.
- For troubleshooting, check both:
- Security groups / NACLs
- Local firewall rules
Elastic IPs and DNS
Linux instances often need stable IPs:
- Elastic IP (EIP): a static public IPv4 that you can attach/detach from instances
- Inside the instance:
- You don’t manually configure the EIP; AWS routes it
- The Linux network interface sees only the instance’s private IP
Preferred practice:
- Use Route 53 (or another DNS) to map DNS names to EIPs or load balancers
- Configure applications on Linux to bind to the private IP or `0.0.0.0` and rely on AWS for external routing
Instance Metadata and Credentials
Every EC2 instance can access its metadata via a special HTTP endpoint:
- IP:
169.254.169.254 - Commonly used path:
/latest/meta-data/ - Accessible only from inside the instance
Example (Linux, via curl):
curl http://169.254.169.254/latest/meta-data/instance-idThis gives the instance ID, but more importantly, the metadata service is how:
- CloudInit retrieves user data
- AWS SDKs and CLI obtain temporary credentials from IAM roles attached to the instance
You rarely query the credentials manually; instead, you:
- Attach an IAM role with permissions (e.g., the `AmazonS3ReadOnlyAccess` managed policy)
- Use the AWS CLI or SDK on the Linux instance:

```shell
aws s3 ls s3://my-bucket
```

The CLI fetches temporary credentials from the metadata service automatically; no hard-coded keys are needed.
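Note that newer AMIs (Amazon Linux 2023 in particular) default to IMDSv2, which rejects plain GET requests: every call must present a short-lived session token. A minimal sketch of a token-aware lookup follows; the `imds_get` helper name is made up for illustration, but the endpoint and header names are the documented IMDSv2 ones:

```shell
#!/bin/bash
# Hedged sketch: query instance metadata with IMDSv2, which newer AMIs
# (e.g., AL2023) enforce by default. imds_get is a hypothetical helper.
imds_get() {
  local base="http://169.254.169.254/latest"
  local token
  # 1. Request a session token (valid here for up to 6 hours)
  token=$(curl -s -X PUT "$base/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
  # 2. Present the token on the actual metadata request
  curl -s -H "X-aws-ec2-metadata-token: $token" "$base/$1"
}

# Usage (only works from inside an EC2 instance):
# imds_get meta-data/instance-id
```

If a plain `curl` against the metadata service returns a 401, the instance is enforcing IMDSv2 and needs this token flow.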
Bootstrapping and Configuration: CloudInit & User Data
User Data
When launching a Linux instance, you can supply “user data”:
- A shell script or CloudInit configuration
- Executed/processed on the first boot by CloudInit or a similar agent
Example of simple user data (bash script):

```shell
#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
echo "Hello from AWS Linux" > /var/www/html/index.html
```

This is AWS-specific in that:
- User data is stored in the instance metadata and injected at launch
- It allows you to avoid manual configuration over SSH
- It’s often combined with autoscaling groups to bootstrap new Linux instances automatically
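A script like the one above can be supplied at launch from the CLI via `--user-data`. The sketch below wraps the call in a hypothetical helper; the instance type, key name, and security group ID are placeholders you would replace:

```shell
# Hedged sketch: launch an instance with a local bootstrap script as user data.
# All IDs and names here are illustrative placeholders.
launch_with_userdata() {
  aws ec2 run-instances \
    --image-id "$1" \
    --instance-type t3.micro \
    --key-name my-key \
    --security-group-ids sg-0123456789abcdef0 \
    --user-data file://bootstrap.sh
}

# Usage: launch_with_userdata ami-0abcdef1234567890
```

The `file://` prefix makes the CLI read and encode the script for you, so the same bootstrap file can be reused across launches and IaC templates.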
CloudInit Basics (AWS Perspective)
Most official Linux AMIs on AWS come with CloudInit configured.
CloudInit can:
- Create users
- Install packages
- Write files and templates
- Run commands at specific boot stages
YAML-style CloudInit example:

```yaml
#cloud-config
packages:
  - nginx
users:
  - name: deploy
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh_authorized_keys:
      - ssh-rsa AAAA...
runcmd:
  - systemctl enable --now nginx
```

Notes specific to AWS:
- CloudInit pulls instance metadata and user data from the EC2 metadata service
- You can check progress with `cloud-init status` and debug via logs (e.g., `/var/log/cloud-init.log`, `/var/log/cloud-init-output.log`)
- CloudInit behavior is AMI-specific; always check the distro docs for details
Storage: EBS and Instance Store from a Linux Perspective
EBS Volumes
EBS (Elastic Block Store) volumes show up as block devices in Linux, usually as:
- `/dev/nvme0n1`, `/dev/nvme1n1` (Nitro instances)
- Or legacy `/dev/xvda`, `/dev/xvdb` (older generations or specific AMIs)

AWS specifics:
- Root volume:
- Created from the AMI snapshot
- Often has an `ext4` or `xfs` filesystem pre-configured
- Additional volumes:
- Must be partitioned and formatted from inside Linux (if blank)
- Are attached/detached via AWS, but mounted/unmounted inside Linux
Example: formatting and mounting an extra EBS volume in Linux:
```shell
# Check devices
lsblk
# Suppose /dev/nvme1n1 is the new EBS volume
sudo mkfs.ext4 /dev/nvme1n1
sudo mkdir -p /data
sudo mount /dev/nvme1n1 /data
# To persist across reboots, add to /etc/fstab
# (UUIDs from blkid are more robust than NVMe device names, which can change):
echo '/dev/nvme1n1 /data ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab
```

Instance Store
Some instance types provide “instance store” (ephemeral storage):
- Very fast, often NVMe-backed
- Data is lost when the instance stops or is terminated
- Appears as block devices similar to EBS (e.g., `/dev/nvme2n1`)
Linux usage:
- Format and mount like any other disk
- Use for:
- Caches
- Temporary build artifacts
- High-speed scratch space
- Never for irreplaceable data
Snapshots and AMI Creation
From a Linux perspective:
- EBS snapshots are taken at the block level; the filesystem should be consistent
- For a clean snapshot of a busy filesystem:
- Use `fsfreeze` (if available) or stop services / flush data
- Then trigger the snapshot via the AWS console/CLI

Snapshot example from the CLI (outside the instance):

```shell
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Data volume backup"
```

You can also create a new AMI from a configured Linux instance:
- Clean up logs/temp files
- Shut down or quiesce critical services if consistency is important
- Use “Create Image” from that instance to get a reusable AMI (for autoscaling, cloning, etc.)
Managing and Automating Linux on AWS
AWS CLI and SDK on Linux Instances
Linux hosts in AWS often run the AWS CLI:
Installation example on Amazon Linux 2:

```shell
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
```

Paired with an instance role, this lets the instance manage AWS resources, e.g.:

```shell
# List S3 buckets
aws s3 ls
# Query the instance's own availability zone
aws ec2 describe-instances --instance-ids "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" \
  --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text
```

In DevOps workflows, Linux servers on AWS commonly:
- Run CI/CD agents
- Provision resources with Terraform, Ansible, or CloudFormation
- Rotate secrets, manage backups, push logs, etc.
AWS Systems Manager (SSM) and the SSM Agent
Systems Manager allows you to manage Linux instances without SSH:
- Requires SSM Agent installed on the instance
- Preinstalled on Amazon Linux and many official AMIs
- Requires IAM role with SSM permissions attached to the instance
Key Linux-focused features:
- Session Manager:
- Browser-based or CLI terminal directly into the instance
- No need to open port 22
- Uses the agent over HTTPS
- Run Command:
- Run shell commands on one or many Linux instances
- Example: patching, configuration changes, log collection
- Patch Manager:
- Orchestrates OS updates across fleets of Linux instances
Example of starting a shell via SSM (from your local machine):

```shell
aws ssm start-session --target i-0123456789abcdef0
```

This is especially useful for hardened environments with no direct SSH access.
Logging and Monitoring with CloudWatch
Linux logs and metrics can be sent to AWS CloudWatch:
- The CloudWatch Agent (or the legacy `awslogs` agent) runs on the instance
- Collects:
- System logs (`/var/log/messages`, `/var/log/syslog`)
- Application logs (e.g., `/var/log/nginx/access.log`)
- System metrics (CPU, memory, disk, etc.)
Configuration typically done via:
- JSON or TOML config file for the agent
- An SSM parameter storing agent configuration (pulled by instances)
This enables:
- Centralized log viewing in CloudWatch Logs
- Alerts (CloudWatch Alarms) based on metrics/regex patterns in logs
- Integration with other AWS services (Lambda, S3, etc.) for processing
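As a sketch, a minimal CloudWatch Agent configuration collecting one log file and a couple of host metrics might look like this (the log group name is a placeholder; consult the agent documentation for the full schema):

```json
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/nginx/access.log",
            "log_group_name": "web/nginx-access",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  },
  "metrics": {
    "metrics_collected": {
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  }
}
```

Storing this JSON in an SSM parameter lets every instance in a fleet pull the same agent configuration at boot.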
Linux Containers on AWS (High-Level AWS-Specific Points)
A full containers chapter exists elsewhere; here’s what is AWS-specific:
- ECS on EC2:
- Linux instances in an Auto Scaling Group run the ECS agent
- ECS schedules Docker containers onto those Linux hosts
- ECS on Fargate:
- AWS manages the underlying Linux; you don’t administer the host OS
- EKS (Kubernetes):
- Worker nodes are EC2 instances running Linux, or Fargate-backed
- Commonly use Amazon EKS-optimized AMIs (custom Linux tuned for Kubernetes)
- Bottlerocket:
- AWS’s container-optimized Linux OS
- Minimal, immutable, configured mainly via user data / APIs
- Used as a host OS for ECS/EKS nodes
Linux-specific concerns:
- You manage:
- Kernel versions and container runtime on EC2-based nodes
- Node security (SSH, patching) for EC2 worker nodes
- AWS manages:
- Underlying Linux OS on Fargate
- Control plane on EKS (managed service)
Security Considerations for Linux on AWS
Beyond generic Linux security, AWS-specific practices include:
- Prefer IAM roles over static access keys:
- For both EC2 and containers
- Avoid storing credentials in files or environment variables
- Restrict SSH:
- Limit security group to trusted IPs or use SSM Session Manager instead
- Consider disabling password auth entirely (`PasswordAuthentication no` in `sshd_config`)
- Regular patching:
- Use SSM Patch Manager or automation (cron + `yum update -y` / `apt upgrade -y`) plus change control
- Encrypt at rest:
- Use EBS volume encryption
- Transparent to Linux (Linux sees an unencrypted block device)
- Use least-privilege security groups:
- Only open the ports your Linux services actually need (80/443, 22, etc.)
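The SSH-hardening bullets above translate into a small `sshd_config` excerpt (restart `sshd` after editing, and verify from a second session before disconnecting):

```
# /etc/ssh/sshd_config (excerpt)
PasswordAuthentication no
PermitRootLogin no
```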
Putting It Together: Typical Linux on AWS Workflow
A common real-world workflow for Linux on AWS might look like:
- Choose a Linux AMI (e.g., Amazon Linux 2023).
- Define:
- Security group (SSH allowed only from office IP)
- IAM role (e.g., S3 access, CloudWatch logging, SSM)
- Create a key pair (or rely on SSM + no SSH).
- Launch EC2 instance with:
- User data script to install app dependencies
- Appropriate storage (root + extra EBS volume)
- Log in via SSH or Session Manager and:
- Verify services
- Check `/var/log/cloud-init.log` for bootstrap issues
- Configure the application
- Create an AMI from this configured Linux instance for:
- Autoscaling group
- Quick environment recreation
- Attach CloudWatch Agent / logs and SSM for ongoing management.
- Use IaC tools (Terraform, CloudFormation, Ansible) to codify the above so future Linux environments are reproducible.
This chapter’s goal is to make those AWS-specific Linux behaviors and tools understandable so you can confidently operate Linux systems in AWS environments.