Overview of Docker
Docker is a platform that lets you package an application and its dependencies into a portable unit called a container. A container runs on top of a Linux kernel using features such as namespaces and cgroups, but you do not have to manage those details directly when working with Docker.
The key idea is that you build or download images, then run containers from those images. Images are read only templates. Containers are running, writable instances of those templates. Docker provides a client command line tool, a server component called the Docker daemon, and a registry where images are stored and shared.
Docker is not a full virtual machine system. Docker containers share the host kernel. This makes them lightweight and very fast to start compared to traditional virtual machines.
Important: A Docker image is the template. A Docker container is a running instance of that template.
Docker Architecture on Linux
On a Linux system, Docker uses a client server model. The docker command line tool is the client. It talks to a long running background process called the Docker daemon, usually dockerd. The daemon interacts with the Linux kernel, manages images and containers, and communicates with registries such as Docker Hub.
By default, the Docker client connects to the daemon through a Unix socket located at /var/run/docker.sock. Administrative operations usually require root privileges. On most distributions, a common convenience is to add your user to the docker group so that you can run docker without sudo, although this grants privileges comparable to root, as discussed later in this chapter.
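As a sketch, you can verify that the client reaches the daemon over that socket before doing anything else. The user name alice below is a placeholder, and the block is guarded so it does nothing harmful on a machine without Docker:

```shell
# Check client-to-daemon connectivity over /var/run/docker.sock.
if command -v docker >/dev/null 2>&1; then
    docker version || true          # shows client info; server section needs the daemon
    # As root, grant a user access to the daemon socket (placeholder user "alice"):
    # usermod -aG docker alice      # takes effect at the user's next login
else
    echo "docker client not installed"
fi
socket_demo_done=1
```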
The daemon itself uses storage drivers to manage container filesystems and network drivers to provide virtual networks for containers. Details of those drivers are part of broader container internals, so in this chapter you only need to recognize that Docker relies on Linux features but abstracts them away.
Images and Layers
Every Docker image consists of a stack of read only layers. Each layer records filesystem changes relative to the previous one. When you build a new image from a base image, Docker creates additional layers on top of the base layers.
This layered design has two important effects. First, images can share common base layers, which saves disk space and speeds up downloads. Second, Docker can cache layers during builds. If a layer has not changed, Docker does not rebuild it.
When a container starts, Docker creates a thin writable layer on top of the image layers. Any file you modify or create inside the running container lives in that top writable layer unless you explicitly mount volumes.
You can see images on your system with docker images and remove them with docker rmi. When you pull an image using docker pull, Docker downloads all of its layers from a registry.
Rule: Image layers are immutable. Only the top container layer is writable, unless you use volumes or bind mounts.
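A quick way to see the layer structure for yourself is docker history, which prints one row per layer. This sketch is guarded so it only runs against a working Docker daemon:

```shell
# Pull a small image and inspect its layers.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker pull alpine >/dev/null
    docker history alpine           # one row per layer, newest first
    docker images alpine            # repository, tag, image ID, created, size
fi
layers_demo_done=1
```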
Basic Image Usage
The most common way to get images is from Docker Hub, the default public registry. When you run a container from an image that is not present locally, Docker automatically pulls it.
For example, the command
docker pull alpine
downloads the alpine image. Alpine is a very small Linux distribution often used for minimal containers. Afterwards, docker images will show the image with its repository name, tag, image ID, creation time, and size.
Images are identified by a repository name and an optional tag in the form name:tag. If you omit the tag, Docker uses the latest tag by default. For example, ubuntu:22.04 and ubuntu:20.04 are different images, while ubuntu is equivalent to ubuntu:latest on Docker Hub.
You can remove unused images with docker rmi image_name:tag. If a container still uses an image, Docker will not remove it until the container is deleted or you force removal.
Running Containers
The docker run command is the starting point for running containers. It creates a new container from an image, pulls the image first if it is not present locally, and starts a process inside that container.
For example, you can run
docker run alpine echo "Hello from a container"
This command pulls the alpine image if needed, starts a container, executes echo "Hello from a container" in it, prints the output, and exits. The container then stops. Docker keeps a record of this stopped container until you remove it.
Every image defines a default command, such as a shell or a server process. To override it, append your own command to the end of the docker run line, as shown above.
The container lifecycle is simple: a container is created, runs while its main process is active, and stops when that process exits. You can view all containers, including stopped ones, with docker ps -a, and only running ones with docker ps. You remove containers with docker rm.
If you want the container to be deleted automatically when it stops, use the --rm flag with docker run. This is convenient for short lived test containers.
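The difference between --rm and a named, kept container can be sketched like this; the name keepme is invented for the example, and the block is a no-op without a Docker daemon:

```shell
# Contrast an auto-removed container with one that lingers after exit.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker run --rm alpine echo "gone when done"     # no stopped container left behind
    docker run --name keepme alpine echo "kept"      # remains visible in docker ps -a
    docker ps -a --filter name=keepme                # shows the Exited container
    docker rm keepme                                 # manual cleanup is required
fi
rm_demo_done=1
```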
Interactive Containers and TTY
To explore a container interactively, you typically start a shell inside it. On most images, the default shell is sh or bash. You also need a pseudo terminal and input attached.
You can get an interactive shell inside an image with
docker run -it ubuntu bash
Here -i keeps standard input open and -t allocates a terminal. You will see a shell prompt from inside the container and can run commands as if you were in a small Linux environment isolated from your host.
When you type exit, the shell process ends, which stops the container. Unless you used --rm, the container remains in the stopped state and you can see it with docker ps -a. You can also restart it using docker start -i container_name_or_id to attach again.
Running interactive containers is useful for inspection, debugging, and learning. For production workloads you usually run containers in the background as services.
Detached Mode and Background Containers
When you start a service like a web server, you do not want to keep the terminal attached. Docker can run containers in detached mode using the -d flag. For example:
docker run -d nginx
This command starts an nginx web server container in the background. Docker prints the container ID and returns you to your shell prompt. The container continues running until you stop it.
You can stop a running container with docker stop container_name_or_id. This sends SIGTERM to the main process inside the container. If the process does not exit within a timeout (10 seconds by default), Docker sends SIGKILL to terminate it.
If you want to see what a detached container prints to its standard output and error streams, use:
docker logs container_name_or_id
You can follow output in real time with docker logs -f. This is especially important when containers do not have an attached terminal in detached mode.
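The whole detached workflow, start, inspect logs, stop, fits together as in this sketch; the container name bg_demo is chosen just for the example:

```shell
# Run a detached web server, read its output, then shut it down.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker run -d --name bg_demo nginx
    sleep 2
    docker logs bg_demo             # stdout/stderr accumulated so far
    docker stop bg_demo             # SIGTERM, then SIGKILL after the timeout
    docker rm bg_demo
fi
detached_demo_done=1
```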
Container Names and IDs
Every container has a long hexadecimal ID and can also have a human friendly name. If you do not specify a name, Docker will assign a random one. You can set your own name with the --name flag when you run a container, like this:
docker run --name my_nginx -d nginx
Using names makes docker stop, docker logs, and docker exec easier to use. Both the container ID and the name can identify containers in Docker commands. ID prefixes are usually enough as long as they are unique.
Docker also supports renaming containers with docker rename. However, since containers are often treated as disposable, many setups do not depend on stable names and instead rely on orchestration tools or configuration files.
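To illustrate that names and ID prefixes are interchangeable, the following sketch addresses the same container both ways; named_demo is a name invented for the example:

```shell
# Address one container by its name and by a short ID prefix.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    full_id=$(docker run -d --name named_demo nginx)
    short_id=$(printf '%s' "$full_id" | cut -c1-8)
    docker logs named_demo >/dev/null    # referenced by name
    docker stop "$short_id"              # referenced by unique ID prefix
    docker rm named_demo
fi
names_demo_done=1
```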
Ports and Networking Basics
Containers by default are isolated from the outside network for inbound connections, but they can usually make outbound connections through the host. If a container runs a service that listens on a TCP port, you must publish that port if you want to access it from the host or from outside.
Publishing a port uses the -p flag with docker run. The general form is -p host_port:container_port. For example, to expose nginx running on port 80 inside the container to port 8080 on the host, you can run:
docker run -d -p 8080:80 nginx
Now, visiting http://localhost:8080 on the host should connect to the nginx server inside the container. Docker performs network address translation between the host port and the container port.
You can publish multiple ports by repeating the -p option. If you specify only the container port, as in -p 80, Docker chooses a random free port on the host; -P publishes every port the image exposes to random host ports. To see which ports are published for a container, use docker ps and inspect the PORTS column, or use docker inspect.
For communication between containers, Docker provides virtual networks that containers can join. Detailed custom networking is a separate topic, so in this chapter it is enough to understand that -p maps ports from the host to the container.
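You can confirm a published port from the host with curl, as in this sketch; it assumes host port 8080 is free and skips itself if docker or curl is unavailable:

```shell
# Publish container port 80 on host port 8080 and fetch a page over it.
if command -v docker >/dev/null 2>&1 && command -v curl >/dev/null 2>&1 \
        && docker info >/dev/null 2>&1; then
    docker run -d --name web_demo -p 8080:80 nginx
    sleep 2
    curl -s http://localhost:8080 | head -n 3    # start of the nginx welcome page
    docker stop web_demo && docker rm web_demo
fi
ports_demo_done=1
```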
Volumes and Persistent Data
Container filesystems are ephemeral. When you delete a container, its writable layer is lost with any changes made to it. To keep data persistent and to share data between host and container or between multiple containers, Docker uses volumes and bind mounts.
A volume is managed by Docker and stored in a location under Docker’s control, usually under /var/lib/docker/volumes. You create and attach a volume using -v or --mount with docker run. For example:
docker run -d -e MYSQL_ROOT_PASSWORD=secret -v mydata:/var/lib/mysql mysql
Here mydata is a named volume (the MYSQL_ROOT_PASSWORD variable is required for the mysql image to start). Docker creates the volume if it does not exist and mounts it at /var/lib/mysql inside the container. Deleting the container does not delete the volume by default, so data persists.
A bind mount maps an existing directory or file from the host filesystem into the container. The syntax is similar, but you specify a host path. For example:
docker run -d -v /home/user/site:/usr/share/nginx/html nginx
This maps the host directory /home/user/site into the container at /usr/share/nginx/html. Changes in that directory on the host appear inside the container and vice versa.
Rule: Do not store important data only in the container writable layer. Use volumes or bind mounts for persistent or shared data.
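The persistence rule can be demonstrated directly: data written through a named volume survives the container that wrote it. The volume name notes is invented for this sketch:

```shell
# Write through a named volume in one container, read it back in another.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker run --rm -v notes:/data alpine sh -c 'echo hello > /data/msg'
    docker run --rm -v notes:/data alpine cat /data/msg   # prints: hello
    docker volume rm notes                                # explicit cleanup
fi
volume_demo_done=1
```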
Working with Logs and Processes in Containers
Logs from processes in containers are essential for debugging. Docker automatically captures everything a process writes to standard output and standard error and stores it using a log driver.
The basic way to see these logs is through docker logs. For example:
docker logs my_nginx
prints the accumulated logs for the container. Using docker logs -f my_nginx follows the log output similar to tail -f. This is often combined with detached containers to monitor behavior without attaching a shell.
If you want to run additional commands inside a running container, use docker exec. For instance,
docker exec -it my_nginx sh
starts an interactive shell inside the my_nginx container. This is different from docker run because it enters an existing container instead of creating a new one. This is useful for inspecting configuration or checking the state of long running containers while they are active.
You can also run non interactive commands with docker exec, such as:
docker exec my_nginx ps aux
to list processes inside the container. The process list shows only processes running in that container since containers use process isolation.
Dockerfiles and Building Images
To create your own Docker images, you define a set of instructions in a file called a Dockerfile. A Dockerfile starts from a base image and describes steps such as copying files, installing packages, and configuring environment variables.
A minimal Dockerfile might contain:
FROM alpine:latest
RUN apk add --no-cache curl
CMD ["curl", "--version"]
Here, FROM sets the base image. RUN executes a command while building the image and saves the result as a new layer. CMD defines the default command that runs when a container is started from the image.
You build an image using:
docker build -t my_alpine_curl .
where . is the build context, usually the directory containing the Dockerfile. The -t option assigns a repository name and tag, in this case my_alpine_curl:latest.
After the build succeeds, you can run your new image with:
docker run my_alpine_curl
which executes curl --version in a container. Image building becomes much more powerful when you add configuration files and application code to the build context, but those details belong in deeper container and build topics.
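The full build workflow, from writing the Dockerfile above to running the image, can be sketched as a script. It uses a temporary directory as the build context and skips itself without a Docker daemon:

```shell
# Write the Dockerfile from the text, build it, and run the result.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    ctx=$(mktemp -d)                          # empty build context
    cat > "$ctx/Dockerfile" <<'EOF'
FROM alpine:latest
RUN apk add --no-cache curl
CMD ["curl", "--version"]
EOF
    docker build -t my_alpine_curl "$ctx"     # each instruction adds a layer
    docker run --rm my_alpine_curl            # prints the curl version string
    rm -rf "$ctx"
fi
build_demo_done=1
```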
Image Registries and Pushing Images
To share images with others or deploy them to servers, you need an image registry. Docker Hub is the best known public registry, but many organizations use private registries.
After you log in with docker login registry_url if needed, you can tag your image with the registry address and push it. Tagging uses docker tag with the form source target. For example:
docker tag my_alpine_curl myuser/my_alpine_curl:1.0
Then you can push it to Docker Hub with:
docker push myuser/my_alpine_curl:1.0
Other machines can then run docker pull myuser/my_alpine_curl:1.0 to download and use that image.
Registries store the same image layer structure that exists locally. Pushing and pulling reuse layers that already exist on either side, which minimizes network transfer.
Cleaning Up Containers and Images
Docker hosts can accumulate unused containers, images, volumes, and networks over time. Regular cleanup prevents wasted disk space. At a basic level, you can remove stopped containers with docker rm and unneeded images with docker rmi.
Docker also provides pruning commands. docker system prune deletes stopped containers, dangling images, and unused networks, but it does not touch volumes by default. Adding the --volumes flag tells Docker to remove unused volumes too. These commands are powerful and can remove resources you still need, so use them with care.
You can also target specific resource types. docker container prune removes all stopped containers. docker image prune removes dangling images, that is, images without a tag that no container uses. Volumes can be listed with docker volume ls and removed with docker volume rm when you are sure they are no longer required.
Rule: Always verify which resources will be deleted before running prune commands on a production or important system.
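A safe pre-prune routine is to look before deleting. This sketch only inspects; the destructive command is deliberately left commented out:

```shell
# Survey disk usage and prune candidates without deleting anything.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    docker system df                        # space used by images, containers, volumes
    docker ps -a --filter status=exited     # stopped containers a prune would remove
    # docker system prune                   # run only after reviewing the lists above
fi
cleanup_demo_done=1
```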
Security and Root Privileges
On most Linux systems, Docker containers run processes as root inside the container by default, and the Docker daemon itself usually runs as root on the host. This combination has important security implications.
Any user who can run docker commands can effectively gain root level access on the host if the system is misconfigured. For this reason, membership in the docker group must be treated similarly to sudo access.
Within a container, you can improve security by running processes as a non root user, by dropping unnecessary capabilities, and by using read only filesystems and restricted volumes. These practices are part of broader container hardening and are not fully covered here, but they should always be considered when deploying containers in sensitive environments.
Recognizing that Docker is not a sandbox with strong isolation by default is important. It is a packaging and deployment tool that depends on careful configuration and additional security features for high assurance isolation.
Summary
Docker provides a convenient way to package applications into images and run them as containers on Linux. Images are immutable and layered, containers are running instances with an additional writable layer, and the Docker CLI communicates with the daemon to manage them. You now know how to pull images, run containers in interactive and detached modes, publish ports, use volumes for persistence, inspect logs, execute commands inside running containers, build custom images using Dockerfiles, push and pull images from registries, and perform basic cleanup. These fundamentals prepare you to explore more advanced topics such as orchestration and deep container security.