5 Working with Docker Images

Why Images Matter

Docker images are the foundation of everything you run with Docker. A running container is created from an image, similar to how a process on your operating system is created from an executable file. Images define what software is inside your container, which operating system files it uses, which dependencies are available, and what default command should run.

In this chapter you will look at images from a practical perspective. You will learn how they are stored and reused, why they are so fast to download and start, and how they relate to containers. You will not yet build your own images from Dockerfiles (that comes in a later chapter), but you will gain enough understanding to work confidently with images that already exist.

An image is a read-only template. Every container you run from that image gets its own writable layer on top, but the underlying image content never changes.

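This read-only nature is central to how Docker achieves consistency. If you and a colleague pull the same image from a registry, you both get the same files and the same default behavior, regardless of what else is installed on your host machines. Tags can be repointed over time, and some images ship separate variants per CPU architecture, so this guarantee is strictest when you both pull the same digest for the same platform.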

Images as Templates and Blueprints

You can think of an image as a blueprint for containers. The image contains everything needed to run an application. This usually includes a minimal user space, system utilities required by the app, language runtimes such as Python or Node.js, application binaries, and configuration defaults.

When you execute a command such as docker run someimage, Docker does not modify someimage. Instead it creates a new container that points to the image and adds a thin writable layer where changes inside the container are recorded. This separation means that you can create many containers from a single image, each with different data or state, all based on the same underlying definition.
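
The separation between a read-only image and per-container writable layers can be modeled in a few lines of Python. This is only a conceptual sketch, not how Docker implements its storage drivers: the file paths, the `create_container` helper, and the container names are all hypothetical, and a dictionary stands in for a filesystem.

```python
from collections import ChainMap
from types import MappingProxyType

# Hypothetical image content: path -> file content, wrapped so it cannot be modified.
image_files = MappingProxyType({
    "/etc/motd": "hello from the image",
    "/app/main.py": "print('v1')",
})

def create_container(image):
    """Return a container view: an empty writable layer stacked over the image."""
    writable_layer = {}
    # ChainMap reads fall through to the image; writes land only in the first mapping.
    return ChainMap(writable_layer, image)

web1 = create_container(image_files)
web2 = create_container(image_files)

web1["/etc/motd"] = "changed inside web1"   # recorded in web1's writable layer only
web2["/tmp/cache"] = "scratch data"         # recorded in web2's writable layer only

print(web1["/etc/motd"])         # changed inside web1
print(web2["/etc/motd"])         # hello from the image
print(image_files["/etc/motd"])  # hello from the image
```

Both containers read the same underlying content, and neither can alter it. Discarding a container's writable layer leaves the image exactly as it was.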

Because images are immutable, you rarely talk about "updating" an image in place. Instead, you create or pull a new version of that image and start new containers from it. Old containers still reference the old image, while new containers reference the new one. This model makes rollbacks and reproducible setups much easier than traditional manual server configuration.

How Docker Stores Images Locally

When you pull or build an image, Docker stores it in a local image cache. This is managed by the Docker Engine, not by your operating system's normal package manager. Each image has an identifier, a content digest, and usually one or more human-readable names, each made up of a repository name and a tag.

The local image cache allows Docker to reuse images and their parts without repeatedly downloading or rebuilding them. If you already have an image on your machine, running a container from it is almost instant, because Docker only needs to create the container metadata and the writable layer, not download anything.

Images from different registries, for example Docker Hub or a private registry, occupy the same local cache but are namespaced by their full names. If you run ubuntu:22.04 and later myregistry.example.com/custom/ubuntu:22.04, Docker treats these as separate images even though the tags look similar, because the registry domain is part of the image's full name.

Managing this cache, listing what is stored, and cleaning up unused images is an important part of keeping your system healthy, especially on development machines where you experiment with many different images.

The Layered Structure of Images

An important property of Docker images is that they are built from layers. Each layer represents a snapshot of the filesystem after a particular step in the image creation process. Conceptually you can imagine a stack of read-only layers, with the container's writable layer on top when it runs.

At the lowest level, these layers are stored using content addressing. The actual filenames used by Docker are often based on cryptographic hashes of their content. This allows Docker to detect when two images share identical layers and then store only one copy physically, while referencing it from both images.

Because of this structure, images that share a common base can be very efficient. If you have several different versions of an application, all built from the same base operating system image, your disk only stores one copy of those shared base layers.

Layers are immutable and shared. If two images use the same layer, that layer is stored only once and is never modified. New changes always create new layers on top.
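
To make content addressing concrete, here is a toy layer store in Python. It is a simplified sketch under stated assumptions: the `LayerStore` class and the layer contents are invented for illustration, and real layers are archives of filesystem changes rather than short byte strings. The point is only that identical content yields an identical hash, so it is stored once.

```python
import hashlib

class LayerStore:
    """Toy content-addressed store: each layer is kept under the hash of its bytes."""

    def __init__(self):
        self._blobs = {}  # digest -> layer bytes

    def add(self, layer_bytes):
        digest = "sha256:" + hashlib.sha256(layer_bytes).hexdigest()
        # setdefault keeps the existing copy if this digest is already stored.
        self._blobs.setdefault(digest, layer_bytes)
        return digest

    def stored_layers(self):
        return len(self._blobs)

store = LayerStore()
base = b"base operating system files"

# Two hypothetical application images built on the same base layer.
app_v1 = [store.add(base), store.add(b"application v1")]
app_v2 = [store.add(base), store.add(b"application v2")]

print(app_v1[0] == app_v2[0])   # True: both images reference the same base layer
print(store.stored_layers())    # 3: one shared base plus two application layers
```

Each image is just an ordered list of digests, so sharing a layer costs nothing beyond one more reference to it.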

This layering and immutability model is what allows Docker to cache build steps efficiently and to pull only the parts of an image that you do not already have. Later chapters about image layers and caching will explore how to exploit this for faster builds and smaller images.

Relationship Between Images and Containers

The relationship between images and containers is similar to that between classes and objects in programming, or between templates and instances. The image contains the definition and content. The container is a running instance that uses that definition plus its own state.

When you create a container from an image, Docker uses several pieces of information from the image. These include default environment variables, the default command, exposure hints about ports, and the initial filesystem contents. You can override many of these at runtime with docker run options, but the image still provides the baseline.
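
The way image defaults and run-time overrides combine can be sketched as a simple merge. The `ImageConfig` class, the `effective_config` function, and the variable names below are hypothetical; they illustrate the "image provides the baseline, overrides win" rule rather than Docker's actual configuration format.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ImageConfig:
    """Defaults baked into a hypothetical image."""
    env: dict = field(default_factory=dict)
    cmd: tuple = ()

def effective_config(image, env_overrides=None, cmd_override=None):
    """Combine image defaults with run-time overrides; overrides win."""
    env = {**image.env, **(env_overrides or {})}
    cmd = tuple(cmd_override) if cmd_override else image.cmd
    return env, cmd

image = ImageConfig(env={"LANG": "C.UTF-8", "APP_MODE": "production"},
                    cmd=("python", "app.py"))

env, cmd = effective_config(image, env_overrides={"APP_MODE": "debug"})
print(env)  # {'LANG': 'C.UTF-8', 'APP_MODE': 'debug'}
print(cmd)  # ('python', 'app.py')
```

The image object itself is never changed by the merge, so another container started without overrides still sees the original defaults.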

Because each container adds only a thin writable layer and some metadata on top of the image, starting and stopping containers is relatively cheap. This is one reason containers start much faster than virtual machines, which must boot a complete guest operating system every time.

Removing a container does not remove its image. The image remains in the local cache and you can start new containers from it at any time, unless you explicitly remove the image itself. This means you can experiment with containers freely without worrying about losing the underlying image.

Image Names, Tags, and Identifiers

Every image can be identified in several ways. The most visible is the human-readable name with an optional tag, for example nginx:latest or python:3.12. The name indicates the repository, and the tag usually indicates some kind of version or variant.

Behind the scenes, each image also has a unique identifier, often abbreviated as an image ID. When you list images using Docker commands, you will see this ID. It is derived from the image's content and can be used directly to refer to the image in commands, although names and tags are more convenient for everyday use.

If you omit a tag in an image name, Docker automatically assumes the tag latest. This is just a convention and not a guarantee that you have the newest possible image. It is simply the name of a tag that maintainers often choose to point at a default version.

imagename and imagename:latest refer to the same thing, because Docker uses :latest as the default tag when you do not specify one.
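
The default-tag rule can be expressed as a small function. This is a simplified sketch, not Docker's own reference parser: `with_default_tag` is a hypothetical helper, and the localhost:5000/team/app reference is an invented example that shows why only the last path component is checked for a colon.

```python
def with_default_tag(reference):
    """Append ':latest' when a reference has neither a tag nor a digest."""
    if "@" in reference:
        return reference  # pinned by digest; no tag is assumed
    last_component = reference.rsplit("/", 1)[-1]
    if ":" in last_component:
        return reference  # a tag is already present
    return reference + ":latest"

print(with_default_tag("nginx"))                    # nginx:latest
print(with_default_tag("python:3.12"))              # python:3.12
print(with_default_tag("localhost:5000/team/app"))  # localhost:5000/team/app:latest
```

A colon that appears before the last slash belongs to a registry port, not to a tag, which is why the check looks only at the final component.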

There is also a more precise way to identify images using digests. Digests use a form like repository@sha256:.... They are content based and do not change even if a maintainer moves tags around. A later chapter will go deeper into tags, versions, and digests, and how to use them safely in real projects.

Obtaining Images from Registries

Images are usually stored and distributed by container registries. Docker Hub is the most widely used public registry and contains many official images for popular software. There are also private registries that companies run internally or as cloud services.

When you refer to an image that does not exist in your local cache, Docker automatically tries to pull it from a registry. If you do not specify a registry in the name, Docker assumes Docker Hub. If you specify a domain prefix in the image name, Docker uses that domain as the registry address, for example gcr.io, ghcr.io, or a custom company domain.
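
The registry lookup described above can be approximated as follows. This sketch assumes a simplified version of the rule Docker applies: the first path component is treated as a registry host only if it contains a dot or a port, or is localhost; otherwise Docker Hub (docker.io in full references) is assumed. The `registry_for` function and the someorg, sometool, someuser, and someimage names are hypothetical.

```python
DEFAULT_REGISTRY = "docker.io"  # Docker Hub's registry name in full references

def registry_for(reference):
    """Guess the registry a reference points at (simplified rule)."""
    first, slash, _rest = reference.partition("/")
    # The first component counts as a registry host only if it looks like one:
    # it contains a dot or a port, or it is "localhost".
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first
    return DEFAULT_REGISTRY

print(registry_for("ubuntu:22.04"))                                # docker.io
print(registry_for("myregistry.example.com/custom/ubuntu:22.04"))  # myregistry.example.com
print(registry_for("ghcr.io/someorg/sometool"))                    # ghcr.io
print(registry_for("someuser/someimage"))                          # docker.io
```

The last example shows why a plain user or organization prefix does not change the registry: without a dot or port, the first component is read as part of the repository path on Docker Hub.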

From a user's perspective, obtaining images is usually as simple as running a command that references them. Docker handles authentication if needed, downloads the required layers, verifies them, and stores everything in the local image cache. Later runs of the same image will usually start much faster because all layers are already present.
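
The reason later pulls are fast is that only missing layers need to be transferred. A minimal sketch of that decision, assuming an image is described by an ordered list of layer digests (the `layers_to_download` function and the shortened digests are hypothetical):

```python
def layers_to_download(manifest_layers, local_digests):
    """Return the digests from the manifest that are not in the local cache, in order."""
    return [d for d in manifest_layers if d not in local_digests]

# Hypothetical digests, shortened for readability.
local_cache = {"sha256:aaa", "sha256:bbb"}
new_image_layers = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]

print(layers_to_download(new_image_layers, local_cache))  # ['sha256:ccc']
```

If every layer is already cached, the result is an empty list and nothing has to be downloaded.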

Reusing and Sharing Images

Once an image is available in your local environment, you can reuse it as often as you like. You can start multiple containers from the same image simultaneously, each with its own data and configuration, without increasing disk usage for the shared image content.

In team environments, you rarely pass around application code in its raw form for execution. Instead, you build an image once, push it to a registry, and then everyone on the team pulls the same image. This process ensures that the entire team runs the same binaries and dependencies, which reduces "works on my machine" problems.

Because images are portable and self-contained, they also serve as the unit of deployment. Continuous integration and delivery systems build images and push them to registries. Production servers or orchestrators then pull those images and run containers from them. This decouples building software from running it and keeps deployments repeatable.

Image Lifecycle and Cleanup

Over time, especially during experimentation, you will accumulate many images in your local cache. Some will be in regular use, others only used once during a test. Unused images take up disk space but otherwise do not interfere with Docker operation.

To manage this, Docker provides ways to inspect the images present and to remove those that are no longer needed. A typical workflow is to periodically list images, review which are actually in use, and delete obsolete ones. You can do this manually or rely on automated cleanup strategies.

It is important to distinguish between images that are referenced by existing containers and those that are not. By default Docker refuses to remove an image that an existing container still references, even if that container is stopped. A force option can override this in some cases, but the cleaner approach is to remove the containers first. Once no container references an image, it becomes a candidate for cleanup.
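
The cleanup rule can be sketched as a set difference between the images you have and the images your containers reference. The `removable_images` function and the container records below are hypothetical and only model the rule; they do not query a real Docker Engine.

```python
def removable_images(images, containers):
    """Return images that no existing container references, sorted by name."""
    in_use = {c["image"] for c in containers}
    return sorted(set(images) - in_use)

images = ["nginx:latest", "python:3.12", "ubuntu:22.04"]
containers = [
    {"name": "web", "image": "nginx:latest", "running": True},
    {"name": "old-test", "image": "python:3.12", "running": False},
]

print(removable_images(images, containers))  # ['ubuntu:22.04']

# After removing the stopped container, its image becomes a cleanup candidate too.
remaining = [c for c in containers if c["name"] != "old-test"]
print(removable_images(images, remaining))   # ['python:3.12', 'ubuntu:22.04']
```

Note that the stopped container still counts as a reference. Only removing it, not merely stopping it, frees its image for cleanup.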

Understanding that images persist independently from containers, and that images themselves are built from sharable layers, prepares you for more advanced topics. You will later learn how to inspect images, optimize their size, and construct them from Dockerfiles in a way that is efficient and maintainable.

5.1 Pulling Images from Docker Hub

5.2 Listing and Removing Images

5.3 Image Layers Explained

5.4 Image Size Optimization

5.5 Inspecting Images