12.1 Reducing Image Size

Why Image Size Matters

Container images travel across networks, sit in registries, and are pulled by developers and servers many times per day. Every extra megabyte slows pulls and consumes more storage, and every extra package can widen the attack surface. Smaller images are usually quicker to pull and deploy, cheaper to store and transfer, and simpler to keep secure.

Smaller images mean faster pulls, faster deployments, fewer vulnerabilities, and less disk usage.

In this chapter, you will focus on practical techniques to keep images small when you build them.

Choosing a Minimal Base Image

A base image provides the starting filesystem for your image. Many images are large mainly because the base image is large. For example, a full Linux distribution with many tools is bigger than a stripped-down runtime image that contains only what your app needs.

Languages and ecosystems typically offer several base image variants. For example, there might be a full variant with build tools, a slim variant with fewer utilities, and an even smaller one intended only for production. In many cases you can build on a larger base, then copy the result into a smaller runtime base for the final image. This is a key pattern when combined with multi-stage builds, which are introduced later in this chapter and covered elsewhere in the course.

Always pick the smallest base image that still has everything your application truly needs at runtime.

If your app does not need shell access or debugging tools in production, avoid base images that bundle them. If you only need a language runtime and system libraries, pick a dedicated runtime image instead of a full operating system.
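
As a rough illustration of picking a smaller variant, the sketch below assumes a small Python application. The file names requirements.txt and app.py are hypothetical, and the exact size difference between the two tags depends on the versions you pull.

```dockerfile
# Full variant with many extra tools, shown only for comparison:
# FROM python:3.12

# Slim variant: the same Python runtime on a reduced base.
FROM python:3.12-slim

WORKDIR /app

# requirements.txt and app.py are placeholder names for this sketch.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .

CMD ["python", "app.py"]
```

If the slim variant lacks a system library your application needs, it is usually better to install that one library than to switch back to the full variant.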

Being Selective About What You Install

Tools that you install inside the image with package managers or language-specific dependency managers are another major source of bloat. Production images often contain build tools, compilers, documentation, and test utilities that are never used at runtime.

Many package managers let you skip recommended extras, documentation, or debug symbols during installation. You can also install only the specific packages and libraries that your application requires, instead of large metapackages that pull in many dependencies.

A good pattern is to separate build-time dependencies from runtime dependencies. Install compilers and build tools only in a build stage, then copy the compiled artifacts or bundled app into a separate stage that does not contain these tools. Avoid leaving temporary build files inside the final image.

Keep production images free of compilers, build tools, test frameworks, and any dependencies that are not strictly necessary at runtime.
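
A minimal sketch of a selective install on a Debian-based image follows. It assumes, purely for illustration, that the application needs the PostgreSQL client library and TLS certificates at runtime; substitute the packages your own application actually loads.

```dockerfile
FROM debian:bookworm-slim

# Install only the shared library needed at runtime (libpq5), not the
# development headers (libpq-dev) or a compiler metapackage (build-essential).
# --no-install-recommends skips packages that are merely "recommended".
# The package choice here is an assumption for the example.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```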

Cleaning Up After Installations

Package managers and build tools often create caches, temporary files, and intermediate artifacts. If you do not remove them, everything ends up stored in the image layers, which increases size.

When you install system packages, you can remove cached package lists and temporary directories immediately after installation. When you build your application, you can delete build artifacts and transient files that are not needed at runtime. In language ecosystems that cache dependencies, you can clean those caches before the image layer is finalized, so they are not included.

An important detail is that cleanup must happen in the same build step as the installation or build. If cleanup happens in a later instruction, the previous layer still contains the unwanted files, and the overall size does not shrink.

Perform installation and cleanup in the same Dockerfile instruction, otherwise the removed files still take space in earlier layers.
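
The difference between cleaning up in a later instruction and in the same instruction can be sketched as follows. The package (curl) is an arbitrary choice for the example, and the first variant is left commented out because it is the pattern to avoid.

```dockerfile
FROM debian:bookworm-slim

# Avoid: the package lists are written into the layer created by the first
# RUN. The second RUN only hides them in a new layer, so the image does not
# get smaller.
# RUN apt-get update && apt-get install -y --no-install-recommends curl
# RUN rm -rf /var/lib/apt/lists/*

# Prefer: install and clean up in one instruction, so the package lists are
# already gone when the layer is committed.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```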

Reducing Files Included in the Image

Another source of extra size is files that are copied into the image unnecessarily. Project directories often contain documentation, tests, mock data, design assets, local databases, and editor or build caches. If you copy your whole project directory by default, all of these may end up in the image.

To prevent this, you should be deliberate about which paths are copied. Copy only the directories and files required by your application at runtime. Many build tools produce output into a specific folder, and you can copy only that folder instead of the entire project. You can also configure ignore rules that exclude certain file patterns from being sent to the build context.

The build context is the set of files that the Docker engine receives when building the image. Large build contexts slow down builds and can accidentally include secrets or local artifacts. Keeping the build context small indirectly helps keep the final image small, because it becomes natural to copy only what is needed during the build itself.

Limit the build context and COPY only what the application actually needs. Avoid copying tests, docs, assets, caches, and local data into the image.
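
A small sketch of selective copying, assuming a front-end project whose build tool has already written its output to a folder named dist (a hypothetical name), served here by the nginx:alpine image:

```dockerfile
FROM nginx:alpine

# Avoid: copies the entire project, including tests, docs, and caches.
# COPY . /usr/share/nginx/html/

# Prefer: copy only the build output that is served at runtime.
# "dist/" is an assumed output folder; use the one your build tool produces.
COPY dist/ /usr/share/nginx/html/
```

The ignore rules mentioned above live in a file named .dockerignore in the root of the build context. Listing patterns such as .git, node_modules, or *.log there keeps those paths out of the context entirely, so even a broad COPY cannot pick them up.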

Minimizing Language and Framework Overhead

Different languages and frameworks have their own typical sources of bloat inside images. For example, dependency trees often include development-only libraries, testing frameworks, and optional modules. Package managers usually provide a way to install only runtime dependencies and skip development extras.

In some ecosystems, you can compile or bundle your application into a single binary or a minimized distribution. In that case, the final image can contain only that artifact and a runtime environment, without the package manager, source code, and development assets. Some languages can produce statically linked binaries that run in extremely small images.

You can also remove unused plugins, example content, localization files for languages you do not need, and diagnostic tools that are not relevant in production. Small adjustments like this can cut many megabytes from the final image.

Install only production (runtime) dependencies, and where possible ship compiled or bundled artifacts instead of full source and tooling.
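
As one possible illustration of a runtime-only dependency install, the sketch below uses a Node.js project. It assumes that a package-lock.json is present (npm ci requires one), that the application has already been bundled into a dist folder, and that dist/server.js is the entry point. All three are assumptions made for the example.

```dockerfile
FROM node:20-slim
ENV NODE_ENV=production
WORKDIR /app

# Copy only the dependency manifests first so this layer can be cached.
COPY package.json package-lock.json ./

# --omit=dev skips devDependencies such as test frameworks and linters.
# The npm cache is cleared in the same instruction so it is not committed.
RUN npm ci --omit=dev && npm cache clean --force

# Ship the bundled output rather than the full source tree.
# dist/ and server.js are placeholder names.
COPY dist/ ./dist/

CMD ["node", "dist/server.js"]
```

Other ecosystems offer comparable switches; check your package manager's documentation for its own production-only install option.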

Avoiding Unnecessary Layers

Every instruction that creates a new layer can add to image size if it introduces files that are not removed in the same step. Combining related operations into a single instruction can reduce the number of layers that contain transient data.

If you need to perform several related steps, such as updating package metadata, installing packages, and cleaning caches, you should group them together so that caches and temporary files never persist into a committed layer. This technique reduces both the number of layers and the cumulative size of the image.

However, you should not group unrelated tasks just for the sake of having fewer layers, because that can make the Dockerfile harder to understand and cache reuse less efficient. Focus instead on grouping operations where temporary files are created and cleaned.

Group installation and cleanup in the same instruction so that temporary data never becomes part of a committed image layer.
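
The grouping described above might look like the following sketch. It assumes a Python project whose requirements.txt includes a package that must be compiled during installation; the file and folder names are hypothetical. Related steps share one instruction, while the unrelated copy of application code stays separate.

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# requirements.txt is a placeholder name for this sketch.
COPY requirements.txt .

# One instruction: refresh package metadata, install the compiler, build the
# dependencies, then remove the compiler and the package lists again.
# None of the temporary files survive into the committed layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libc6-dev \
    && pip install --no-cache-dir -r requirements.txt \
    && apt-get purge -y --auto-remove gcc libc6-dev \
    && rm -rf /var/lib/apt/lists/*

# Unrelated step, kept separate so that a code change does not invalidate
# the cached dependency layer above. app/ and main.py are assumed names.
COPY app/ ./app/
CMD ["python", "app/main.py"]
```

When the build tooling is large or the build is complex, a separate build stage is often cleaner than installing and purging tools in a single instruction.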

Using Multi-Stage Builds for Smaller Final Images

Multi-stage builds allow you to keep build tools and intermediate files separate from the final runtime image. You can have one stage that contains everything needed for compilation, bundling, or testing, and a second stage that contains only the compiled or packaged application and the minimal runtime.

In this pattern, the heavy build stage is never pushed to a registry or used in production. Only the small final stage produces the image that is tagged and deployed. This approach works well for applications in languages that need compilation, as well as for front-end builds or any process that generates build artifacts.

By carefully choosing what to copy from the build stage into the final stage, you can eliminate source files, tests, build caches, and tooling. The result is a compact image that starts quickly and contains only what you intend to run.

Use a separate build stage and copy only the built artifacts into a minimal runtime stage to keep production images as small as possible.
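
A minimal sketch of this pattern for a Go program follows. It assumes the module's go.mod and go.sum sit in the root of the build context together with the main package, and that the program needs nothing from the operating system at runtime. The stage name build and the path /out/app are arbitrary choices for the example.

```dockerfile
# Build stage: contains the Go toolchain and the full source tree.
FROM golang:1.22 AS build
WORKDIR /src

# Download dependencies first so this layer is cached between builds.
# Assumes go.mod and go.sum exist in the context root.
COPY go.mod go.sum ./
RUN go mod download

COPY . .

# CGO_ENABLED=0 produces a statically linked binary that does not depend on
# the C library of the build image.
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: an empty base image that receives only the binary.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Because scratch is empty, it has no shell and no CA certificates. If the program makes TLS connections or you need to debug inside the container, a small distribution base may be a better final stage.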

Verifying and Iterating on Image Size

To know whether your optimizations are effective, you need to inspect and compare image sizes. After each significant change, check the size of the resulting image (for example with docker image ls) and the sizes of its individual layers (for example with docker history). By comparing different versions, you can see which modifications give real savings and which have little effect.

Over time, dependencies and base images may grow. Periodically reviewing image size helps you detect unexpected increases and take corrective action, such as updating to a slimmer base or pruning obsolete components. Image size optimization is not a one-time activity. It is part of maintaining a clean and efficient containerized application.

Regularly inspect image and layer sizes, and treat unexpected growth as a sign that your Dockerfile or dependencies need review.