6.5 Best Practices for Dockerfiles

Writing Clear and Maintainable Dockerfiles

Dockerfiles are executable documentation for how to build an image. Good Dockerfiles are predictable, readable, and efficient. This chapter focuses on practices that make your Dockerfiles easier to understand, faster to build, and safer to use, without repeating the basics and syntax covered in other chapters.

Choose an Appropriate Base Image

A Dockerfile starts with a base image, and that single decision influences size, security, and performance. For development you might prefer a convenient base with many tools preinstalled. For production you usually want something smaller and more focused.

A practical approach is to start with a slightly larger, more comfortable base image while you experiment, and later switch to a smaller variant of the same family once you know exactly what you need. For example, many official images offer variants such as ubuntu, ubuntu:22.04, node:20, node:20-slim, or python:3.12-alpine. Moving from a generic base to a slim variant often removes unnecessary compilers, documentation, and utilities.
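As a minimal sketch, assuming a Node.js application, the switch can be as small as changing the first line once the required packages are known:

    # While experimenting: a full-featured base with many tools preinstalled
    FROM node:20

    # For production, once dependencies are settled: a slimmer variant
    # of the same image family
    # FROM node:20-slim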

Avoid building on top of random images from untrusted sources. Official images and images from reputable publishers reduce the risk of hidden malicious content and lower the chance that your image will break due to unusual customizations in the base.

Always prefer small, trusted base images that contain only what your application truly needs.

Keep Images Small and Focused

A smaller image downloads faster, starts quicker, uses less storage, and has a lower attack surface. You keep images small by including only what is actually required at runtime.

Separate build tools from runtime dependencies. Compilers, debuggers, and test frameworks are often necessary while building the application, but they are not needed when the container runs in production. Remove these, or use techniques that let you build in one environment and run in another, such as multi-stage builds, which are covered in a later chapter.

Avoid installing full desktop environments, large utilities, or unused language runtimes. For language-specific images, install just the libraries and modules your application imports. When using package managers, prefer runtime variants over development variants when you only need to execute code rather than compile it.
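On a Debian-based image, for instance, a sketch of this idea might install only the runtime variant of a library (libpq5 here, where libpq-dev would also pull in headers and a compiler toolchain; the package is just an example):

    # Install only the runtime library, skipping optional recommended packages
    RUN apt-get update && \
        apt-get install -y --no-install-recommends libpq5 && \
        rm -rf /var/lib/apt/lists/*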

Use Clear, Pinned, and Explicit Versions

Dockerfiles should produce the same result today and in the future. To make builds reproducible, be explicit about versions. Instead of relying on floating latest tags or unspecified versions of packages, specify a concrete version whenever feasible.

Pinning versions can happen at several levels. You can use a specific tag for the base image, such as python:3.12.1, instead of python:3 or python:latest. Inside the image, you can pin system packages to particular versions where your package manager supports it. For language-level dependencies, specify exact versions in the appropriate files, for example in requirements.txt for Python or package.json and its lock file for Node.js.
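Put together, pinning at all three levels might look like the following sketch; every version string here is illustrative:

    # Base image pinned to an exact tag, not a floating one
    FROM python:3.12.1

    # System package pinned where the package manager supports it
    RUN apt-get update && \
        apt-get install -y --no-install-recommends curl=7.88.1-10+deb12u5 && \
        rm -rf /var/lib/apt/lists/*

    # Language-level dependencies pinned in requirements.txt,
    # for example a line such as: requests==2.31.0
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt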

Reproducible images are easier to debug. If an image suddenly behaves differently, you know it is because you changed a version, not because something silently shifted under you.

Avoid latest tags and unspecified versions in production Dockerfiles, because they break reproducibility and can introduce unexpected changes.

Order Instructions to Use Build Cache Effectively

Docker builds make heavy use of caching. When you rebuild an image, Docker reuses layers for instructions that have not changed and that depend on unchanged content. To take full advantage of this, you should order instructions from the least frequently changing content to the most frequently changing content.

For example, installation of system packages or language runtimes changes rarely, while application source code changes often. Place base configuration and dependency installation earlier in the file, and copy your frequently changing source code later. That way, modifying a source file does not force Docker to reinstall your packages.
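The classic form of this pattern, sketched here for a Python application (the same idea applies to npm, Maven, and similar tools), copies the dependency manifest before the rest of the source:

    # Changes rarely: this layer is rebuilt only when requirements.txt changes
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    # Changes often: editing a source file invalidates only this layer,
    # not the dependency installation above
    COPY . .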

When you combine commands in a single instruction, be aware that any change inside that instruction invalidates the entire layer. Group commands that always change together, and keep more stable configuration steps in different instructions that can remain cached across many builds.

Place rarely changing instructions early in the Dockerfile and frequently changing instructions later, so Docker can reuse cached layers as much as possible.

Minimize the Number of Layers Without Sacrificing Clarity

Each Dockerfile instruction creates a new layer. While layers are useful for caching and sharing common parts between images, an excessive number of layers can add complexity. At the same time, combining everything into a single large instruction harms readability and makes caching less effective.

A balanced approach is to combine tightly related shell commands into one instruction, especially when they form a logical unit such as installing a group of system packages. This reduces intermediate image size and keeps related steps together. Yet, it is usually better to keep different responsibilities in separate instructions, for example one instruction for installing system packages and another for adding application dependencies.
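As a sketch, related installation steps can share one instruction while an unrelated responsibility stays separate (the package names and user are examples):

    # One logical unit: index update, package installation, and cleanup
    RUN apt-get update && \
        apt-get install -y --no-install-recommends ca-certificates curl && \
        rm -rf /var/lib/apt/lists/*

    # A different responsibility in its own instruction, cached independently
    RUN useradd --create-home appuser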

Do not chase the smallest possible number of layers at the cost of a confusing Dockerfile. People, including your future self, must be able to understand and maintain it.

Clean Up After Installations

Temporary files and caches inside intermediate layers increase the final image size, because they become part of the layer history. Even if you delete those temporary files in a later instruction, the space they used still exists in earlier layers.

To avoid this, remove caches and temporary files within the same instruction that created them. For instance, if you install packages and they generate cache directories, clean the cache before the instruction finishes. For language package managers, look for options that reduce cache usage or produce smaller artifacts.
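On a Debian-based image, for example, the package index downloaded by apt-get update should be removed before the instruction ends; deleting it in a later instruction would leave the files behind in the earlier layer:

    # Cache created and removed within the same layer, so it never persists
    RUN apt-get update && \
        apt-get install -y --no-install-recommends build-essential && \
        rm -rf /var/lib/apt/lists/*

Similarly, language package managers often have options that avoid the cache entirely, such as pip's --no-cache-dir flag.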

As a rule, do not leave build artifacts, test reports, or downloaded archives in the final image unless they are truly required at runtime.

Delete caches and temporary build files in the same instruction that created them, so they never persist in any layer.

Avoid Leaking Secrets into the Image

Dockerfiles can accidentally capture sensitive data such as passwords, tokens, or private keys. If you hard-code credentials in an instruction, they are stored permanently in the image and its layers. Anyone who can pull the image can potentially extract them.

Never write secrets directly in the Dockerfile. Avoid copying configuration files that contain secrets into the image unless they are sanitized. Use environment variables, secret management features, or volume-mounted files at runtime instead of baking sensitive data into the image itself.

Build arguments can also be dangerous if they are used to inject secrets into instructions, because their values may end up in layer history. Use build arguments only for non-sensitive configuration values, and rely on dedicated secret mechanisms when available.
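If your Docker version supports BuildKit, secret mounts are one such mechanism: the value is available only while the instruction runs and is never written into a layer. A sketch, where the secret id and the fetch-private-deps.sh script are purely illustrative:

    # The secret is exposed at /run/secrets/<id> only for the duration
    # of this RUN instruction
    RUN --mount=type=secret,id=api_token \
        API_TOKEN=$(cat /run/secrets/api_token) ./fetch-private-deps.sh

The value is then supplied at build time, for example with a command like docker build --secret id=api_token,src=token.txt . in the project directory.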

Do not store passwords, API keys, or private data in Dockerfiles, image layers, or build arguments. Provide secrets only at runtime.

Use Explicit Working Directories and Ownership

An image often contains many files in different places. To reduce confusion and make builds predictable, choose a clear working directory and declare it in the Dockerfile. This tells future readers where the application lives inside the container, and it helps you avoid accidentally mixing files in system paths.

If you copy application code into the image, ensure that permissions make sense for the user who will run the process. Running as a non-root user for security is covered elsewhere, but the Dockerfile should support that pattern. Assign ownership explicitly when you add files, instead of relying on broad permissions.
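A minimal sketch, assuming a non-root user named appuser has already been created in the image:

    # Declare where the application lives inside the container
    WORKDIR /app

    # Hand ownership to the runtime user instead of leaving files owned by root
    COPY --chown=appuser:appuser . .
    USER appuser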

Well structured directories and predictable locations make it easier to understand how the image is laid out and simplify debugging when something goes wrong.

Separate Build and Runtime Concerns

Dockerfiles become messy when they mix build and runtime responsibilities. A clear build stage is where you compile, test, or bundle the application. A clear runtime stage is where the compiled or packaged output is actually executed.

Even when you are not yet using the formal multi-stage build feature, you can still respect this separation conceptually. Collect build tools and compilation steps together, and then apply a distinct phase where you prepare the final runtime environment. This mindset makes it easier to refactor into a multi-stage build later, and it keeps your Dockerfile logically organized.
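Even in a single-stage Dockerfile, section comments can keep the two concerns visibly apart. A sketch assuming a Node.js project with a build script:

    # ---- Build concerns: dependency installation and bundling ----
    COPY package.json package-lock.json ./
    RUN npm ci
    COPY . .
    RUN npm run build

    # ---- Runtime concerns: configuration and the main process only ----
    ENV NODE_ENV=production
    CMD ["node", "dist/server.js"]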

In the runtime part, avoid leaving behind tools that are only useful for building. Each additional tool increases the size and security exposure of the final image.

Make the Default Command Predictable

A reader of your Dockerfile should be able to tell what happens when someone runs the container with no extra parameters. The default command should correspond to the main purpose of the image. For an application image, that is usually the application server or main service process.

Avoid chaining many responsibilities into a single default command. Each container should have one clear main process. Auxiliary functionality, such as running database migrations or one-time initialization scripts, is often better triggered by separate commands or separate images rather than embedded in the default behavior.

If the image is meant for interactive use, such as a development or debugging environment, select a default command that drops the user into a helpful shell or tool instead of surprising them with an immediate exit.
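Two sketches, one for a service image and one for an interactive image; the module path and shell are examples:

    # Service image: the default command is the main process
    CMD ["python", "-m", "myapp.server"]

    # Interactive image: default to a useful shell instead of exiting immediately
    # CMD ["/bin/bash"]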

Document Your Intent Inside the Dockerfile

While a Dockerfile is an executable specification, it also serves as documentation. Future maintainers should understand why each instruction exists, not just what it does. Well-placed comments can explain non-obvious choices, such as why you pinned a particular version, why you used a specific optimization, or why a less common tool is required.

Use comments sparingly and keep them up to date. When you change behavior, adjust comments so they do not become misleading. For simple self explanatory commands, comments are unnecessary. Reserve them for areas where the reasoning is likely to be forgotten.

A short introduction comment at the top of the Dockerfile describing the image’s role and main characteristics can also help new readers quickly grasp its purpose.
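Such a header might read like the following sketch, where every detail is invented purely for illustration:

    # Runtime image for the ordering service.
    # Based on python:3.12-slim; all dependencies are pinned in requirements.txt.
    # curl is pinned below because newer versions broke our health probe.
    FROM python:3.12-slim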

Prefer Simplicity Over Clever Tricks

It can be tempting to encode advanced shell logic into a single instruction, or to dynamically manipulate configuration at build time with complex scripts. While powerful, such tricks usually make Dockerfiles harder to understand and maintain. Since Dockerfiles are often read by people who are new to both Docker and shell scripting, clarity has higher value than cleverness.

If you need complex behavior, consider moving it into well named scripts that live alongside your application source. Then the Dockerfile can simply copy and invoke those scripts. This makes the Dockerfile smaller and delegates complexity to regular code, where you can test and document it more easily.
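For example, instead of one long RUN instruction full of conditionals, the Dockerfile can delegate to a script kept in the repository (the script name is a placeholder):

    # Complex setup logic lives in a reviewed, testable script
    COPY scripts/setup.sh /tmp/setup.sh
    RUN sh /tmp/setup.sh && rm /tmp/setup.sh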

A good benchmark is whether someone with basic Docker knowledge can read your Dockerfile and roughly understand the sequence of events without external guidance. If not, it is usually worth simplifying or extracting logic elsewhere.

Validate and Test Your Image Regularly

An image that builds successfully is not necessarily correct. Integrate simple checks inside your workflow to confirm that the resulting image behaves as expected. This can involve running a minimal smoke test after the build completes, such as starting the container and hitting a health check endpoint or running a basic command inside it.
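A HEALTHCHECK instruction is one lightweight way to build such a probe into the image itself; the endpoint and port below are illustrative, and curl must be present in the image:

    # Periodically probe the service; a failing probe marks the container unhealthy
    HEALTHCHECK --interval=30s --timeout=3s \
        CMD curl -f http://localhost:8000/health || exit 1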

While full testing pipelines belong in continuous integration and are discussed in a different chapter, your Dockerfile should not assume that any successful build is good enough. Small verification steps catch obvious mistakes early, such as missing files, incorrect paths, or misconfigured default commands.

Regular testing also gives you confidence when you later refactor the Dockerfile, change the base image, or upgrade dependencies.

Do not treat a successful build as proof that the image works. Always run at least a minimal functional test after building.

Evolve Dockerfiles Incrementally

As your application grows, your Dockerfile should evolve with it. Avoid big, unreviewed rewrites of the Dockerfile. Instead, make focused changes that adjust one aspect at a time, such as switching the base image, reordering instructions for cache efficiency, or cleaning up dependencies.

Each change can then be tested and reviewed in isolation. This process reduces the risk that performance, size, or security regress silently while you refactor. Over time, these incremental improvements keep the Dockerfile aligned with current best practices and with the actual needs of your project.

By following these practices, you create Dockerfiles that are efficient, secure, and understandable. They become a reliable foundation for the images you build, whether for development, testing, or production environments, and they integrate smoothly with more advanced techniques you will encounter in later chapters.
