6.6 Multi-Stage Builds

Why Multi-Stage Builds Exist

When you start building images for real applications, you quickly face a trade-off. You need compilers, build tools, and package managers to create your app, but you do not want those tools inside the final image that you ship and run. They make images larger, slower to pull, and potentially less secure.

Multi-stage builds solve this by letting you use one image to build your application and a separate, smaller image to run it. Both are defined in a single Dockerfile, and Docker copies only the build artifacts you select from one stage to another.

Key idea: Multi-stage builds separate the “build environment” from the “runtime environment” inside one Dockerfile, and only the minimal artifacts are copied into the final image.

How Multi-Stage Builds Work

A multi-stage build is just a Dockerfile that contains more than one FROM instruction. Each FROM starts a new stage. You can give each stage a name and later refer to it when you copy files between stages.

By default, only the last stage becomes the image that docker build produces. All previous stages are used during the build but do not appear as separate images unless you explicitly target them.

Conceptually, you have several temporary workspaces during the build, and at the end you keep only one of them as the resulting image.

Defining Multiple Stages with `FROM`

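In a single-stage Dockerfile you typically have one FROM at the top. In a multi-stage build you can have many. Each FROM starts a new stage whose filesystem comes from the base image it names, not from the stage before it. The build context, meaning the files you send to Docker for the build, stays the same for every stage. Each instruction belongs to the stage started by the most recent FROM.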

A typical layout looks like this:

A first FROM that uses a heavy base image containing compilers and tools.
A second FROM that uses a slim base image which is intended to run your app.
Optionally, additional stages for testing, asset generation, or other tasks.

Each stage is independent in terms of its filesystem, but stages can share artifacts via COPY instructions that refer to earlier stages.
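
As a minimal sketch of this independence, the Dockerfile below defines two stages that share nothing by default. The image tags and the file name /stage0.txt are placeholders chosen for illustration.

```dockerfile
# Stage 0: a heavier base with compilers installed.
FROM alpine:3.20
RUN apk add --no-cache build-base
RUN echo "built in stage 0" > /stage0.txt

# Stage 1: a second FROM starts a new stage from its own base image.
# Neither build-base nor /stage0.txt exists in this filesystem.
FROM alpine:3.20
RUN test ! -e /stage0.txt
CMD ["echo", "hello from the final stage"]
```

Because the final stage never copies anything from stage 0, the resulting image contains only what the second FROM and its own instructions provide. Newer builders may even skip stage 0 entirely here, since nothing depends on it.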

Naming Stages with `AS`

To make stages easier to reference, you can give them names using the AS keyword in the FROM line. This avoids relying on the implicit numbering of stages, which starts at 0 for the first FROM.

For example, consider a Node.js application build. You might define two stages:

A first stage, called builder, that installs dependencies and builds the application.

A second stage, called runtime, that contains only the files needed to run.

The syntax pattern is:

FROM some-base-image AS stage_name

Later you can use stage_name in a COPY instruction in another stage. Named stages are especially helpful when you have more than two stages or when you want to be explicit about what you copy from where.

Rule: Use AS name with FROM to define a stage name, then refer to that name in later stages for clarity and control.
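
The Node.js example described above might be sketched as follows. This is one possible layout, not a canonical one. It assumes a hypothetical project whose package.json defines a build script that writes its output to dist/, with dist/index.js as the entry point.

```dockerfile
# Stage "builder": full Node.js image, installs all dependencies and builds.
FROM node:20 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumption: "build" is a script defined in package.json that emits dist/.
RUN npm run build

# Stage "runtime": slimmer image with production dependencies only.
FROM node:20-slim AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Pull only the built output out of the builder stage, referenced by name.
COPY --from=builder /app/dist ./dist
# Assumption: dist/index.js is the application entry point.
CMD ["node", "dist/index.js"]
```

Referring to builder by name means the COPY line keeps working even if another stage is later inserted above it.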

Copying Artifacts Between Stages

The special part of multi-stage builds is the ability to copy files from one stage into another without writing them to your host. This uses a variant of the COPY instruction that includes a --from flag.

Conceptually, the pattern looks like this:

You have already defined an earlier stage, for example FROM build-image AS builder.

In a later stage you write something like COPY --from=builder /path/in/builder /path/in/final.

Docker then takes files out of the filesystem of the builder stage and puts them into the filesystem of the current stage. It does not execute the builder stage again at that time. The copying happens as part of the image layer creation for the final stage.

You can copy entire directories, single files, or any subset of the build output that you need. This is how you leave the build environment behind and bring along only the compiled binaries, bundled scripts, or static assets.

Important: COPY --from=<stage> lets you bring only the chosen files from a previous stage into the final image, without including that stage’s tools or dependencies.
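
The sketch below shows a few forms of COPY --from side by side. The paths and file names (/out, app.sh, site.css) are made up for illustration, and the builder stage generates its own files so the example needs no source tree.

```dockerfile
FROM alpine:3.20 AS builder
WORKDIR /out
# Produce a small script and an asset directory to stand in for build output.
RUN mkdir -p assets \
 && printf '#!/bin/sh\necho "running app"\n' > app.sh \
 && chmod +x app.sh \
 && echo "body {}" > assets/site.css

FROM alpine:3.20
# Copy a single file, referring to the stage by name.
COPY --from=builder /out/app.sh /usr/local/bin/app.sh
# Copy the contents of a whole directory.
COPY --from=builder /out/assets/ /srv/assets/
# Refer to the same stage by index instead (the first FROM is stage 0).
COPY --from=0 /out/assets/site.css /srv/site.css
CMD ["/usr/local/bin/app.sh"]
```

File permissions travel with the copy, so the script stays executable in the final stage. Nothing else from /out, and none of the builder's other files, ends up in the final image.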

A Simple Two-Stage Build Pattern

The most common pattern is a two-stage build. Even though this course does not focus on specific languages in this chapter, the structure tends to look very similar regardless of the technology.

The first stage is a build stage. It uses a base image that has all tools required to compile, bundle, or otherwise prepare your application. In this stage you run your heavy RUN commands, such as installing dependencies or running build scripts.

The second stage is a runtime stage. It uses a smaller, usually more secure base image that has only what is necessary to run the already built application. In this stage you only copy artifacts from the first stage and define the final startup command with something like CMD or ENTRYPOINT.

This pattern is particularly effective when the language or framework has a separate build step that produces self-contained output, such as a compiled binary or built static files. The first stage is then a temporary environment to produce those files, and the second stage is a clean image to serve them.
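
As one illustration of the two-stage pattern with a compiled binary, here is a sketch for a Go program. It assumes a hypothetical module with go.mod and go.sum in the project root and a main package there. The image tags and the output path /out/app are illustrative choices.

```dockerfile
# Build stage: full Go toolchain.
FROM golang:1.22 AS build
WORKDIR /src
# Copy the module files first so dependency downloads can be cached.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that does not
# depend on the C library of the build image.
RUN CGO_ENABLED=0 go build -o /out/app .

# Runtime stage: small base image, no Go toolchain.
FROM alpine:3.20
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

The heavy RUN commands all live in the build stage. The runtime stage only copies one file and sets the startup command.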

Reducing Image Size with Multi-Stage Builds

One major benefit of multi-stage builds is the reduction in image size. Build tools, test frameworks, debug symbols, and headers are usually not required to run the application. In a single-stage Dockerfile these all end up in the final image by default.

By separating the build and runtime, and by copying only the necessary output, you can often shrink your image significantly. This leads to faster image pulls, quicker deployments, and less storage usage in registries.

You can also combine multi-stage builds with a choice of smaller base images. For example, use a full-featured image in the builder stage to get access to compilers and then choose a minimal runtime image in the final stage.

Guideline: Put heavy build dependencies only in builder stages and never in the final runtime stage to keep your images small and efficient.

Security and Isolation Benefits

Beyond size, multi-stage builds also help tighten security. Every extra tool or package in your final image is another potential source of vulnerabilities. If you install compilers, shells, or debugging tools in the image that is deployed to production, they can be misused if the container is compromised.

With multi-stage builds, the runtime stage can be kept lean. It can exclude build tools, test utilities, and even package managers if they are not needed at runtime. A slimmer image reduces the attack surface and simplifies security scanning.

Multi-stage builds also encourage clearer separation of responsibilities. One stage is concerned with building and testing, another is concerned with running. This separation mirrors common security and operational practices.
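
One way to take a lean runtime stage to its limit is to start the final stage from scratch, the empty base image. The sketch below assumes a hypothetical Go program that needs no shell, no CA certificates, and no other files at runtime. Whether that holds depends on the application.

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# -s -w drops the symbol table and debug information from the binary.
RUN CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/app .

# scratch is empty: no shell, no package manager, no system libraries.
FROM scratch
COPY --from=build /out/app /app
# Run as an unprivileged numeric user, since scratch has no /etc/passwd.
USER 65534:65534
ENTRYPOINT ["/app"]
```

The trade-off is that such an image is harder to debug interactively, because there is no shell to start inside the container.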

Targeting Specific Stages

Docker lets you build a particular stage from a multi-stage Dockerfile without building all the later stages. This can be useful for debugging the build process or for quickly obtaining an intermediate image that contains build tools.

Internally, each FROM creates a build stage that can be referred to by index or by name. When you run a targeted build, you are instructing Docker to build that stage, along with any earlier stages it depends on, and to stop there instead of continuing to the final stage.

This feature is helpful in workflows where you want to inspect the build stage, run extra tests interactively, or cache intermediate images in your environment. Although the exact command usage is discussed elsewhere, it is important in this chapter to understand that multi-stage builds do not force you to always produce only the last stage. You have flexibility to use any stage for development and debugging while still shipping only a small final image.
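
As a brief sketch, the Dockerfile below has two named stages, and the comments at the end show how each could be built using the --target flag of docker build. The image names myapp:builder and myapp:latest are placeholders, and the tiny C program exists only to give the builder something to compile.

```dockerfile
FROM alpine:3.20 AS builder
RUN apk add --no-cache build-base
WORKDIR /src
# Stand-in for real source code: a program that exits successfully.
RUN printf 'int main(void) { return 0; }\n' > main.c \
 && gcc -o app main.c

FROM alpine:3.20 AS runtime
COPY --from=builder /src/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]

# Build only the builder stage and open a shell in it for inspection:
#   docker build --target builder -t myapp:builder .
#   docker run --rm -it myapp:builder sh
#
# Build the whole file; the last stage becomes the image:
#   docker build -t myapp:latest .
```

The builder image still contains the compiler, which makes it convenient for debugging, while myapp:latest carries only the compiled program.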

Combining Multi-Stage Builds with Other Best Practices

Multi-stage builds do not replace other Dockerfile best practices. Instead, they complement them. Layer ordering, effective use of caching, and minimal base images still matter and can be combined with multi-stage techniques.

For example, you might structure your Dockerfile so that dependency installation in the build stage benefits from Docker’s layer cache, even while you copy only the final artifacts into the runtime stage. You can also have more than two stages, with separate ones for unit tests, integration tests, or asset optimization.

The key is to think of the Dockerfile as a build pipeline, all within one file. Each stage has a clear job, and artifacts flow from one stage to another through controlled COPY instructions. This structured approach leads to cleaner, smaller, and more maintainable images.
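
The pipeline view can be sketched with four stages for a hypothetical front-end project. The example assumes that package.json defines test and build scripts and that the build writes static files to dist/. These names are assumptions, not requirements of Docker.

```dockerfile
# deps: install dependencies once; this layer is reused from cache
# as long as the two package files do not change.
FROM node:20 AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# test: starts from the deps stage and runs the test suite.
FROM deps AS test
COPY . .
RUN npm test

# build: also starts from deps and produces the static output.
FROM deps AS build
COPY . .
RUN npm run build

# runtime: a small web server image that receives only the built files.
FROM nginx:1.27-alpine AS runtime
COPY --from=build /app/dist /usr/share/nginx/html

# Run the tests explicitly:
#   docker build --target test .
# Build the image that ships:
#   docker build -t mysite:latest .
```

Note that the runtime stage does not depend on the test stage, so a builder that skips unused stages will not run the tests during a plain build. Targeting the test stage, or making a later stage copy something from it, is one way to keep the tests in the pipeline.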
