Why Your Docker Build is So Slow: A Deep Dive into Multistage Dockerfile Optimization
There's a special kind of frustration that hits when you're waiting for a Docker build. You've made a one-line code change, pushed your commit, and now you're watching your CI pipeline crawl through a 15-minute build process. Again. For the fifth time today.
If this sounds familiar, you're not alone. Slow Docker builds are one of the most common—and most overlooked—bottlenecks in modern software development. They silently drain productivity, extend feedback loops, and turn quick fixes into hour-long ordeals.
But here's the thing: most slow Docker builds are slow by accident, not necessity. With the right techniques, you can often reduce build times by 80-95%, turning that 15-minute build into something that completes in under a minute.
In this comprehensive guide, we'll dive deep into the mechanics of Docker builds, understand why they become slow, and master the techniques that top DevOps engineers use to create lightning-fast container images.
The Hidden Cost of Slow Builds
Before we dive into solutions, let's understand why this matters so much.
The Mathematics of Build Wait Time
Consider a typical development team:
- 10 developers making 5 commits each per day
- Each commit triggers a 15-minute build
- That's 750 minutes (12.5 hours) of collective wait time per day
- Over a year (250 working days), that's 3,125 hours—equivalent to 1.5 full-time engineers doing nothing but waiting
Now imagine cutting that build time to 2 minutes. Those 3,125 hours become 417 hours. You've effectively "hired" an extra engineer without spending a dime on recruitment.
The Cognitive Cost
There's also a psychological toll. Studies in developer productivity show that context-switching has a significant cost—it takes an average of 23 minutes to fully refocus after an interruption. A 15-minute build is long enough to break focus but short enough that developers feel they should wait rather than start something new.
The result? Developers end up in a limbo state: checking Slack, browsing Reddit, doing anything except productive work.
Understanding Docker's Layer Cache: The Foundation
Before you can optimize Docker builds, you need to understand how Docker caches work at a fundamental level.
How Docker Builds Work
When you run docker build, Docker doesn't execute your Dockerfile from scratch every time. Instead, it:
- Reads each instruction in order (FROM, RUN, COPY, etc.)
- Checks if it has a cached layer from a previous build
- If the cache is valid, reuses it; otherwise, executes the instruction
- Rebuilds all subsequent layers once the cache is invalidated at any layer
This last point is critical. Docker's cache is a linear chain—if you break it at step 3, steps 4, 5, 6, and beyond must all be rebuilt, even if those instructions haven't changed.
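To make the chain concrete, here's a minimal annotated sketch (the image and commands are illustrative) of which layers survive a typical edit:

```dockerfile
FROM node:20              # Layer 1: reused as long as the base tag is unchanged
WORKDIR /app              # Layer 2: reused
COPY package*.json ./     # Layer 3: invalidated only when the manifests change
RUN npm install           # Layer 4: rebuilt only when layer 3 is invalidated
COPY . .                  # Layer 5: invalidated by any change to the source tree
RUN npm run build         # Layer 6: rebuilt whenever layer 5 is invalidated
```

Edit a source file and only layers 5 and 6 rerun; touch package.json and everything from layer 3 down is rebuilt.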
What Invalidates Cache?
Different instructions have different cache invalidation rules:
| Instruction | Cache Invalidated When |
|---|---|
| RUN | The command string changes |
| COPY | Any source file's content or metadata changes |
| ADD | Same as COPY, plus URL content changes |
| ARG | The argument value changes |
| ENV | The environment value changes |
The COPY instruction is often the biggest culprit. If you COPY . . at the beginning of your Dockerfile, any change to any file in your project invalidates the cache for everything that follows.
The Seven Deadly Sins of Dockerfile Design
Let's examine the most common anti-patterns that slow down builds.
Sin #1: Copying Everything Too Early
```dockerfile
# ❌ BAD: The Classic Mistake
FROM node:20
WORKDIR /app
COPY . .           # Cache busted on ANY file change
RUN npm install    # Reinstalls ALL packages every time
RUN npm run build
```
This pattern means that changing a single character in your README.md triggers a complete npm install. With modern JavaScript projects having hundreds of dependencies, that's easily 2-5 minutes wasted.
```dockerfile
# ✅ GOOD: Strategic Copying
FROM node:20
WORKDIR /app
COPY package*.json ./   # Only copy dependency manifests
RUN npm install         # Cached unless dependencies change
COPY . .                # Now copy everything else
RUN npm run build
```
Sin #2: Installing Dependencies Every Build
Even with correct copy ordering, you might be reinstalling dependencies unnecessarily:
```dockerfile
# ❌ BAD: Dev dependencies in production
FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm install    # Installs devDependencies too
COPY . .
RUN npm run build
CMD ["node", "dist/server.js"]
```
This image contains all your devDependencies (TypeScript, ESLint, testing frameworks) that will never be used at runtime.
```dockerfile
# ✅ GOOD: Multistage build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci    # Faster, deterministic installs
COPY . .
RUN npm run build

FROM node:20-slim AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
```
The production image is now smaller AND builds faster because subsequent builds can cache the dependency installation step independently.
Sin #3: Ignoring .dockerignore
Without a proper .dockerignore, you're sending unnecessary files to the Docker daemon:
```
# .dockerignore
node_modules
.git
.gitignore
*.md
.env*
.vscode
coverage
.nyc_output
dist
*.log
.DS_Store
Thumbs.db
```
Key files to always ignore:
- node_modules: You're reinstalling anyway, and sending 500MB+ to the daemon is slow
- .git: Your entire repository history, often larger than your codebase
- Test artifacts: coverage reports, snapshots, mock data
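To verify your ignore rules are working, watch how much context is actually sent to the builder. A quick check along these lines (assuming BuildKit's plain progress output, which prints a "transferring context" line) usually tells the story:

```bash
# The "transferring context" line shows how much data is shipped to the builder
docker build --progress=plain . 2>&1 | grep "transferring context"
```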
Sin #4: Large Base Images
```dockerfile
# ❌ BAD: Full-fat images
FROM node:20        # 1.1GB
FROM python:3.11    # 1.0GB
FROM ubuntu:22.04   # 77MB (but you'll install a lot on top)
```
```dockerfile
# ✅ GOOD: Slim and alpine variants
FROM node:20-slim       # 240MB
FROM node:20-alpine     # 140MB
FROM python:3.11-slim   # 150MB
FROM python:3.11-alpine # 52MB
```
However, be careful with alpine—it uses musl libc instead of glibc, which can cause issues with some native modules. Test thoroughly.
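If a native dependency fails to compile against musl, the usual workaround is to add the build toolchain in the builder stage; a sketch assuming a node-gyp-based package:

```dockerfile
FROM node:20-alpine AS builder
# node-gyp needs a compiler toolchain that alpine doesn't ship by default
RUN apk add --no-cache python3 make g++
WORKDIR /app
COPY package*.json ./
RUN npm ci
```

If a dependency only publishes glibc binaries, switching to the slim (Debian-based) variant is often simpler than fighting musl.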
Sin #5: Running Updates Separately
```dockerfile
# ❌ BAD: Extra layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get clean
```
```dockerfile
# ✅ GOOD: Single layer, cleanup included
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
```
Combining commands reduces layers and ensures cleanup happens in the same layer as installation (otherwise, deleted files still exist in previous layers, bloating the image).
Sin #6: Not Using BuildKit
BuildKit is Docker's next-generation builder, and it's been the default since Docker 23.0. But many systems still run older versions or have it disabled.
Enable it explicitly:
```bash
# Environment variable
export DOCKER_BUILDKIT=1
docker build .
```

Or in daemon.json:

```json
{
  "features": { "buildkit": true }
}
```
BuildKit provides:
- Parallel builds: Independent stages build concurrently
- Better caching: More intelligent cache invalidation
- Build secrets: Securely pass secrets without baking them into layers (see the example after this list)
- Cache mounts: Persistent caches between builds (game-changer!)
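The build-secrets feature is worth a quick illustration, since the syntax is easy to miss. A minimal sketch for installing from a private npm registry (the npmrc secret id and the credentials file are placeholders):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20
WORKDIR /app
COPY package*.json ./
# The .npmrc is mounted only for this RUN step and never written into a layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
```

Pass the secret at build time with docker build --secret id=npmrc,src=$HOME/.npmrc . so the credentials never appear in the image history.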
Sin #7: Not Leveraging Build Arguments for Caching
```dockerfile
# ❌ BAD: Always downloads latest
RUN curl -L https://github.com/project/release/latest/download/binary -o /usr/bin/binary
```
```dockerfile
# ✅ GOOD: Explicit version enables caching
ARG BINARY_VERSION=1.2.3
RUN curl -L https://github.com/project/release/download/v${BINARY_VERSION}/binary -o /usr/bin/binary
```
With explicit versions, Docker can cache the download. The "latest" approach forces a re-download every build.
Advanced Optimization: BuildKit Cache Mounts
This is where builds go from "optimized" to "blazingly fast." BuildKit cache mounts allow you to persist caches between builds.
Package Manager Caches
```dockerfile
# Node.js with npm cache
FROM node:20
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
```

```dockerfile
# Go with module cache
FROM golang:1.21
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
```

```dockerfile
# Python with pip cache
FROM python:3.11
WORKDIR /app
COPY requirements.txt ./
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```

```dockerfile
# Rust with cargo cache
FROM rust:1.73
WORKDIR /app
COPY Cargo.toml Cargo.lock ./
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    cargo build --release
```
The first build populates the cache. Subsequent builds reuse it, dramatically reducing download and compilation times.
Compilation Caches
Some languages benefit from compiler caches:
```dockerfile
# C/C++ with ccache
FROM gcc:12
RUN apt-get update && apt-get install -y ccache
ENV PATH="/usr/lib/ccache:${PATH}"
# Point ccache at the cache mount below
ENV CCACHE_DIR=/ccache
WORKDIR /app
COPY . .
RUN --mount=type=cache,target=/ccache \
    make -j$(nproc)
```
Sharing Caches Across Builds
Cache mounts already persist on the builder between builds; what you control here is how concurrent builds access the same cache, via the sharing option:

```dockerfile
RUN --mount=type=cache,target=/root/.npm,sharing=shared \
    npm ci
```

sharing=shared (the default) lets concurrent builds read and write the cache at the same time; sharing=locked serializes access, and sharing=private gives each build its own copy.
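A common place where the sharing mode matters is apt: its metadata and archive directories don't tolerate concurrent writers, so a Debian-based stage would lock them. A sketch (the package list is illustrative):

```dockerfile
FROM debian:bookworm-slim
# The official images delete downloaded .debs via docker-clean; disable that so
# the cache mount actually accumulates packages between builds
RUN rm -f /etc/apt/apt.conf.d/docker-clean
# apt's lists and archives aren't safe for concurrent writers: serialize access
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends curl git
```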
Multistage Builds: The Architecture
Multistage builds aren't just about smaller images—they're about parallelization and cache isolation.
The Parallel Build Pattern
```dockerfile
# Stage 1: Install dependencies
FROM node:20 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Stage 2: Build the application (depends on deps)
FROM deps AS builder
COPY . .
RUN npm run build

# Stage 3: Run tests (depends on deps, parallel with builder)
FROM deps AS tester
COPY . .
RUN npm test

# Stage 4: Lint check (depends on deps, parallel with builder and tester)
FROM deps AS linter
COPY . .
RUN npm run lint

# Final: Production image
FROM node:20-slim AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```
With BuildKit, stages that share only the deps ancestor can build in parallel, which can cut build times significantly on multi-core machines. One caveat: BuildKit only builds the stages your target actually needs, so tester and linter are skipped by a plain docker build unless you request them explicitly, as shown below.
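A minimal way to exercise all of the stages while still reusing the cached deps layer (the image tags are illustrative):

```bash
# deps is built once; the later invocations pull it straight from cache
docker build --target tester -t myapp:test .
docker build --target linter -t myapp:lint .
docker build --target production -t myapp:latest .
```

docker buildx bake can go further and build several targets in a single, genuinely parallel invocation.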
Conditional Builds
```dockerfile
ARG BUILD_ENV=production

FROM node:20 AS base
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM base AS development
COPY . .
CMD ["npm", "run", "dev"]

FROM base AS production-builder
COPY . .
RUN npm run build

FROM node:20-slim AS production
WORKDIR /app
COPY --from=production-builder /app/dist ./dist
COPY --from=base /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```
Build with:
```bash
docker build --target development -t myapp:dev .
docker build --target production -t myapp:prod .
```
CI/CD-Specific Optimizations
GitHub Actions with Cache
```yaml
name: Build
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: myapp:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```
The type=gha cache stores build layers in GitHub Actions' cache, persisting them between runs.
GitLab CI with Registry Cache
```yaml
build:
  stage: build
  image: docker:24
  services:
    - docker:dind
  variables:
    DOCKER_BUILDKIT: 1
  script:
    - >
      docker build
      --cache-from $CI_REGISTRY_IMAGE:cache
      --build-arg BUILDKIT_INLINE_CACHE=1
      -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      -t $CI_REGISTRY_IMAGE:cache .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker push $CI_REGISTRY_IMAGE:cache
```
The --cache-from pulls cached layers from the registry, and BUILDKIT_INLINE_CACHE=1 embeds cache metadata in the pushed image.
Using a Build Cache Service
For larger teams, consider dedicated cache services:
```bash
# Using BuildKit's remote cache backend
docker buildx build \
  --cache-from type=registry,ref=myregistry/myapp:buildcache \
  --cache-to type=registry,ref=myregistry/myapp:buildcache,mode=max \
  -t myapp:latest .
```
Or use a specialized service like Depot, which provides cloud-hosted builders with persistent caches:
With a .depot.json in the project root:

```json
{ "id": "your-project-id" }
```

```bash
depot build -t myapp:latest .
```
Language-Specific Optimization Strategies
Node.js / JavaScript
```dockerfile
FROM node:20-alpine AS deps
WORKDIR /app
# Use npm ci for deterministic installs
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --prefer-offline

FROM deps AS builder
COPY . .
# Cache Next.js build artifacts between builds
RUN --mount=type=cache,target=/app/.next/cache \
    npm run build

FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
# Only copy necessary files
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
CMD ["node", "server.js"]
```
For Next.js specifically, the .next/cache directory holds incremental build artifacts, which is why caching it dramatically speeds up rebuilds.
Python
```dockerfile
FROM python:3.11-slim AS builder
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --no-compile -r requirements.txt

FROM python:3.11-slim AS runtime
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . .
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0"]
```
The virtual environment is created in the builder and copied to runtime, excluding build tools.
Go
Go excels at producing small, static binaries:
```dockerfile
FROM golang:1.21-alpine AS builder
WORKDIR /app
# Download dependencies separately for caching
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server .

# Minimal runtime - just the binary
FROM scratch
COPY --from=builder /app/server /server
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
ENTRYPOINT ["/server"]
```
The final image is often under 20MB—just your binary and TLS certificates.
Rust
```dockerfile
FROM rust:1.73 AS planner
WORKDIR /app
RUN cargo install cargo-chef
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

FROM rust:1.73 AS cacher
WORKDIR /app
RUN cargo install cargo-chef
COPY --from=planner /app/recipe.json recipe.json
# Compile only the dependency graph described by the recipe
RUN cargo chef cook --release --recipe-path recipe.json

FROM rust:1.73 AS builder
WORKDIR /app
COPY --from=cacher /app/target target
COPY --from=cacher /usr/local/cargo /usr/local/cargo
COPY . .
RUN cargo build --release

FROM debian:bookworm-slim AS runtime
COPY --from=builder /app/target/release/myapp /usr/local/bin/
CMD ["myapp"]
```
This uses cargo-chef to separately compile dependencies, enabling excellent caching.
Measuring Build Performance
You can't optimize what you can't measure. Here's how to profile your builds:
BuildKit Timing
```bash
# Enable BuildKit with timing output
DOCKER_BUILDKIT=1 docker build --progress=plain . 2>&1 | tee build.log

# Parse timing from logs
grep "DONE\|CACHED" build.log
```
The docker history Command
```bash
docker history myapp:latest --format "{{.Size}}\t{{.CreatedBy}}"
```
This shows the size and command for each layer, helping identify bloated steps.
Build Time Benchmarking Script
```bash
#!/bin/bash
# benchmark-build.sh
iterations=5

echo "Benchmarking Docker build..."

# Clear cache
docker builder prune -f >/dev/null 2>&1

# Cold build (no cache)
start=$(date +%s.%N)
docker build -t myapp:bench . >/dev/null 2>&1
end=$(date +%s.%N)
cold_time=$(echo "$end - $start" | bc)
echo "Cold build: ${cold_time}s"

# Warm builds
total=0
for i in $(seq 1 $iterations); do
  start=$(date +%s.%N)
  docker build -t myapp:bench . >/dev/null 2>&1
  end=$(date +%s.%N)
  duration=$(echo "$end - $start" | bc)
  total=$(echo "$total + $duration" | bc)
  echo "Warm build $i: ${duration}s"
done

avg=$(echo "scale=2; $total / $iterations" | bc)
echo "Average warm build: ${avg}s"
```
Real-World Case Study: From 14 Minutes to 47 Seconds
Let's walk through a real optimization journey.
The Starting Point
A monorepo React application with this Dockerfile:
```dockerfile
FROM node:18
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
EXPOSE 3000
CMD ["npx", "serve", "-s", "build"]
```
Build time: 14 minutes 23 seconds (avg. over 10 builds)
Problems:
- Full COPY . . before npm install
- No .dockerignore
- Full-fat node image (1.1GB)
- No multistage build (dev dependencies in production)
Optimization 1: Add .dockerignore and Order Copies
```dockerfile
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
...
```
```
# .dockerignore
node_modules
.git
build
coverage
```
Build time: 6 minutes 12 seconds (57% reduction)
Warm build (source-only changes): 2 minutes 8 seconds
Optimization 2: Multistage Build
```dockerfile
FROM node:18-slim AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM deps AS builder
COPY . .
RUN npm run build

FROM node:18-slim AS runner
WORKDIR /app
RUN npm install -g serve
COPY --from=builder /app/build ./build
EXPOSE 3000
CMD ["serve", "-s", "build"]
```
Build time: 4 minutes 45 seconds (67% reduction from original)
Image size: Reduced from 1.8GB to 450MB
Optimization 3: BuildKit Cache Mounts
```dockerfile
FROM node:18-slim AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

FROM deps AS builder
COPY . .
RUN npm run build

FROM nginx:alpine AS runner
COPY --from=builder /app/build /usr/share/nginx/html
EXPOSE 80
```
Build time: 1 minute 52 seconds (87% reduction)
Warm build: 47 seconds (95% reduction!)
Final Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| Cold build | 14m 23s | 1m 52s | 87% faster |
| Warm build | 14m 23s | 47s | 95% faster |
| Image size | 1.8GB | 42MB | 98% smaller |
| CI cost (monthly) | ~$180 | ~$12 | 93% savings |
Common Pitfalls and Troubleshooting
"Why Isn't My Cache Being Used?"
- Different host context: CI runners are ephemeral; use registry-based caching
- BuildKit mode mismatch: Legacy builder and BuildKit have different caches
- Argument changes: ARG values affect cache keys for subsequent layers
- Time-based commands: RUN date or similar always invalidates the cache
```dockerfile
# ❌ This invalidates cache every build
RUN echo "Built at $(date)" > /build-info.txt

# ✅ Use build args for version info
ARG BUILD_TIME
RUN echo "Built at $BUILD_TIME" > /build-info.txt
```
"My Multistage Build Is Slower"
Ensure BuildKit is enabled—the legacy builder doesn't parallelize stages:
```bash
# Check that the BuildKit builder (buildx) is installed
docker buildx version
```
"Cache Mounts Aren't Working in CI"
Persistent mounts are local to the builder. Use external cache:
```yaml
- name: Build
  run: |
    docker buildx build \
      --cache-from type=gha \
      --cache-to type=gha,mode=max \
      -t myapp .
```
The Optimization Checklist
Before you finish reading, here's a checklist to audit your Dockerfiles:
- Is BuildKit enabled? (DOCKER_BUILDKIT=1)
- Do you have a .dockerignore? (Exclude .git, node_modules, build artifacts)
- Are layers ordered by change frequency? (Dependencies before source)
- Are you using slim/alpine base images? (When compatible)
- Are you using multistage builds? (Separate build and runtime)
- Are you using cache mounts? (--mount=type=cache)
- Is CI using persistent caching? (Registry or service-based)
- Are you combining RUN commands? (Fewer layers, cleanup in same layer)
- Are external dependencies pinned? (Enable cache reuse)
- Have you measured your build times? (Benchmark before and after)
Conclusion
Docker build optimization is one of those areas where a little knowledge goes a long way. The techniques in this guide—proper layer ordering, multistage builds, BuildKit cache mounts, and CI/CD-specific caching—can transform sluggish 15-minute builds into sub-minute operations.
The impact extends beyond just saving time. Faster builds mean:
- Shorter feedback loops: Developers get faster CI results
- Lower infrastructure costs: Less compute time per build
- Better developer experience: Less waiting, more shipping
- More deployments: Teams deploy more frequently when builds are fast
Start with the low-hanging fruit—.dockerignore, copy ordering, and enabling BuildKit. Then progressively add multistage builds and cache mounts. Measure each change. You might be surprised how much speed was hiding in your Dockerfile all along.
The build isn't done until it's fast.