Your Node.js Docker image is 1.8GB. In a Kubernetes cluster autoscaling under load, new pods take 4 minutes to start — 3.5 of which are pulling the image. Your autoscaler is essentially useless. Here's how to get the same application under 200MB with three changes.
## Why Images Get So Large
Most bloated production images share three causes:
- Wrong base image — `node:20` pulls a full Debian userland plus every Node.js build tool, not just the runtime
- Dev dependencies included — build tools, test frameworks, and compilers left in the final image
- No `.dockerignore` — `COPY . .` includes `node_modules`, `.git`, test files, and local configs
A typical Node.js project on `node:20`:

| Layer | Size |
|---|---|
| `node:20` base image | 1.1GB |
| App dependencies (`npm install`) | 200-400MB |
| Source code + test files | 50-100MB |
| **Total** | ~1.4-1.6GB |
Switch to `node:20-alpine`:

| Layer | Size |
|---|---|
| `node:20-alpine` base image | 175MB |
| App dependencies (prod only) | 50-150MB |
| Source code | 5-20MB |
| **Total** | ~250MB |
A one-line change to your Dockerfile, and a 6× reduction before any other optimization.
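For reference, the swap is a single `FROM` line; everything else in a minimal Dockerfile stays the same. A sketch, assuming a plain `server.js` entry point:

```dockerfile
# The one-line swap; server.js is a placeholder entry point.
FROM node:20-alpine   # was: FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```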
## Multi-Stage Builds: The Primary Tool
Multi-stage builds let you use a full build environment and copy only the runtime output to a minimal final image. This is the single most impactful optimization for compiled languages and apps with heavy build tooling.
Before (1.8GB):

```dockerfile
FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
CMD ["node", "dist/server.js"]
```
After multi-stage (180MB):

```dockerfile
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --include=dev
COPY . .
RUN npm run build

# Stage 2: Runtime (only the output)
FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN npm ci --omit=dev --ignore-scripts
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
```
The `COPY --from=builder` instruction copies the compiled output from Stage 1. The final image contains Alpine Linux (175MB), production npm dependencies (no devDependencies), and the compiled code. Everything used only for building — the TypeScript compiler, test frameworks, build tools — is discarded.
Concrete numbers for a typical Express + TypeScript app:

| Build approach | Image size |
|---|---|
| `node:20`, all deps | 1.8GB |
| `node:20-alpine`, all deps | 850MB |
| `node:20-alpine`, prod deps only | 320MB |
| Multi-stage, alpine, prod deps only | 180MB |
## Alpine vs. Distroless
**Alpine Linux** (`node:20-alpine`): 175MB base. Minimal Linux with musl libc, a BusyBox shell, and the `apk` package manager. Shell access available for debugging. Occasional compatibility issues with native Node modules that assume glibc (pin a specific Alpine release, e.g. `node:20-alpine3.18`, to pin the musl version).

**Google Distroless** (`gcr.io/distroless/nodejs20-debian12`): 75MB base. Contains only the Node.js runtime and its dependencies — no shell, no package manager, no OS utilities. Can't exec into the container for debugging. Much smaller attack surface.
Use Alpine when you need shell access for debugging or run maintenance scripts inside the container. Use distroless for production containers handling sensitive workloads where the reduced attack surface justifies the debugging limitations.
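A distroless variant of the runtime stage looks like the sketch below. It reuses the `builder` stage from the multi-stage example above and adds a `deps` stage, because distroless images have no shell or npm to run `npm ci` in. Note that the distroless Node images already set `node` as the entrypoint, so `CMD` takes only the script path:

```dockerfile
# Sketch: prod dependencies must be installed in a full image first.
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev --ignore-scripts

FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
# Entrypoint is already `node`, so CMD is just the script path.
CMD ["dist/server.js"]
```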
For Python applications:

- `python:3.12` base: 920MB
- `python:3.12-slim`: 150MB
- `python:3.12-alpine`: 55MB
- `gcr.io/distroless/python3-debian12`: 52MB
## Layer Caching Strategy
Docker caches each layer (RUN/COPY/ADD instruction) and skips rebuilding unchanged layers. The order of your Dockerfile determines how often the cache is invalidated.
Bad order (cache miss on every code change):

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .           # Cache busted on any file change
RUN npm install    # Re-runs on every code change
RUN npm run build
```
Good order (dependencies cached separately from code):

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./   # Only changes when deps change
RUN npm ci              # Cached until package.json changes
COPY . .                # Changes on every code edit
RUN npm run build
```
With the correct order, `npm ci` runs only when `package.json` or `package-lock.json` changes — not on every code edit. For a project with 300 dependencies, that saves 2-3 minutes per build.
General rule: Copy dependency manifests first, install dependencies, then copy source code. This maximizes cache hit rate.
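The same rule applies outside Node.js. A sketch for a Python project, assuming a `requirements.txt` and an `app.py` entry point:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./                              # Only changes when deps change
RUN pip install --no-cache-dir -r requirements.txt   # Cached until requirements.txt changes
COPY . .                                              # Changes on every code edit
CMD ["python", "app.py"]
```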
## .dockerignore: The Overlooked Optimization
Without a `.dockerignore`, `COPY . .` sends everything to the Docker daemon — including your local `node_modules` (which will be overwritten by `npm install` anyway), the `.git` directory, test files, and local environment files.
Minimum `.dockerignore` for a Node.js project:

```
node_modules
.git
.gitignore
*.md
.env
.env.*
.DS_Store
dist
build
coverage
*.log
.nyc_output
tests
__tests__
**/*.spec.ts
**/*.test.ts
```
Excluding `node_modules` alone typically reduces the build context sent to the daemon from 500MB+ to under 10MB. This matters for remote Docker builders and CI/CD pipelines, where large build contexts add significant overhead.
## Verifying Your Optimization Results
After applying these techniques, use `docker image ls` to check the uncompressed on-disk size, and `docker history your-image:tag` to see the size contribution of each layer. The history output shows which `RUN` commands produce the largest layers — useful for identifying packages to move to a build stage.
For the exact figure, `docker image inspect your-image:tag | jq '.[0].Size'` returns the uncompressed size in bytes. The compressed registry size (what you see when pulling) is typically 40-60% of the uncompressed size for application images.
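In practice the check looks something like this (`myapp:latest` is a placeholder image name, and `--format` avoids the `jq` dependency):

```shell
# Per-layer sizes, largest contributors first in the output:
docker history myapp:latest

# Exact uncompressed size in bytes:
docker image inspect myapp:latest --format '{{.Size}}'
```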
## When Not to Optimize
Optimization takes time and adds Dockerfile complexity. Prioritize it when:
- Images exceed 500MB and are deployed to auto-scaling infrastructure where pull time affects startup
- Images are built and pushed in CI/CD pipelines, where smaller images cut push/pull time and registry transfer costs
- Container registries charge for storage and your organization has many images
For local development images, team tooling images, and one-off scripts — don't bother. A 1.5GB dev image that builds in 3 minutes and runs locally is fine. Spend your time on features, not image size, unless image size is measurably impacting production reliability.
## Python and Go Specifics
Python: Switch from `python:3.12` to `python:3.12-slim` first (920MB → 150MB with a one-line change). For production, use multi-stage: install dependencies in a `python:3.12-slim` build stage, run on `gcr.io/distroless/python3-debian12` (52MB). Use `pip install --no-cache-dir` to keep pip's download cache out of the layer.
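A sketch of that Python multi-stage approach, assuming a `requirements.txt` and a `main.py` entry point. One caveat worth checking: the distroless image bundles Debian 12's own Python build, so confirm the builder and runtime Python versions match if you depend on compiled wheels.

```dockerfile
# Sketch: install deps into a known path in the build stage, copy them over.
# requirements.txt and main.py are placeholder names.
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
COPY main.py ./

FROM gcr.io/distroless/python3-debian12
WORKDIR /app
COPY --from=builder /app /app
ENV PYTHONPATH=/app/deps
# The distroless image's entrypoint is already python3; CMD is just the script.
CMD ["main.py"]
```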
Go: Go compiles to static binaries by default. With `CGO_ENABLED=0 GOOS=linux go build`, the binary has no external dependencies and can run in a `scratch` (empty) image:
```dockerfile
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .

FROM scratch
COPY --from=builder /app/server /server
CMD ["/server"]
```
A Go application in a `scratch` image is 10-30MB — just the compiled binary. That is about as small as a Docker image gets.
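One common gotcha with `scratch`: the image contains no CA certificates, so a binary that makes outbound TLS calls will fail certificate verification. A sketch of the fix, copying the certificate bundle from the build stage:

```dockerfile
FROM golang:1.22-alpine AS builder
# ca-certificates is usually present in the golang image; install to be safe.
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .

FROM scratch
# Copy the CA bundle so crypto/tls can verify server certificates.
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
CMD ["/server"]
```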
Use the Docker Image Size Calculator to estimate your target image size based on base image selection and installed package count before committing to a build strategy.