Lesson: Dockerfiles & Best Practices

What you'll learn

  • Write a Dockerfile using the core instructions (FROM, WORKDIR, COPY, RUN, ENV, EXPOSE, USER, CMD).
  • Order instructions to exploit layer caching for fast rebuilds.
  • Apply the lab's house conventions: multi-stage builds, a pinned base image, a non-root user, a .dockerignore, a HEALTHCHECK, and sha-based tags.
  • Build an image and push it to the lab's private registry at 10.100.100.6.

By the end you can turn source code into a small, secure, reproducible image and publish it where the team can pull it — the capstone build skill of this module.

The lesson

1. What a Dockerfile is

A Dockerfile is a plain-text recipe. Each line is an instruction that the build engine executes top to bottom. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer to the image, while the others record metadata such as the default command (recall layers from Chapter 1). Build it with docker build and you get a reusable image.

A first, naive Dockerfile for a Node.js app:

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

It works — but it is large, runs as root, and rebuilds slowly. We will fix all three.

2. The core instructions

  • FROM — the base image you build on top of.
  • WORKDIR — sets (and creates) the working directory for later instructions.
  • COPY src dest — copies files from your build context into the image.
  • RUN — executes a command at build time (e.g. install dependencies), baking the result into a layer.
  • ENV — sets an environment variable.
  • EXPOSE — documents which port the app listens on (informational; you still publish with -p).
  • USER — sets which user later instructions and the final container run as.
  • CMD — the default command run when a container starts (runtime, not build time).
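
Several of these instructions (ENV, EXPOSE, USER, WORKDIR, CMD) do not run anything at build time; they record settings in the image's configuration. One way to see what an image recorded is docker image inspect with a Go template. The sketch below is illustrative only: the function name show_image_config and the image tag in the usage line are assumptions, not part of the lab tooling.

```shell
# Print the settings that USER, WORKDIR, ENV, EXPOSE and CMD stored in an image.
show_image_config() {
  docker image inspect --format \
'User:    {{.Config.User}}
Workdir: {{.Config.WorkingDir}}
Env:     {{json .Config.Env}}
Ports:   {{json .Config.ExposedPorts}}
Cmd:     {{json .Config.Cmd}}' "$1"
}

# Usage (hypothetical tag):
#   show_image_config myapp:dev
```

An empty User field means the image runs as root by default.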

3. Layer caching — order matters

The engine caches each layer. On rebuild, it reuses cached layers until it hits the first instruction whose inputs changed; everything after that is rebuilt. So put things that change rarely (dependency installs) before things that change often (your source code).

Bad: COPY . . then npm install — any code edit busts the cache and reinstalls every dependency.

Good: copy only the dependency manifest, install, then copy the code:

COPY package.json package-lock.json ./
RUN npm ci            # cached unless package files change
COPY . .              # changes often, but install layer stays cached

Now editing server.js reuses the (slow) install layer and rebuilds in seconds.
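
To check that the ordering pays off, rebuild after touching only source code and look at which steps the builder reports as reused. This is a minimal sketch assuming BuildKit (the default builder in current Docker releases), which marks reused steps with the word CACHED in its plain progress output. The function name and the tag in the usage line are made up for illustration.

```shell
# List the build steps that were served from the layer cache.
# Assumes BuildKit output, where reused steps are marked "CACHED".
show_cached_steps() {
  image="$1"
  # --progress=plain prints one line per step instead of the live TTY view.
  # BuildKit writes progress to stderr, hence 2>&1.
  # "|| true" keeps the exit status 0 when nothing was cached (first build).
  docker build --progress=plain -t "$image" . 2>&1 | grep 'CACHED' || true
}

# Usage (run from the directory containing the Dockerfile):
#   show_cached_steps myapp:dev
```

After an edit to server.js, the npm ci step should appear in this list while the final COPY . . step should not.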

4. Multi-stage builds (house convention)

Build tools (compilers, dev dependencies) are needed to build but are dead weight — and extra attack surface — in the final image. A multi-stage build uses one stage to build and a clean, slim stage to run, copying only the finished artifact across.

# ---- build stage ----
FROM node:20-bookworm-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build          # produces /app/dist

# ---- runtime stage ----
FROM node:20-bookworm-slim AS runtime
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY package.json package-lock.json ./
RUN npm ci --omit=dev      # production deps only

The final image contains only what it needs to run.
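
To see what the second stage saves, you can build each stage under its own tag and compare sizes. The sketch below relies on docker build --target, which stops at a named stage; the stage names build and runtime come from the Dockerfile above, while the function name and the -stage tag suffixes are assumptions made for this example.

```shell
# Build each stage under its own tag, then print tag and size for both.
compare_stage_sizes() {
  name="$1"
  docker build --target build   -t "$name:build-stage"   . &&
  docker build --target runtime -t "$name:runtime-stage" . &&
  docker image ls --format '{{.Tag}}  {{.Size}}' "$name"
}

# Usage (hypothetical repository name):
#   compare_stage_sizes myapp
```

How large the gap is depends on the project: the more dev dependencies and build tooling the build stage pulls in, the bigger the difference.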

5. Pinned base image + non-root user (house conventions)

Pin the base image so builds are reproducible. At minimum pin a specific tag (node:20-bookworm-slim) rather than latest. A tag can still be repointed at a newer build, so best practice is to pin by digest, which fixes the exact bytes:

FROM node:20-bookworm-slim@sha256:<REDACTED-digest>

Run as a non-root user. By default containers run as root; if an attacker compromises the app, root inside the container gives them far more to work with, including a better chance of escaping to the host. Create and switch to an unprivileged user:

RUN useradd --system --uid 10001 --no-create-home appuser
USER appuser

Everything after USER appuser runs unprivileged — a core lab requirement.
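
A quick way to confirm the requirement is met is to ask a throwaway container for its UID. This sketch assumes the image contains the id utility (Debian-based images such as the slim Node image do); the function names and the tag in the usage line are illustrative.

```shell
# Print the UID that containers from this image run as by default.
# --entrypoint replaces any ENTRYPOINT so that `id -u` runs directly.
image_uid() {
  docker run --rm --entrypoint id "$1" -u
}

# Succeed only if the image does not default to root (UID 0).
assert_non_root() {
  uid="$(image_uid "$1")" || return 1
  [ "$uid" != "0" ]
}

# Usage (hypothetical tag):
#   assert_non_root myapp:dev && echo "ok: runs unprivileged"
```

For an image that ends with USER appuser as created above, image_uid should report 10001.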

6. .dockerignore and HEALTHCHECK (house conventions)

A .dockerignore file (next to the Dockerfile) keeps junk and secrets out of the build context, making builds faster and images cleaner. Example:

.git
node_modules
*.log
.env
Dockerfile
README.md

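A HEALTHCHECK tells the engine how to know the app is actually alive, not just running. Compose uses it (Chapter 4's service_healthy), as does Docker Swarm; Kubernetes relies on its own probes instead. The slim Node base image does not ship curl, so the check below uses Node's built-in fetch:

HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"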
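
Once a container built with a HEALTHCHECK is running, Docker exposes the result under State.Health.Status (starting, healthy, or unhealthy). The following is a rough sketch of a script that waits on that status; the function name, the default attempt count, and the delay are arbitrary choices for illustration.

```shell
# Poll a container until Docker reports it healthy, or give up.
# Arguments: container name, max attempts (default 10), seconds between polls (default 3).
wait_healthy() {
  container="$1"; attempts="${2:-10}"; delay="${3:-3}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container")" || return 1
    [ "$status" = "healthy" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (hypothetical container name):
#   wait_healthy myapp-web 20 3 || echo "container never became healthy"
```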

7. The full house-style Dockerfile

Putting the conventions together:

# ---- build ----
FROM node:20-bookworm-slim@sha256:<REDACTED-digest> AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# ---- runtime ----
FROM node:20-bookworm-slim@sha256:<REDACTED-digest> AS runtime
WORKDIR /app
ENV NODE_ENV=production
RUN useradd --system --uid 10001 --no-create-home appuser
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
USER appuser
EXPOSE 3000
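# curl is not in the slim image, so probe with Node's built-in fetch
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"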
CMD ["node", "dist/server.js"]

From source to registry:

   source + Dockerfile  --docker build-->  image  --docker push-->  10.100.100.6
        (your code)                     (layers)                 (private registry)

8. Tagging with sha-based tags (house convention)

A tag names a build. The lab convention is sha-based tags — tag the image with the git commit SHA so every image traces back to exact source. Avoid latest (it silently changes and breaks reproducibility).

GIT_SHA=$(git rev-parse --short HEAD)   # e.g. abc1234
docker build -t 10.100.100.6/myorg/myapp:sha-$GIT_SHA .

The full image name is REGISTRY/namespace/name:tag -> 10.100.100.6/myorg/myapp:sha-abc1234.
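
The naming rule is easy to wrap in a helper so nobody types tags by hand. This is a sketch rather than lab-provided tooling: the function names are invented here, and the clean-tree check is an extra safeguard this lesson does not require.

```shell
# Compose REGISTRY/namespace/name:sha-<short-sha> from the current commit.
image_ref() {
  registry="$1"; namespace="$2"; name="$3"
  sha="$(git rev-parse --short HEAD)" || return 1
  printf '%s/%s/%s:sha-%s\n' "$registry" "$namespace" "$name" "$sha"
}

# Succeed only when tracked files match HEAD. With uncommitted changes,
# the SHA would no longer describe what actually goes into the image.
require_clean_tree() {
  git diff --quiet HEAD
}

# Usage:
#   require_clean_tree && docker build -t "$(image_ref 10.100.100.6 myorg myapp)" .
```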

9. Build and push to the lab registry

The lab registry at 10.100.100.6 (a registry:2 server) uses htpasswd auth, so you log in first:

# 1. Authenticate (enter your registry username/password when prompted)
docker login 10.100.100.6

# 2. Build with a sha-based tag
docker build -t 10.100.100.6/myorg/myapp:sha-abc1234 .

# 3. Push
docker push 10.100.100.6/myorg/myapp:sha-abc1234

# 4. Verify by pulling it back (optionally on another lab host)
docker pull 10.100.100.6/myorg/myapp:sha-abc1234

That pushed image is exactly what a compose.yaml (Chapter 4) or a Kubernetes cluster later pulls. You have now closed the loop: source -> image -> registry -> run anywhere.
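
The login, build, and push steps can be chained so that a failure at any step stops the rest. The sketch below is one possible wrapper, not an official lab script: run, publish, and the DRY_RUN variable are names chosen for this example.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Log in, build, and push one image reference, stopping at the first failure.
publish() {
  ref="$1"                 # e.g. 10.100.100.6/myorg/myapp:sha-abc1234
  registry="${ref%%/*}"    # everything before the first slash
  run docker login "$registry" &&
  run docker build -t "$ref" . &&
  run docker push "$ref"
}

# Usage:
#   DRY_RUN=1 publish 10.100.100.6/myorg/myapp:sha-abc1234   # preview the commands
#   publish 10.100.100.6/myorg/myapp:sha-abc1234             # actually run them
```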

Dig deeper

Search terms

  • dockerfile best practices small secure image
  • docker multi-stage build explained
  • dockerfile layer caching order instructions
  • dockerfile non-root user USER instruction
  • docker build tag push private registry
  • dockerfile healthcheck example

Check yourself

  1. Why copy package.json and run the install step before copying the rest of your source code?
  2. What does a multi-stage build give you, and what gets copied between stages?
  3. Name three of the lab's house image conventions and the risk each one reduces.
  4. Why prefer a sha-based tag over latest?
  5. Write the three commands to log in to the lab registry, build myorg/myapp with the tag sha-abc1234, and push it.