Lesson: Dockerfiles & Best Practices

What you'll learn

Write a Dockerfile using the core instructions (FROM, WORKDIR, COPY, RUN, ENV, EXPOSE, USER, CMD).
Order instructions to exploit layer caching for fast rebuilds.
Apply the lab's house conventions: multi-stage builds, a pinned base image, a non-root user, a .dockerignore, a HEALTHCHECK, and sha-based tags.
Build an image and push it to the lab's private registry at 10.100.100.6.

By the end you can turn source code into a small, secure, reproducible image and publish it where the team can pull it — the capstone build skill of this module.

The lesson

1. What a Dockerfile is

A Dockerfile is a plain-text recipe. The build engine executes its instructions top to bottom. Instructions that change the filesystem (RUN, COPY, ADD) each add one layer to the image (recall layers from Chapter 1); others, such as ENV, EXPOSE, and CMD, only record metadata. Build it with docker build and you get a reusable image.

A first, naive Dockerfile for a Node.js app:

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

It works — but it is large, runs as root, and rebuilds slowly. We will fix all three.

2. The core instructions

FROM — the base image you build on top of.
WORKDIR — sets (and creates) the working directory for later instructions.
COPY src dest — copies files from your build context into the image.
RUN — executes a command at build time (e.g. install dependencies), baking the result into a layer.
ENV — sets an environment variable.
EXPOSE — documents which port the app listens on (informational; you still publish with -p).
USER — sets which user later instructions and the final container run as.
CMD — the default command run when a container starts (runtime, not build time).

3. Layer caching — order matters

The engine caches each layer. On rebuild, it reuses cached layers until it hits the first instruction whose inputs changed; that instruction and everything after it are rebuilt. So put things that change rarely (dependency installs) before things that change often (your source code).

Bad: COPY . . then npm install — any code edit busts the cache and reinstalls every dependency.

Good: copy only the dependency manifest, install, then copy the code:

COPY package.json package-lock.json ./
RUN npm ci            # cached unless package files change
COPY . .              # changes often, but install layer stays cached

Now editing server.js reuses the (slow) install layer and rebuilds in seconds.
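
One way to see why this ordering works is to model the cache check in plain shell. This is only an illustration, not how the build engine is implemented: manifest_key is a hypothetical helper that checksums just the two files named in the first COPY, and the file contents are made up.

```shell
# Hypothetical helper: a stand-in for the cache key of
# "COPY package.json package-lock.json ./". It depends only on those two files.
manifest_key() {
  cat "$1/package.json" "$1/package-lock.json" | cksum
}

# Demo in a throwaway directory (file contents are invented for illustration).
demo=$(mktemp -d)
printf '{"name":"myapp"}\n' > "$demo/package.json"
printf '{"lockfileVersion":3}\n' > "$demo/package-lock.json"
printf 'console.log("v1")\n' > "$demo/server.js"

before=$(manifest_key "$demo")

# Edit source code only: the key for the manifest layer is unchanged.
printf 'console.log("v2")\n' > "$demo/server.js"
after_code=$(manifest_key "$demo")

# Edit the manifest: the key changes, so the install step would rerun.
printf '{"name":"myapp","version":"2"}\n' > "$demo/package.json"
after_manifest=$(manifest_key "$demo")

[ "$before" = "$after_code" ] && echo "code edit: install layer reused"
[ "$before" != "$after_manifest" ] && echo "manifest edit: install layer rebuilt"
rm -rf "$demo"
```

The same reasoning applies to any stack: copy the file that lists dependencies first, install, and only then copy the rest.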

4. Multi-stage builds (house convention)

Build tools (compilers, dev dependencies) are needed to build but are dead weight — and extra attack surface — in the final image. A multi-stage build uses one stage to build and a clean, slim stage to run, copying only the finished artifact across.

# ---- build stage ----
FROM node:20-bookworm-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build          # produces /app/dist

# ---- runtime stage ----
FROM node:20-bookworm-slim AS runtime
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev      # production deps only; cached unless package files change
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]

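The final image contains only the built output and the production dependencies; dev dependencies and the original source tree stay behind in the discarded build stage.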
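
To check the saving on your own project, you can build each stage under its own tag and compare sizes. The following is a rough sketch, assuming the two-stage Dockerfile above sits in the directory you pass in; the helper names and the tags myapp:build-stage and myapp:runtime are hypothetical.

```shell
# Hypothetical helper: print an image's size in whole MiB.
# "docker image inspect" reports .Size in bytes.
image_size_mb() {
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
  echo $(( bytes / 1024 / 1024 ))
}

# Hypothetical helper: build both stages of the Dockerfile in directory $1
# under illustrative tags, then print their sizes.
compare_stages() {
  docker build --target build   -t myapp:build-stage "$1" &&
  docker build --target runtime -t myapp:runtime     "$1" &&
  echo "build stage: $(image_size_mb myapp:build-stage) MiB" &&
  echo "runtime stage: $(image_size_mb myapp:runtime) MiB"
}
```

Call it as compare_stages . from the project root. The --target flag stops the build at the named stage, which is how the build stage gets a tag of its own.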

5. Pinned base image + non-root user (house conventions)

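Pin the base image so builds are reproducible. At minimum pin a specific tag (node:20-bookworm-slim) rather than a floating one such as node:20. A tag can still be re-pointed at a newer image when the maintainers publish an update, so best practice is to pin by digest, which fixes the exact bytes: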

FROM node:20-bookworm-slim@sha256:<REDACTED-digest>
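
If you do not yet know the digest, you can read it from a freshly pulled image. This is a small sketch; base_digest is a hypothetical helper, and no output is shown because the digest depends on what the tag points at when you pull.

```shell
# Hypothetical helper: pull a tag and print the repo digest it resolved to,
# in the form name@sha256:...
base_digest() {
  docker pull --quiet "$1" >/dev/null || return 1
  docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}
```

Running base_digest node:20-bookworm-slim prints a name@sha256 reference. Copy the sha256 part onto the end of the tagged FROM line so the tag stays readable while the digest does the pinning.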

Run as a non-root user. By default containers run as root, so an attacker who compromises the app gets root inside the container, which makes tampering with the image's files and escaping to the host easier. Create and switch to an unprivileged user:

RUN useradd --system --uid 10001 --no-create-home appuser
USER appuser

Everything after USER appuser runs unprivileged — a core lab requirement.
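
A quick way to confirm the requirement is met is to ask a built image which UID it starts as. This is a sketch only: runs_as_non_root is a hypothetical helper, and it assumes the image contains the id command (Debian-based images such as the one used here do).

```shell
# Hypothetical helper: succeed only if the image's default user is not root.
# The entrypoint is overridden so "id -u" runs instead of the app.
runs_as_non_root() {
  uid=$(docker run --rm --entrypoint id "$1" -u) || return 1
  [ "$uid" != "0" ]
}
```

For an image built with the two lines above, docker run --rm --entrypoint id IMAGE -u should print 10001, and the helper returns success.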

6. .dockerignore and HEALTHCHECK (house conventions)

A .dockerignore file in the root of the build context (usually next to the Dockerfile) keeps junk and secrets out of the build context, making builds faster and images cleaner. Example:

.git
node_modules
*.log
.env
Dockerfile
README.md
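
Patterns are matched relative to the root of the build context, and * does not cross a directory boundary, so *.log above covers only logs at the top level. The sketch below models just these six patterns to make that visible. is_ignored is a hypothetical helper, and it deliberately leaves out the ** wildcard and ! exceptions that real .dockerignore files support.

```shell
# Hypothetical, simplified model of the six example patterns.
# $1 = a path relative to the build context root.
is_ignored() {
  case "$1" in
    .git|.git/*|node_modules|node_modules/*) return 0 ;;  # directory and its contents
    .env|Dockerfile|README.md) return 0 ;;                # exact root-level files
    */*) return 1 ;;    # nested path: the root-level pattern below does not apply
    *.log) return 0 ;;  # root-level log files only
  esac
  return 1
}

for path in .env app.log src/debug.log server.js; do
  if is_ignored "$path"; then echo "excluded: $path"; else echo "sent:     $path"; fi
done
```

Note that src/debug.log is still sent. To exclude logs at every depth, the usual pattern is **/*.log.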

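A HEALTHCHECK tells the engine how to check that the app is actually responding, not just that its process is running. Docker Compose uses the result (Chapter 4's service_healthy); Kubernetes ignores it and relies on its own probes. The slim base image does not ship curl, so this check uses Node's built-in fetch (available since Node 18):

HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"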
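
Once a container built from such an image is running, the engine records the result of each check, and you can read it back with docker inspect. The sketch below polls that status. wait_healthy is a hypothetical helper, and the one-second delay and default of 30 attempts are arbitrary choices.

```shell
# Hypothetical helper: poll a container's health status until it is "healthy".
# $1 = container name or ID, $2 = max attempts (default 30), one second apart.
wait_healthy() {
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$1") || return 1
    case "$status" in
      healthy)   return 0 ;;
      unhealthy) return 1 ;;
    esac
    attempts=$(( attempts - 1 ))
    sleep 1
  done
  return 1   # still "starting" after all attempts
}
```

A container reports starting until the first check succeeds, then healthy, and it flips to unhealthy after the configured number of consecutive failures (--retries).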

7. The full house-style Dockerfile

Putting the conventions together:

# ---- build ----
FROM node:20-bookworm-slim@sha256:<REDACTED-digest> AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# ---- runtime ----
FROM node:20-bookworm-slim@sha256:<REDACTED-digest> AS runtime
WORKDIR /app
ENV NODE_ENV=production
RUN useradd --system --uid 10001 --no-create-home appuser
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
USER appuser
EXPOSE 3000
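# The slim image has no curl, so probe with Node's built-in fetch.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"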
CMD ["node", "dist/server.js"]

   source + Dockerfile  --docker build-->  image  --docker push-->  10.100.100.6
        (your code)                     (layers)                 (private registry)

8. Tagging with sha-based tags (house convention)

A tag names a build. The lab convention is sha-based tags — tag the image with the git commit SHA so every image traces back to exact source. Avoid latest (it silently changes and breaks reproducibility).

GIT_SHA=$(git rev-parse --short HEAD)   # e.g. abc1234
docker build -t 10.100.100.6/myorg/myapp:sha-$GIT_SHA .

The full image name is REGISTRY/namespace/name:tag -> 10.100.100.6/myorg/myapp:sha-abc1234.
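
The naming rule can be captured in a small function so that scripts never assemble the reference by hand. This is one possible sketch; image_ref is a hypothetical helper, and the check that the tag looks like a commit hash is an added assumption rather than a stated lab rule.

```shell
# Hypothetical helper: assemble REGISTRY/namespace/name:sha-<commit>.
image_ref() {
  registry=$1 namespace=$2 name=$3 sha=$4
  # Refuse anything that does not look like an abbreviated or full commit hash.
  case "$sha" in
    ''|*[!0-9a-f]*) echo "not a commit SHA: $sha" >&2; return 1 ;;
  esac
  echo "$registry/$namespace/$name:sha-$sha"
}

image_ref 10.100.100.6 myorg myapp abc1234
# prints 10.100.100.6/myorg/myapp:sha-abc1234
```

Passing latest instead of a hash makes the function fail, which keeps a floating tag from slipping into a build script by accident.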

9. Build and push to the lab registry

The lab registry at 10.100.100.6 (a registry:2 server) uses htpasswd auth, so you log in first:

# 1. Authenticate (enter your registry username/password when prompted)
docker login 10.100.100.6

# 2. Build with a sha-based tag
docker build -t 10.100.100.6/myorg/myapp:sha-abc1234 .

# 3. Push
docker push 10.100.100.6/myorg/myapp:sha-abc1234

# 4. Verify by pulling it back (optionally on another lab host)
docker pull 10.100.100.6/myorg/myapp:sha-abc1234

That pushed image is exactly what a compose.yaml (Chapter 4) or a Kubernetes cluster later pulls. You have now closed the loop: source -> image -> registry -> run anywhere.
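
Steps 2 and 3 are often wrapped in a script so that a failed build can never be pushed. The sketch below shows one way to do that. publish is a hypothetical helper, and it assumes you have already run docker login 10.100.100.6 in the same session.

```shell
# Hypothetical helper: build and push one image reference, stopping at the
# first failure so a broken build is never pushed.
# $1 = full image reference, $2 = build context (default: current directory).
publish() {
  ref=$1 context=${2:-.}
  docker build -t "$ref" "$context" || { echo "build failed" >&2; return 1; }
  docker push "$ref"                || { echo "push failed" >&2; return 1; }
  echo "published $ref"
}
```

For example, publish 10.100.100.6/myorg/myapp:sha-abc1234 builds from the current directory and pushes only if the build succeeded.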

Dig deeper

Search terms

dockerfile best practices small secure image
docker multi-stage build explained
dockerfile layer caching order instructions
dockerfile non-root user USER instruction
docker build tag push private registry
dockerfile healthcheck example

Check yourself

Why copy package.json and run the install step before copying the rest of your source code?
What does a multi-stage build give you, and what gets copied between stages?
Name three of the lab's house image conventions and the risk each one reduces.
Why prefer a sha-based tag over latest?
Write the three commands to log in to the lab registry, build myorg/myapp with the tag sha-abc1234, and push it.