Self-hosted Private Registry

A private container registry with a web UI, authentication, and a weekly garbage-collection routine that uses a read-only maintenance window.

Why run your own registry

The CI runner builds images; they have to live somewhere the cluster can pull them from. You could push to a public registry, but a self-hosted one keeps the whole artifact flow inside the lab. It is also worth understanding in its own right: a registry is conceptually simple, but operating one has a couple of sharp edges worth meeting.

This lab runs the CNCF Distribution registry (the registry:2 image) on Registry-Server (10.100.100.6), with a web UI alongside it.

build runner --push--> registry.example.com --pull--> Kubernetes
                          (10.100.100.6)
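
To make the flow above concrete, here is a minimal sketch of both ends. The function names, the REGISTRY_USER / REGISTRY_PASS variables, and the "regcred" secret name are assumptions made for illustration; only the registry hostname comes from this lab.

```shell
# Sketch of the push/pull flow. Helper names and variables are illustrative.
REGISTRY_HOST="${REGISTRY_HOST:-registry.example.com}"

# Build the fully qualified reference. The host prefix is what routes a
# push or pull to the private registry instead of a public one.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY_HOST" "$1" "$2"
}

# On the build runner: log in, tag the locally built image, push it.
# Expects REGISTRY_USER and REGISTRY_PASS in the environment (assumed names).
push_image() {
  ref=$(image_ref "$1" "$2") || return 1
  printf '%s' "$REGISTRY_PASS" |
    docker login --username "$REGISTRY_USER" --password-stdin "$REGISTRY_HOST" &&
    docker tag "$1:$2" "$ref" &&
    docker push "$ref"
}

# On the cluster side: store the same credentials as a pull secret that
# pods can reference through imagePullSecrets.
create_pull_secret() {
  kubectl create secret docker-registry "${1:-regcred}" \
    --docker-server="$REGISTRY_HOST" \
    --docker-username="$REGISTRY_USER" \
    --docker-password="$REGISTRY_PASS"
}
```

With this in place, something like `push_image myapp 1.2.3` on the runner would publish registry.example.com/myapp:1.2.3, and a pod spec would name that same reference as its image.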

Why we use this: keeping images in-house closes the supply-chain loop — every artifact the cluster runs was built by your runner and stored on your registry, with nothing fetched from a third party at deploy time. For a platform you want to fully explain, that self-containment is worth the small operational cost.

The stack

Two containers, run with Docker Compose, plus a sprinkle of auth:

Registry-Server (10.100.100.6)
  registry        (registry:2)         :5000  - the registry API
  registry-ui     (joxit UI)           :8080  - a browser front-end
  auth: htpasswd  (user <REDACTED>)            - basic auth on pushes/pulls
  fronted by HAProxy: registry.example.com (API) + docker.example.com (UI)
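
As a rough sketch of how such a stack could be assembled, the helper below writes an illustrative Compose file and adds an htpasswd user. Everything here is an assumption rather than this lab's actual configuration: the directory layout, the 8080:80 port mapping, the UI image tag, and the UI environment variables (taken from the Joxit project's documentation and worth checking against the version you run). HAProxy is left out.

```shell
# Write an illustrative docker-compose.yml into the given directory
# (default: current directory). All values are assumptions for this sketch.
write_compose() {
  dir="${1:-.}"
  mkdir -p "$dir/auth" "$dir/data" || return 1
  cat > "$dir/docker-compose.yml" <<'YAML'
services:
  registry:
    image: registry:2
    ports:
      - "5000:5000"
    environment:
      REGISTRY_AUTH: htpasswd
      REGISTRY_AUTH_HTPASSWD_REALM: Registry
      REGISTRY_AUTH_HTPASSWD_PATH: /auth/htpasswd
      # Needed if tags are ever to be deleted through the API or the UI.
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
    volumes:
      - ./auth:/auth:ro
      - ./data:/var/lib/registry
  registry-ui:
    image: joxit/docker-registry-ui:main
    ports:
      - "8080:80"
    environment:
      SINGLE_REGISTRY: "true"
      NGINX_PROXY_PASS_URL: http://registry:5000
YAML
}

# Append a user to the htpasswd file. The registry only accepts bcrypt
# hashes, hence -B; -b takes the password as an argument, -n prints the entry.
add_registry_user() {
  htpasswd -Bbn "$1" "$2" >> "${3:-./auth/htpasswd}"
}
```

Passing the password on the command line with -b is convenient for a lab but leaves it in shell history; prompting interactively is the safer habit.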

A small but important detail: clients push to the public TLS name (registry.example.com), where the Docker client sees a valid certificate, rather than to a plain-HTTP internal address, which would require adding an "insecure-registries" entry to the Docker daemon configuration on every client. One more payoff of terminating TLS at the edge.
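
A quick way to check that the public name behaves as expected is to probe the API root, /v2/, which a registry with basic auth answers with 401 until credentials are supplied. The two helpers below are a sketch with made-up names; curl verifies the certificate by default, so a TLS problem shows up as status 000.

```shell
# Probe the API root and print only the HTTP status code.
# Extra arguments are passed through to curl, e.g. -u "user:password".
registry_status() {
  host="$1"; shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "https://$host/v2/"
}

# Translate the status code into a short diagnosis.
describe_status() {
  case "$1" in
    200) echo "reachable and authenticated" ;;
    401) echo "reachable, credentials missing or rejected" ;;
    000) echo "no HTTP response (DNS, TLS, or connection failure)" ;;
    *)   echo "unexpected status $1" ;;
  esac
}
```

For example, `describe_status "$(registry_status registry.example.com)"` run without credentials should report the 401 case if TLS and auth are both working.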

Why we use this: a registry is "just" an HTTP API in front of a blob store — but auth, TLS, and a UI are what make it usable and safe. Assembling those from small pieces is more instructive than any managed registry.

Diagram: push, pull, and a safe weekly garbage-collection sweep

Garbage collection without races

Registries accumulate cruft. Delete an image tag and the underlying layer blobs don't vanish; they linger until a garbage-collection sweep removes anything no longer referenced by a manifest. Run that sweep while the registry is accepting writes, though, and it can race an in-flight push: the collector may delete layers that the image being uploaded is about to reference, leaving a corrupted image.
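
To see where the cruft comes from, here is a sketch of deleting a tag through the registry HTTP API: look up the manifest digest, then delete by digest. It assumes deletion is enabled on the registry (storage.delete.enabled), that REGISTRY_USER and REGISTRY_PASS hold valid credentials (assumed variable names), and that the image is a single-platform one. The blobs stay on disk afterwards until a garbage-collection run.

```shell
REGISTRY_HOST="${REGISTRY_HOST:-registry.example.com}"

# Extract the manifest digest from raw HTTP response headers on stdin.
# Header names are matched case-insensitively; carriage returns are stripped.
digest_from_headers() {
  tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# Delete the manifest a tag points to. Layer blobs are NOT removed here.
delete_tag() {
  repo="$1"; tag="$2"
  digest=$(curl -fsS -I -u "$REGISTRY_USER:$REGISTRY_PASS" \
      -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
      -H 'Accept: application/vnd.oci.image.manifest.v1+json' \
      "https://$REGISTRY_HOST/v2/$repo/manifests/$tag" | digest_from_headers)
  [ -n "$digest" ] || return 1
  curl -fsS -X DELETE -u "$REGISTRY_USER:$REGISTRY_PASS" \
      "https://$REGISTRY_HOST/v2/$repo/manifests/$digest"
}
```

After `delete_tag myapp old-build`, the tag disappears from listings, but disk usage on the registry host does not drop until the collector has run.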

The lab does it safely on a weekly schedule:

1. flip the registry to READ-ONLY (pushes blocked, pulls still work)
2. run garbage-collect (--delete-untagged)
3. prune now-empty repository directories
4. ALWAYS restore read-write   (even if the sweep failed)
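
The four steps above can be sketched as a small script. This is not the lab's actual routine: it assumes a hypothetical layout in which /opt/registry holds the Compose file plus two variants of the registry config (the read-only one sets storage.maintenance.readonly.enabled to true), and it leaves step 3 as a placeholder hook. The part that matters is the shape: the EXIT trap is installed before anything else happens.

```shell
# Hypothetical layout: config.readonly.yml and config.readwrite.yml sit
# next to the Compose file; the active one is copied over config.yml.
REGISTRY_DIR="${REGISTRY_DIR:-/opt/registry}"

set_readonly() {
  cp "$REGISTRY_DIR/config.readonly.yml" "$REGISTRY_DIR/config.yml" &&
    (cd "$REGISTRY_DIR" && docker compose restart registry)
}

set_readwrite() {
  cp "$REGISTRY_DIR/config.readwrite.yml" "$REGISTRY_DIR/config.yml" &&
    (cd "$REGISTRY_DIR" && docker compose restart registry)
}

run_gc() {
  (cd "$REGISTRY_DIR" && docker compose exec -T registry \
      registry garbage-collect --delete-untagged /etc/docker/registry/config.yml)
}

# Placeholder for step 3; replace with a real pruning routine.
prune_empty_repos() { :; }

# The body runs in a subshell, so the EXIT trap fires when that subshell
# ends, whether it finished normally or bailed out on a failed step.
weekly_gc() (
  trap 'set_readwrite' EXIT
  set_readonly      || exit 1
  run_gc            || exit 1
  prune_empty_repos || exit 1
)
```

Calling `weekly_gc` from a weekly cron job or systemd timer would give the schedule; a non-zero exit status tells the scheduler that the sweep failed, while the registry is back in read-write mode either way.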

The read-only window is the safety: with writes blocked, nothing can race the collector. A shell trap guarantees the registry is flipped back to read-write on exit no matter what — so a failed GC can never leave the registry stuck read-only. (A nice extra: the stock garbage-collect removes unused blobs but leaves empty repository names behind in the catalog; the routine prunes those empty directories too, so the listing stays clean.)
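
The pruning step might look like the sketch below. It assumes the default filesystem storage layout of the registry:2 image and treats a repository as empty when its _manifests/tags directory has no entries; that definition is an assumption, not necessarily the one this lab uses. It should only run inside the read-only window, and it does not clean up namespace directories left empty after a nested repository is removed.

```shell
# Default on-disk location inside the registry:2 container; adjust to the
# host path of the mounted volume if the script runs on the host.
REPO_ROOT="${REPO_ROOT:-/var/lib/registry/docker/registry/v2/repositories}"

prune_empty_repos() {
  root="${1:-$REPO_ROOT}"
  # Collect candidates first so nothing is removed while find is still
  # walking the tree. -prune stops find from descending into _manifests.
  manifests=$(find "$root" -type d -name _manifests -prune) || return 1
  [ -n "$manifests" ] || return 0
  printf '%s\n' "$manifests" | while IFS= read -r m; do
    repo=$(dirname "$m")
    # No entries under _manifests/tags means no tags are left.
    if [ -z "$(ls -A "$m/tags" 2>/dev/null)" ]; then
      name=${repo#"$root"/}
      rm -rf "$repo" && echo "pruned $name"
    fi
  done
}
```

Running it against a copy of the storage directory first is a cheap way to confirm that it only picks repositories you expect to disappear from the catalog.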

Lesson learned: any cleanup job that runs against live data needs a safety story. Here it's "make it read-only first, and guarantee you undo that even on failure." The trap … EXIT that always restores read-write is the difference between a maintenance job and an outage waiting to happen.

Lessons on the registry