Garbage collection without races
Registries accumulate cruft. Delete an image tag and the underlying data blobs don't vanish — they linger until a garbage collection sweep removes anything no longer referenced. Run that sweep carelessly, though, and you can race a push that's happening at the same moment and corrupt data.
The lab does it safely on a weekly schedule:
1. flip the registry to READ-ONLY (pushes blocked, pulls still work)
2. run garbage-collect (--delete-untagged)
3. prune now-empty repository directories
4. ALWAYS restore read-write (even if the sweep failed)
The read-only window is the safety mechanism: with writes blocked, nothing can race the collector. A shell trap guarantees the registry is flipped back to read-write when the script exits, so a failed GC can never leave the registry stuck read-only. (A nice extra: the stock garbage-collect removes unused blobs but leaves empty repository names behind in the catalog; the routine prunes those empty directories too, so the listing stays clean.)
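The four steps and the trap could be sketched roughly as follows. This is a minimal illustration, not the lab's actual script: set_mode, run_gc and prune_empty_repos, along with the STATE_FILE, GC_CMD and REPO_ROOT variables, are hypothetical stand-ins to be replaced with whatever actually flips your registry's mode and invokes its garbage collector.

```shell
#!/bin/sh
# Sketch only. Every function body here is a hypothetical placeholder.

# Stand-in for the registry's real read-only switch.
STATE_FILE="${STATE_FILE:-/tmp/registry-mode}"

set_mode() {
    # Hypothetical: record the mode; a real job would reconfigure the registry.
    echo "$1" > "$STATE_FILE"
}

run_gc() {
    # Hypothetical: GC_CMD stands in for the real garbage-collect invocation
    # (run with --delete-untagged in the routine described above).
    ${GC_CMD:-true}
}

prune_empty_repos() {
    # Hypothetical: REPO_ROOT is the directory holding repository folders.
    # -mindepth, -empty and -delete are GNU/BSD find extensions.
    [ -n "${REPO_ROOT:-}" ] || return 0
    find "$REPO_ROOT" -mindepth 1 -type d -empty -delete
}

# The body is a subshell, so the EXIT trap fires when the routine ends,
# whether it finishes normally or bails out early.
weekly_gc() (
    trap 'set_mode read-write' EXIT   # step 4: always restore read-write
    set_mode read-only || exit 1      # step 1: block pushes
    run_gc || exit 1                  # step 2: sweep unreferenced blobs
    prune_empty_repos                 # step 3: tidy empty repository dirs
)
```

Running weekly_gc with GC_CMD=false simulates a failed sweep: the function returns non-zero, yet the recorded mode ends up as read-write. The steps are chained with explicit || exit rather than set -e because set -e is ignored when a function is called inside an if or on the left of ||, which is exactly where a caller would check for failure.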
Lesson learned: any cleanup job that runs against live data needs a safety story. Here it's "make it read-only first, and guarantee you undo that even on failure." The trap … EXIT that always restores read-write is the difference between a maintenance job and an outage waiting to happen.