Seeing what the cluster is doing: logs to Loki
The lab already had a central log store (Loki — see its book in Core Infrastructure), with a Promtail agent on every VM shipping system logs. When the cluster arrived, the job was to feed its logs in too.
On the Kubernetes nodes the node-level Promtail agent was extended to ship:
- Pod logs: tailing /var/log/pods/* and parsing the path so each line is labelled with its namespace, pod, and container. That's the big one: every pod in the cluster, searchable in one place.
- kubelet and containerd journald units: the node's own story.
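To make the path-parsing idea concrete, here is a minimal Python sketch of the label extraction. It is not the Promtail configuration itself (Promtail does this in its scrape and pipeline config); it only illustrates the logic, assuming the usual kubelet layout /var/log/pods/<namespace>_<pod>_<uid>/<container>/<restart-count>.log. The function name and the example path are hypothetical.

```python
import re

# Assumed kubelet layout:
#   /var/log/pods/<namespace>_<pod>_<uid>/<container>/<restart-count>.log
# Namespace and pod names cannot contain underscores, so "_" is a safe separator.
POD_LOG_PATH = re.compile(
    r"^/var/log/pods/"
    r"(?P<namespace>[^_/]+)_(?P<pod>[^_/]+)_(?P<uid>[^/]+)/"
    r"(?P<container>[^/]+)/\d+\.log$"
)


def labels_from_path(path):
    """Return the namespace/pod/container labels for a pod log path, or None."""
    match = POD_LOG_PATH.match(path)
    if match is None:
        return None
    # The pod UID is parsed but deliberately not kept as a label.
    return {key: match.group(key) for key in ("namespace", "pod", "container")}


# Hypothetical path, for illustration only.
example = "/var/log/pods/monitoring_loki-0_0f1e2d3c-aaaa-bbbb-cccc-1234567890ab/loki/0.log"
print(labels_from_path(example))
# {'namespace': 'monitoring', 'pod': 'loki-0', 'container': 'loki'}
```

Dropping the UID is a deliberate choice in this sketch: it changes every time a pod is recreated, so using it as a label would multiply the number of streams without making the logs easier to search.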
And the three database VMs ship their error and slow-query logs, so the slow-query logging turned on during tuning actually lands somewhere you can search it.
k8s nodes       ----\
GIT-Runner      ----\
PostgreSQL      -----+
MariaDB         -----+--> Loki (10.100.100.5) --> Grafana (search & dashboards)
MySQL           -----+
everything else     /
One wrinkle worth knowing: a slow-query entry is multi-line — a little block of timing metadata followed by the SQL. Promtail's default is one-line-per-entry, which would shred each slow query into meaningless fragments. So the database log jobs use a multiline stage that re-assembles each query into a single searchable entry. (And each multi-line source gets its own job — never globbed together with single-line logs like the error file, or the multiline rule would swallow those too.)
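The re-assembly rule can be sketched in a few lines of Python. This is an illustration of the idea, not the lab's Promtail config: Promtail's multiline stage works from a "first line" pattern, and every line that does not match it is appended to the entry in progress. The pattern below assumes each slow-query entry starts with a "# Time:" line, as in MySQL-style slow logs; the sample lines are invented and the function name is hypothetical.

```python
import re

# Assumption: every slow-query entry begins with a "# Time:" header line.
FIRSTLINE = re.compile(r"^# Time: ")


def group_entries(lines, firstline=FIRSTLINE):
    """Merge raw log lines into multi-line entries, split at each first-line match."""
    entries = []
    current = []
    for line in lines:
        # A new header closes the entry in progress (if there is one).
        if firstline.match(line) and current:
            entries.append("\n".join(current))
            current = []
        current.append(line)
    # Flush the final entry; lines seen before any header form their own entry.
    if current:
        entries.append("\n".join(current))
    return entries


# Invented sample in the general shape of a slow-query log.
sample = [
    "# Time: 2024-01-01T10:00:00.000000Z",
    "# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 100000",
    "SELECT * FROM orders WHERE status = 'open';",
    "# Time: 2024-01-01T10:00:05.000000Z",
    "# Query_time: 4.100000  Lock_time: 0.000200 Rows_sent: 10  Rows_examined: 900000",
    "SELECT COUNT(*) FROM events;",
]

for entry in group_entries(sample):
    print(entry)
    print("---")
```

The same sketch shows why single-line logs need their own job: run through this rule, an error log with no "# Time:" lines would never hit a boundary and would collapse into one giant entry.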
Why we use this: centralised logs turn "SSH into five boxes and grep" into one query. The moment you have more than two machines, shipping logs to one searchable place stops being nice-to-have. Labelling pod logs by namespace/pod/container, and re-assembling multi-line database logs, is what makes them actually usable instead of just present.