Assignment 1: CAPSTONE — Ship a full stack end to end
Goal: Prove you can take a real application all the way through the platform you've built across the whole internship — containerize it (Module 6), push it through a CI/CD pipeline (Module 8), deploy it to the live Kubernetes cluster (Module 7), and observe it in Grafana with logs in Loki, a dashboard, and a working alert (Module 11). This is the program capstone: it re-exercises every earlier module in one connected flow.
Where: The lab. Source/CI in Gitea + the Git-Runner (10.100.100.11); image in the private registry (Registry-Server, 10.100.100.6); runtime on the live k8s cluster (master 10.100.100.7, workers .8/.9/.10); observability in Grafana (10.100.100.4) with logs in Loki (10.100.100.5). Reach everything through the Jumpbox bastion.
Tasks
- Pick a small app. Any HTTP app that logs to stdout and exposes a `/metrics` endpoint (or add one). A tiny web API in any language is perfect. Keep it simple: the point is the pipeline, not the app.
- Containerize it (Module 6). Write a `Dockerfile` following house conventions: multi-stage build, base image pinned by digest, a non-root runtime user, a `HEALTHCHECK`, OCI labels, and a matching `.dockerignore`. Build it locally and confirm it runs.
- Push to the private registry (Module 6). Tag the image for the lab registry with a sha-based tag and push it. Confirm you can pull it back.
- Write the Kubernetes manifests (Module 7). A `Deployment` (with resource requests/limits, readiness + liveness probes hitting your health endpoint), a `Service`, and an `Ingress` (or Kong route). Reference the image by its registry path + sha tag. Use a Helm chart if you prefer.
- Wire up CI/CD (Module 8). Create a Gitea Actions workflow that runs on push: build the image, tag it with the commit sha, push it to the registry, then deploy/upgrade the app on the cluster (e.g. `kubectl apply` / `helm upgrade`) using the runner. A commit should result in a new version running, with no manual steps.
- Make it emit telemetry (Module 11). Ensure the app logs structured lines to stdout and exposes Prometheus metrics. Confirm pod logs are flowing into Loki at 10.100.100.5 (Promtail/Alloy already tails the cluster) by querying them in Grafana Explore.
- Build a dashboard (Module 11). In Grafana (10.100.100.4), create a dashboard for your app showing at least the RED signals — request Rate, Error rate, and Duration/latency — plus a logs panel from Loki.
- Create an alert (Module 11). Add a Grafana alert rule that fires when your app's error rate (or pod-down condition) crosses a threshold for a sustained period, routed to a contact point.
- Prove the loop end to end. Make a code change that intentionally produces errors, push it, watch CI build+deploy the new sha, see the error rate climb on your dashboard, and see the alert fire — then fix it and watch it recover.
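For the first and sixth tasks, a candidate app can be very small. The following is a minimal sketch using only the Python standard library, not a required implementation. The port `8080`, the `/healthz` path, the `path` label, the JSON log field names, and the `serve` helper are all assumptions chosen for illustration; the metric name `http_requests_total` with a `status` label is chosen to line up with the query in the Hints.

```python
import json
import time
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Paths that get their own metric label; everything else is bucketed as
# "other" so arbitrary URLs cannot create unbounded label values.
KNOWN_PATHS = {"/", "/boom"}

# In-memory request counter keyed by (path label, status). Resets on restart.
REQUESTS = Counter()


def render_metrics(counts):
    """Render the counter in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (path, status), value in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{path="{path}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


def route(path):
    """Map a request path to (status, body). /boom fails on purpose."""
    if path == "/healthz":  # assumed health endpoint for probes
        return 200, "ok\n"
    if path == "/metrics":
        return 200, render_metrics(REQUESTS)
    if path == "/boom":
        return 500, "boom\n"
    if path == "/":
        return 200, "hello\n"
    return 404, "not found\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        start = time.monotonic()
        path = self.path.split("?", 1)[0]  # drop any query string
        status, body = route(path)
        # Skip scrapes and probes so they do not skew the request counts.
        if path not in ("/metrics", "/healthz"):
            label = path if path in KNOWN_PATHS else "other"
            REQUESTS[(label, str(status))] += 1
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
        # One structured (JSON) log line per request, written to stdout.
        print(json.dumps({
            "level": "error" if status >= 500 else "info",
            "method": "GET",
            "path": path,
            "status": status,
            "duration_ms": round((time.monotonic() - start) * 1000, 3),
        }), flush=True)

    def log_message(self, format, *args):
        """Suppress the default unstructured access log on stderr."""


def serve(port=8080):
    """Start the server; call this from the container entrypoint."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `serve()` at the bottom of the file starts the app. The sketch exposes only a request counter, which covers Rate and Errors; the Duration signal would need a latency histogram, which is easier to get from a Prometheus client library than to hand-roll.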
Deliverable
A short write-up (markdown) plus the artifacts: the Git repo URL (Dockerfile, manifests/Helm chart, CI workflow), the registry image path + sha tag, the running app's URL, a link/screenshot of your Grafana dashboard, the alert rule definition, and a screenshot of a LogQL query showing your app's logs in Loki. Include a one-paragraph "what I'd improve next" reflection.
Acceptance criteria — you're done when:
- An app with stdout logging and a `/metrics` endpoint is committed to a Gitea repo.
- A house-convention `Dockerfile` (multi-stage, pinned digest, non-root, `HEALTHCHECK`, OCI labels) and `.dockerignore` exist, and the image builds.
- The image is pushed to the private registry (10.100.100.6) with a sha-based tag and can be pulled back.
- Kubernetes manifests/Helm chart exist with resource limits and readiness + liveness probes referencing the health endpoint.
- A Gitea Actions workflow builds, tags by commit sha, pushes, and deploys/upgrades on the cluster automatically on push — no manual steps.
- The app is running on the live k8s cluster and reachable at its URL (via Ingress/Kong).
- The app's pod logs are queryable in Grafana Explore via LogQL from Loki (10.100.100.5), e.g. `{namespace="<your-ns>"}`.
- A Grafana dashboard (10.100.100.4) shows the app's Rate, Errors, and Duration, plus a Loki logs panel.
- A Grafana alert rule exists, has a threshold and a "for" duration, and is routed to a contact point.
- You demonstrated the full loop: a pushed change triggered CI → new sha deployed → dashboard reflected it → alert transitioned Pending → Firing → Resolved after the fix.
- The write-up links all artifacts (repo, image tag, app URL, dashboard, alert, LogQL screenshot) and includes a reflection.
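The alert criteria above depend on how a threshold and a "for" duration interact, so it can help to reason through the expected Pending → Firing → Resolved sequence before testing it for real. The sketch below is a simplified model written for this assignment, not Grafana's actual evaluator, which has additional states and options that are ignored here. `AlertRule`, `evaluate`, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    threshold: float   # e.g. 0.05 means more than 5% of requests failing
    for_seconds: int   # how long the breach must persist before firing


def evaluate(rule, samples):
    """Return the modelled state after each (timestamp_s, error_ratio) sample."""
    states = []
    breach_start = None   # timestamp of the first sample in the current breach
    was_firing = False
    for ts, ratio in samples:
        if ratio > rule.threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= rule.for_seconds:
                state = "Firing"
                was_firing = True
            else:
                state = "Pending"
        else:
            # The first healthy sample after firing is reported as Resolved;
            # any other healthy sample is Normal.
            state = "Resolved" if was_firing else "Normal"
            breach_start = None
            was_firing = False
        states.append(state)
    return states


rule = AlertRule(threshold=0.05, for_seconds=120)
# One evaluation per minute: healthy, three bad samples, then the fix lands.
samples = [(0, 0.0), (60, 0.2), (120, 0.3), (180, 0.25), (240, 0.0), (300, 0.0)]
print(evaluate(rule, samples))
# ['Normal', 'Pending', 'Pending', 'Firing', 'Resolved', 'Normal']
```

In this model a short error blip that clears before `for_seconds` elapses goes Pending and straight back to Normal without ever firing, which is the reason for having a sustained-period condition at all.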
Hints
- Start with the app running locally in a container before touching k8s — debug one layer at a time.
- Build the RED dashboard from the queries in the Grafana and Backends lessons: `sum(rate(http_requests_total{status=~"5.."}[5m]))` for errors.
- For logs, confirm flow with the broadest query first (`{namespace="<your-ns>"}`), then narrow with `|= "error"`.
- Re-use the Docker and Helm skills/conventions from earlier modules rather than reinventing them.
- To test the alert without breaking real traffic, lower the threshold temporarily, or add a `/boom` route that returns 500s.
- Build incrementally and commit often: if CI breaks, the last green commit tells you what changed.
Blocked for more than about 30 minutes after re-reading the lessons? Bring what you've tried to your mentor.
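One point worth checking when building the error panel and alert: the errors query in the hints gives 5xx responses per second, while an error rate threshold is usually a ratio of that to the total request rate. In PromQL that would typically be the same expression divided by `sum(rate(http_requests_total[5m]))`. The arithmetic can be sketched as follows; this is a rough approximation that assumes two counter samples per window and ignores counter resets and the extrapolation Prometheus applies, and the function names and numbers are hypothetical.

```python
def rate(samples):
    """Per-second increase between the first and last (timestamp_s, value) samples."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v1 - v0) / (t1 - t0)


def error_ratio(total_samples, error_samples):
    """Fraction of requests that failed over the window (0.0 if there was no traffic)."""
    total = rate(total_samples)
    if total == 0:
        return 0.0
    return rate(error_samples) / total


# Counter values at the start and end of a 5-minute (300 s) window.
total = [(0, 100), (300, 400)]   # 300 requests in the window
errors = [(0, 5), (300, 35)]     # 30 of them returned 5xx
print(rate(errors))                # errors per second
print(error_ratio(total, errors))  # fraction of requests failing
```

A ratio is usually the easier quantity to alert on, since a fixed errors-per-second threshold means something different at low and high traffic.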