Project ID: 21-Enterprise Kubernetes Platform Engineering-IND Team 2
12-week capstone: secure, observable, GitOps-driven computer vision on Kubernetes (CNCF certification track).
Platform scope
01 · Overview
This capstone simulates a real enterprise Kubernetes environment, designed, deployed, secured, and operated end-to-end in alignment with the full CNCF certification track.
Cluster Architecture & Operations
Multi-node clusters via kind, k3s, and EKS. Control plane inspection, namespace isolation, resource scheduling, and HPA-driven autoscaling.
Application Delivery
Helm-packaged microservices with rolling updates, zero-downtime deployments, liveness/readiness probes, and externalised config via ConfigMaps and Secrets.
GitOps with ArgoCD
Declarative, Git-driven deployments with automatic reconciliation. Drift detection and self-heal demonstrated live against manually edited cluster state.
Shift-Left Security
Trivy image scanning with CI exit-code gates, Sealed Secrets review, and manifest validation scripts blocking non-compliant deploys before any kubectl apply runs.
Observability Stack
Prometheus metrics collection and Grafana dashboards. HPA events, resource trends, and incident simulation with documented recovery procedures.
CI/CD Automation
GitHub Actions pipelines: checkout → YAML validate → Docker build → security scan → Helm deploy. Blocking gates enforce quality before any apply runs.
02 · Skills Demonstrated
Hands-on engineering across the full Kubernetes production stack: every skill listed was implemented, observed, and documented in lab.
Bootstrapped multi-node kind clusters with custom config. Verified control plane health, inspected API server and DNS endpoints, and established kubeconfig contexts.
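A minimal sketch of such a multi-node kind configuration (node counts and file name are illustrative):

```yaml
# kind-cluster.yaml: one control plane plus two workers (counts illustrative)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

Created with `kind create cluster --name capstone --config kind-cluster.yaml`, then verified via `kubectl cluster-info` and `kubectl get nodes`.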
Created namespace-scoped environments via YAML manifests. Verified workload isolation with kubectl exec and context switching enforced per team boundary.
Applied NoSchedule taints, confirmed pods remain Pending without matching tolerations, then added tolerations and observed placement. Used nodeSelector for additional control.
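A sketch of the pattern, assuming an illustrative `team=cv` taint key and node label:

```yaml
# Node tainted with: kubectl taint nodes <node> team=cv:NoSchedule
# Without the toleration below, the pod stays Pending.
apiVersion: v1
kind: Pod
metadata:
  name: tolerant-pod
spec:
  nodeSelector:
    workload: cv              # optional extra placement control (assumed label)
  tolerations:
    - key: team
      operator: Equal
      value: cv
      effect: NoSchedule
  containers:
    - name: app
      image: nginx:1.27
```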
Deployed Metrics Server, created HPA targeting CPU utilisation, generated synthetic load inside pods, confirmed automatic replica scale-up via HPA event log.
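A representative HPA manifest; the target Deployment name, replica bounds, and CPU threshold here are assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

Scale-up events then appear under `kubectl describe hpa app-hpa`.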
Diagnosed and resolved four failure scenarios: CrashLoopBackOff (bad entrypoint), ImagePullBackOff (invalid tag), Pending pod (resource exhaustion), and Service selector mismatch.
Configured maxSurge: 1 / maxUnavailable: 0 for zero-downtime guarantees. Simulated a bad image deploy; old pods kept serving. Recovered instantly with kubectl rollout undo.
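The strategy block in question, embedded in a minimal Deployment sketch (names and probe are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during rollout
      maxUnavailable: 0    # never drop below the desired replica count
  selector:
    matchLabels: {app: app}
  template:
    metadata:
      labels: {app: app}
    spec:
      containers:
        - name: app
          image: nginx:1.27
          readinessProbe:
            httpGet: {path: /, port: 80}
```

A bad image never passes readiness, so the surge pod is held back and the old replicas keep serving until `kubectl rollout undo deployment/app`.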
Authored Helm charts parameterising image tag, replica count, and service type via values.yaml. Executed install, upgrade, and rollback; demonstrated the full release lifecycle across 4 revisions.
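A values.yaml sketch with the three parameters described (key names are assumptions):

```yaml
image:
  repository: myapp        # placeholder repository
  tag: "1.2.0"
replicaCount: 2
service:
  type: ClusterIP
```

Exercised with `helm install`, `helm upgrade --set image.tag=...`, and `helm rollback <release> <revision>`.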
Injected config via envFrom (restart required) and volume mounts (auto-refresh without restart). Managed base64-encoded Secrets with --dry-run=client previews and safe describe inspection.
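Both injection styles in one pod sketch, assuming an illustrative `app-config` ConfigMap:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      envFrom:
        - configMapRef:
            name: app-config   # env vars: a restart is needed to pick up changes
      volumeMounts:
        - name: config-vol
          mountPath: /etc/app  # mounted keys refresh in place on the kubelet sync
  volumes:
    - name: config-vol
      configMap:
        name: app-config
```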
Implemented sidecar log-reader pattern using emptyDir shared volume between nginx main app and busybox sidecar. Confirmed shared filesystem at runtime with pod showing 2/2 Ready.
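A sketch of that sidecar pod (file paths and image tags assumed):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-sidecar-demo
spec:
  volumes:
    - name: logs
      emptyDir: {}               # shared scratch space, lives as long as the pod
  containers:
    - name: nginx
      image: nginx:1.27
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-reader
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /logs
```

`kubectl get pod log-sidecar-demo` reports 2/2 Ready once both containers are up.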
Built a four-stage pipeline: checkout → YAML syntax validation → Docker build → Helm deploy. Triggered on pull_request to main. The pipeline structure mirrors production GitOps workflows.
Deployed ArgoCD with an Application manifest pointing to the Git repo. Enabled auto-sync and self-heal. Introduced live drift via kubectl edit; ArgoCD detected and reverted it within seconds.
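A workflow sketch of those four stages; file paths, names, and the deploy target are assumptions (the Helm step presumes kubeconfig access to a test cluster):

```yaml
# .github/workflows/ci.yaml
name: ci
on:
  pull_request:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate YAML
        run: pip install yamllint && yamllint k8s/
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Helm deploy
        run: helm upgrade --install myapp ./chart --set image.tag=${{ github.sha }}
```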
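The Application manifest pattern used for this, with a placeholder repo URL and path:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: warehouse-cv-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/platform-monorepo   # placeholder
    targetRevision: main
    path: k8s/overlays/dev
  destination:
    server: https://kubernetes.default.svc
    namespace: warehouse-cv
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual edits to live state
```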
Scanned two container images for CRITICAL/HIGH CVEs. Used --exit-code 1 to block CI on critical findings, saving JSON audit output. Demonstrated warn-only vs. hard-gate comparison.
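The blocking scan as a CI step sketch (image name is a placeholder); dropping `--exit-code 1` yields the warn-only variant:

```yaml
- name: Trivy scan (hard gate)
  run: |
    trivy image --severity CRITICAL,HIGH --exit-code 1 \
      --format json --output trivy-report.json myapp:${{ github.sha }}
```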
Provisioned dedicated ServiceAccounts per workload (trainer-sa, intake-sa, inference-sa, dashboard-sa). Namespace-scoped read-only Role + RoleBinding for the dashboard, limiting API access to pods, services, endpoints, and configmaps only.
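The dashboard's Role and RoleBinding, reconstructed from the description above (object names are assumptions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dashboard-readonly
  namespace: warehouse-cv
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "endpoints", "configmaps"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dashboard-readonly
  namespace: warehouse-cv
subjects:
  - kind: ServiceAccount
    name: dashboard-sa
    namespace: warehouse-cv
roleRef:
  kind: Role
  name: dashboard-readonly
  apiGroup: rbac.authorization.k8s.io
```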
Default-deny ingress for all pods in warehouse-cv with explicit allow rules for ingress-nginx → services and approved app-to-app paths only. Enforced via the dev overlay, leaving no implicit inter-pod reachability.
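A sketch of the default-deny plus one allow rule (policy names and pod labels are assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: warehouse-cv
spec:
  podSelector: {}          # selects every pod in the namespace
  policyTypes: ["Ingress"]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-nginx
  namespace: warehouse-cv
spec:
  podSelector:
    matchLabels:
      app: results-dashboard   # assumed label on the dashboard pods
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
```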
Pod Security Admission labels (restricted) applied to the application namespace. All workloads enforce runAsNonRoot, explicit UID 1000, seccompProfile: RuntimeDefault, and dropped all capabilities with privilege escalation disabled.
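The namespace label plus a pod-spec sketch that satisfies the restricted profile (image name is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: warehouse-cv
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
  namespace: warehouse-cv
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```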
Cluster credentials delivered via SealedSecret: encrypted values committed to Git, decrypted only by the in-cluster controller. The setup script interactively reseals if the placeholder sentinel is detected, making credential rotation an explicit, auditable step.
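What lands in Git is only ciphertext, roughly of this shape (names illustrative, value truncated):

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: registry-credentials
  namespace: warehouse-cv
spec:
  encryptedData:
    password: AgBy3i4OJSWK...   # kubeseal output, truncated here
```

Produced with `kubeseal --format yaml < secret.yaml > sealed-secret.yaml`; only the controller's private key can decrypt it.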
kube-prometheus-stack deployed via Helm. ServiceMonitor configured for the application namespace. Grafana dashboard provisioned via ConfigMap. System ingress exposes Prometheus and Grafana at dedicated internal hostnames.
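A ServiceMonitor sketch for the application namespace; the release label and port name must match the actual deployment and are assumptions here:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: warehouse-cv-services
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # must match Prometheus's selector
spec:
  namespaceSelector:
    matchNames: ["warehouse-cv"]
  selector:
    matchLabels:
      monitoring: enabled            # assumed label on the target Services
  endpoints:
    - port: http
      path: /metrics
```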
ArgoCD auto-sync with prune, self-heal, and foreground pruning eliminates config drift. CI pipeline validates YAML with yamllint, renders Kustomize overlays, and runs kubeconform on all rendered output before any merge.
03 · Final Platform
The capstone culminated in a fully deployed warehouse computer-vision pipeline: three Flask microservices running YOLO detection, managed entirely through GitOps.
footage-intake
Serves camera frame data from local image files. Exposes /stream, /frame, and /health. HPA maintains minimum two replicas.
cv-inference
Fetches frames from intake and runs YOLO object detection. Returns JSON detections via POST /detect. Sealed credentials for model registry access.
results-dashboard
Web UI that proxies the video stream and polls inference for live detections. Exposed at dashboard.warehouse-cv.internal via NGINX ingress.
model-finetune
Nightly CronJob for GPU-backed YOLO fine-tuning. Mounts a 20Gi PVC for model artifacts and consumes sealed object-store credentials. Scheduled for GPU nodes.
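A CronJob sketch matching that description; schedule, labels, and image are assumptions:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: model-finetune
  namespace: warehouse-cv
spec:
  schedule: "0 2 * * *"            # nightly run; exact hour assumed
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          nodeSelector:
            gpu: "true"            # assumed GPU-node label
          containers:
            - name: finetune
              image: model-finetune:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: 1
              envFrom:
                - secretRef:
                    name: object-store-credentials   # unsealed by the controller
              volumeMounts:
                - name: artifacts
                  mountPath: /models
          volumes:
            - name: artifacts
              persistentVolumeClaim:
                claimName: model-artifacts   # the 20Gi PVC
```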
GitOps Delivery
Two ArgoCD applications sync from Kustomize overlays: warehouse-cv-dev and warehouse-cv-addons-dev. Auto-sync, prune, and self-heal enforce Git as the sole source of truth.
One-Command Bootstrap
A single interactive script handles preflight checks, kind cluster creation, Sealed Secrets, ArgoCD, ingress-nginx, and kube-prometheus-stack, taking the platform from zero to a fully synced cluster.
```
warehouse-cv    # footage-intake · cv-inference · results-dashboard · model-finetune
argocd          # GitOps controller: auto-sync, prune, self-heal
ingress-nginx   # North-south traffic, internal hostname routing
monitoring      # kube-prometheus-stack · ServiceMonitor · Grafana dashboard
kube-system     # Sealed Secrets controller
```
04 · Architecture
The platform layers cluster infrastructure, application delivery, security controls, and observability into a cohesive production environment.
Cluster Infrastructure
Application Layer
CI/CD & GitOps
Security Controls
Observability
Certification Alignment
| Focus Area | CKA | CKAD | CKS |
|---|---|---|---|
| Cluster Architecture | ✓ | | |
| Namespaces & Quotas | ✓ | | |
| Scheduling & Resources | ✓ | | |
| Troubleshooting | ✓ | | |
| Application Deployment | | ✓ | |
| ConfigMaps & Secrets | ✓ | ✓ | |
| Scaling & Probes | ✓ | ✓ | |
| CI/CD & GitOps | | ✓ | |
| RBAC & NetworkPolicy | | | ✓ |
| Pod Security & Scanning | | | ✓ |
| Monitoring & Observability | | | ✓ |
05 · Roadmap
Structured to mirror real Kubernetes industry adoption: platform fundamentals first, application patterns next, security hardening last.
Kubernetes Foundations
Cluster setup with kind, kubectl fundamentals, namespaces, context switching, and basic Deployment + Service patterns.
Working cluster with namespaced environments
Cluster Operations & Troubleshooting
Resource requests/limits, taints, tolerations, HPA, and debugging CrashLoopBackOff, ImagePullBackOff, and selector mismatches.
Resource isolation policies + troubleshooting report
Application Deployment Phase
ConfigMaps and Secrets injection, liveness and readiness probes, rolling updates with maxUnavailable: 0, and rollback workflows.
Stable app deployments · CKA Exam Target
Application Patterns & CI/CD
Helm chart packaging, multi-container sidecar pods, GitHub Actions CI pipeline, and Blue-Green/Canary deployment strategies.
Helm-packaged apps + automated CI pipeline
GitOps & Application Security
ArgoCD auto-sync and self-heal, Trivy image scanning with CI gates, Sealed Secrets review, and manifest validation scripts.
GitOps pipeline · CKAD Exam Target
Security & Observability
Least-privilege RBAC, scoped ServiceAccounts, NetworkPolicies, Pod Security Standards (restricted), Sealed Secrets, Prometheus + Grafana, and ArgoCD drift protection.
Warehouse CV platform, fully deployed ✓
06 · Deliverables
Each phase closes with a concrete, demonstrable deliverable mapped to CNCF certification objectives.
Kubernetes Foundations
Working cluster with namespaced environments and baseline kubectl output captured as a cluster snapshot.
Cluster Operations & Troubleshooting
Scalable workloads, resource isolation policies, and a documented troubleshooting report.
Application Deployment
Stable deployments, zero-downtime rolling updates, and externalised configuration management.
Application Patterns & CI/CD
Helm-packaged apps, automated CI pipeline with validation gate, and release workflow docs.
GitOps & Application Security
GitOps-driven deployments with Trivy-scanned images and CI-enforced security gates.
Kubernetes Security & Observability
Hardened cluster, Grafana dashboards, and live final architecture demo.
07 · Repository
All manifests, Helm charts, CI configs, and lab logs are version-controlled in a single monorepo.
```
├── docs/        # Architecture diagrams and security models
├── Docker/      # Service Dockerfiles and image build script
├── k8s/         # Manifests: Base, Overlays, RBAC
├── gitops/      # ArgoCD ApplicationSets
├── monitoring/  # Prometheus + Grafana configs
├── scripts/     # cluster-setup.sh bootstrap script
└── labs/        # Weekly milestone logs
```
08 · Team
Five engineers building production-grade Kubernetes infrastructure and working toward CNCF certification.
Advisor / Instructor: Arthur Choi · Industry Sponsor: Sudheer Amgothu