Appendix B: Mental Models
Each part of this book introduces a cluster of related concepts. These diagrams show how they connect — use them as maps when navigating the chapters.
Part 1: First Principles (Chapters 1-9)
The Reconciliation Loop — the heart of Kubernetes.
flowchart TD
A["User writes YAML"] --> B["kubectl"]
B --> C["API Server"]
C --> D["etcd<br>(desired state stored here)"]
C --> E["Controller Manager<br>(reconcile)"]
C --> F["Scheduler<br>(assign to node)"]
E --> G["kubelet (on node)"]
F --> G
G --> H["Container Runtime"]
H --> I["Container"]
subgraph loop ["The Watch / Reconciliation Loop"]
direction LR
W1["Controller watches"] --> W2["Compares desired<br>vs. actual state"]
W2 --> W3["Detects drift"]
W3 --> W4["Takes action<br>to converge"]
W4 --> W1
end
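The watch / reconciliation loop above can be sketched in a few lines. This is a toy model only, not how any real controller is implemented: `ClusterState`, `reconcile`, and `run_until_converged` are hypothetical names, and a single replica counter stands in for the objects a controller would read and write through the API server.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical stand-in for the replica count of one workload."""
    replicas: int


def reconcile(desired: ClusterState, actual: ClusterState) -> str:
    """Run one pass of the loop: compare, detect drift, act to converge."""
    drift = desired.replicas - actual.replicas
    if drift == 0:
        return "in sync"
    # Move one step toward the desired state. A real controller would create
    # or delete objects through the API server instead of changing a counter.
    actual.replicas += 1 if drift > 0 else -1
    return "scaled up" if drift > 0 else "scaled down"


def run_until_converged(desired: ClusterState, actual: ClusterState,
                        max_passes: int = 100) -> list[str]:
    """Repeat reconcile() until no drift remains; return the actions taken."""
    actions = []
    for _ in range(max_passes):
        result = reconcile(desired, actual)
        if result == "in sync":
            break
        actions.append(result)
    return actions


desired = ClusterState(replicas=3)
actual = ClusterState(replicas=1)
actions = run_until_converged(desired, actual)
print(actions, actual.replicas)  # ['scaled up', 'scaled up'] 3
```

The point of the sketch is that the controller never receives a "scale up by two" command. It only sees desired and actual state, and keeps acting until they match.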
Part 2: Tooling Evolution (Chapters 10-14)
The Stack — what runs on what.
flowchart TD
S1["Application"]
S2["Helm / Kustomize (packaging)"]
S3["kubeadm / k3s (bootstrap)"]
S4["Kubernetes API"]
S5["Container Runtime<br>containerd / CRI-O"]
S6["CNI Plugin<br>Cilium, Calico, Flannel ..."]
S7["OCI Runtime (runc)"]
S1 --> S2 --> S3 --> S4
S4 --> S5
S4 --> S6
S5 --> S7
S6 --> S7
subgraph kernel ["Linux Kernel"]
K1["cgroups<br>(resource limits)"]
K2["namespaces<br>(isolation)"]
end
S7 --> kernel
subgraph cni ["CNI Virtual Network"]
direction LR
PA["Pod A"] <--> PB["Pod B"]
end
S6 --> cni
Part 3: Practical Setup (Chapters 15-19)
Your First Cluster — who talks to whom.
flowchart TD
Cloud["Cloud Provider<br>(AWS / GCP / Azure)"]
Cloud --> VPC
kubectl["kubectl"] --> API
CICD["CI/CD Pipeline"] --> VPC
subgraph VPC
subgraph CP ["Control Plane (managed)"]
API["API Server"]
end
subgraph N1 ["Worker Node 1"]
subgraph Pod1 ["Pod"]
App["app"]
Sidecar["sidecar"]
end
end
subgraph N2 ["Worker Node 2"]
Pod2a["Pod"]
Pod2b["Pod"]
end
end
API --> N1
API --> N2
subgraph debug ["Debugging Tools"]
direction LR
KC["kubectl"] --> Logs["logs"]
KC --> Exec["exec"]
KC --> Describe["describe (events)"]
end
Part 4: Stateful Workloads (Chapters 20-24)
State — the hard problem.
flowchart TD
Deploy["Deployment<br>(stateless)<br>Pods are fungible,<br>interchangeable"]
SS["StatefulSet<br>(ordered, stable ID)<br>pod-0, pod-1, pod-2<br>each has stable name"]
SS --> PVC["PVC<br>(claim storage)"]
PVC --> PV["PV<br>(actual volume)"]
PV --> SC["StorageClass<br>(provisioner)"]
SC --> Disk["Cloud Disk<br>(EBS / PD / Azure Disk)"]
subgraph operators ["Operators manage databases on K8s"]
direction LR
Op["Operator"] -->|watches| CRD["Custom Resource<br>(e.g. PostgresCluster)"]
CRD -->|manages| Res["StatefulSet +<br>PVCs + Secrets"]
end
subgraph jobs ["Jobs and CronJobs"]
direction LR
Job["Job<br>(run once)"]
CronJob["CronJob<br>(scheduled)"] -->|creates Job<br>on schedule| Job
end
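The "stable ID" in the StatefulSet box is easiest to see by generating the names. The sketch below assumes a hypothetical StatefulSet called `pg` with one volume claim template called `data`. It relies on the Kubernetes naming convention that pods are named `<statefulset>-<ordinal>` and their PVCs `<template>-<statefulset>-<ordinal>`.

```python
def stateful_set_names(name: str, claim_template: str,
                       replicas: int) -> list[tuple[str, str]]:
    """Return (pod name, PVC name) pairs for each ordinal of a StatefulSet."""
    pairs = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        # PVCs created from volumeClaimTemplates are named
        # <template>-<statefulset>-<ordinal>, so each pod keeps its own claim.
        pvc = f"{claim_template}-{name}-{ordinal}"
        pairs.append((pod, pvc))
    return pairs


names = stateful_set_names("pg", "data", 3)
for pod, pvc in names:
    print(pod, "->", pvc)
```

Because the names are a pure function of the ordinal, a rescheduled `pg-1` reattaches to `data-pg-1` rather than to a fresh volume. A Deployment's pods carry generated name suffixes and have no such per-pod claim.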
Part 5: Security (Chapters 25-29)
Defense in Depth.
┌────────────────────────────────────────────────────────┐
│ Supply Chain (outermost ring) │
│ Sigstore, SBOM, image scanning │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Cluster │ │
│ │ RBAC, Admission Control (OPA/Kyverno) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────┐ │ │
│ │ │ Namespace │ │ │
│ │ │ NetworkPolicy, ResourceQuota │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────────────────────────┐ │ │ │
│ │ │ │ Pod │ │ │ │
│ │ │ │ SecurityContext, Seccomp, │ │ │ │
│ │ │ │ AppArmor │ │ │ │
│ │ │ │ │ │ │ │
│ │ │ │ ┌─────────────────────────┐ │ │ │ │
│ │ │ │ │ Container (innermost) │ │ │ │ │
│ │ │ │ │ read-only rootfs │ │ │ │ │
│ │ │ │ │ non-root user │ │ │ │ │
│ │ │ │ │ dropped capabilities │ │ │ │ │
│ │ │ │ └─────────────────────────┘ │ │ │ │
│ │ │ └─────────────────────────────────┘ │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
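As a rough illustration of the innermost ring, the sketch below checks one container spec for the three settings listed there. The helper `missing_hardening` and the sample container are hypothetical. The field names (`readOnlyRootFilesystem`, `runAsNonRoot`, `capabilities.drop`) are the standard container `securityContext` fields. The check ignores pod-level defaults for brevity, so treat it as a teaching aid rather than a policy engine.

```python
def missing_hardening(container: dict) -> list[str]:
    """List the innermost-ring controls a container spec does not set."""
    ctx = container.get("securityContext", {})
    missing = []
    if ctx.get("readOnlyRootFilesystem") is not True:
        missing.append("read-only rootfs")
    # Simplification: runAsNonRoot may also be set at the pod level,
    # which this container-only check does not see.
    if ctx.get("runAsNonRoot") is not True:
        missing.append("non-root user")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        missing.append("dropped capabilities")
    return missing


hardened = {
    "name": "app",
    "image": "example.com/app:1.0",  # hypothetical image
    "securityContext": {
        "readOnlyRootFilesystem": True,
        "runAsNonRoot": True,
        "capabilities": {"drop": ["ALL"]},
    },
}
default = {"name": "app", "image": "example.com/app:1.0"}

print(missing_hardening(hardened))  # []
print(missing_hardening(default))
```

In a cluster, the same kind of rule would normally be enforced one ring out, by admission control, so that a container missing these settings is rejected before it runs.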
Secrets Management (cross-cutting concern):
┌──────────────────────────────────────────────┐
│ │
│ External Secrets ──▶ K8s Secret ──▶ Pod │
│ │ │
│ Vault / AWS SM / GCP SM │
│ (source of truth) │
│ │
│ Cuts across ALL rings above │
└──────────────────────────────────────────────┘
Part 6: Scaling (Chapters 30-33)
The Scaling Cascade — metrics to machines.
flowchart TD
M["Metrics<br>(CPU, memory, custom)"]
M --> HPA["HPA"]
HPA -->|"scale pods<br>horizontally"| Pods["More Pods"]
Pods -->|"pods go Pending<br>(no capacity)"| KCA["Karpenter /<br>Cluster Autoscaler"]
KCA -->|"scale nodes"| Cloud["Cloud API<br>(provision new VMs)"]
subgraph vpa ["VPA (Vertical Pod Autoscaler)"]
direction LR
VM["Metrics"] --> VPA2["VPA"] --> Resize["Resize pods vertically<br>(adjust requests/limits)"]
end
subgraph scheduling ["Resource Tuning feeds Scheduling"]
direction LR
RL["requests and limits<br>(CPU, memory)"] --> Sched["Scheduler decisions"]
RL --> Effects["Affects bin-packing,<br>QoS class, eviction<br>priority, HPA thresholds"]
end
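The first step of the cascade, metrics to pod count, follows the replica formula given in the Kubernetes HPA documentation: `ceil(currentReplicas * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance (0.1 by default). The sketch below is a simplified version. The function name is made up here, and it leaves out min/max replica bounds, stabilization windows, and the handling of not-yet-ready pods.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a single metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count alone,
    # which avoids flapping on small metric changes.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # 6
# 62% against a 60% target is inside the tolerance band.
print(desired_replicas(4, 62, 60))  # 4
```

If the extra pods do not fit on existing nodes they stay Pending, and that is the signal the node autoscaler reacts to. Utilization is measured relative to each pod's CPU or memory request, which is why request tuning changes when the HPA fires.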
Part 7: Platform Engineering (Chapters 34-39)
The Platform — abstraction over infrastructure.
flowchart TD
Dev["Developer"] -->|writes Claim| PlatAPI["Platform API<br>(Crossplane XRD / CRD)"]
PlatAPI -->|provisions| CloudRes["Cloud Resources<br>(RDS, S3, etc.)"]
Git["Git Repo<br>(source of truth)"] -->|GitOps loop| Argo["ArgoCD / Flux"]
Argo -->|sync| Clusters["Cluster(s)"]
subgraph ext ["Extension Mechanism"]
direction LR
CRD["CRD<br>(defines new API)"] --> Operator["Operator<br>(watches & reconciles)"] --> Resources["Manages resources"]
end
subgraph horiz ["Horizontal Concerns"]
MC["Multi-Cluster<br>(fleet mgmt, federation)"]
MT["Multi-Tenancy<br>(namespaces, vClusters,<br>resource quotas)"]
end
Part 8: Advanced Topics (Chapters 40-45)
Running it for Real.
Operational Concerns:
┌──────────────────────────────────────────────────────┐
│ │
│ ┌──────────┐ ┌────────────────┐ ┌─────────────┐ │
│ │ etcd ops │ │ Disaster │ │ Cost │ │
│ │ (backup, │ │ Recovery │ │ Optimization│ │
│ │ defrag, │ │ (Velero) │ │ (right-size,│ │
│ │ health) │ │ │ │ spot, idle)│ │
│ └──────────┘ │ backup ──▶ │ └─────────────┘ │
│ │ restore ──▶ │ │
│ │ migrate │ │
│ └────────────────┘ │
└──────────────────────────────────────────────────────┘
Observability (the three pillars, plus alerting):
┌───────────┐
│ Metrics │
│(Prometheus│
│ / Mimir) │
└─────┬─────┘
│
┌─────────┼─────────┐
│ │ │
▼ ▼ ▼
┌───────┐ ┌───────┐ ┌────────┐
│ Logs │ │Traces │ │Alerts │
│(Loki) │ │(Tempo)│ │(Grafana│
└───────┘ └───────┘ │ / PD) │
└────────┘
GPU Scheduling:
┌──────────────────┐ ┌──────────────────┐ ┌────────────┐
│ Pod with │────▶│ Device Plugin / │────▶│ NVIDIA GPU │
│ gpu request │ │ DRA │ │ (on node) │
│ (limits: │ │ (allocates GPU) │ │ │
│ nvidia.com/gpu)│ └──────────────────┘ └────────────┘
└──────────────────┘
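The left-hand box of the GPU diagram corresponds to a pod spec like the one built below. This is a minimal sketch of the device plugin path only (DRA uses resource claims instead and is not shown). The pod name and image are hypothetical, while `nvidia.com/gpu` is the resource name from the diagram. The manifest is emitted as JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that asks for whole GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Extended resources such as GPUs are set under limits;
                    # the request defaults to the same value.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


manifest = gpu_pod("gpu-demo", "example.com/cuda-app:latest", gpus=1)
print(json.dumps(manifest, indent=2))
```

The scheduler will only place this pod on a node that advertises at least one unallocated `nvidia.com/gpu`, which in turn requires the device plugin to be running on that node.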
LLM Serving:
┌─────────┐ ┌──────────────┐ ┌──────────┐ ┌────────────┐
│ Model │───▶│ vLLM / TGI │───▶│ KServe │───▶│ Inference │
│(weights)│ │ (serving │ │ (routing,│ │ endpoint │
│ │ │ engine) │ │ scaling)│ │ (/predict) │
└─────────┘ └──────────────┘ └──────────┘ └────────────┘