Kubernetes from First Principles

Why It Works the Way It Does

Most Kubernetes resources teach you how to write YAML. This book teaches you why the YAML looks the way it does. Each of its 45 chapters traces a topic from the original design problem, through the ecosystem’s evolution, to today’s best practice.

About

This is an eight-part book that takes you from “why does Kubernetes exist?” to “I’m running GPU-accelerated ML workloads in production across multiple clusters.” It is written for engineers who understand Linux, networking, and how systems work — and want to understand Kubernetes deeply, not just follow tutorials.

Part 1: First Principles

Why Kubernetes was designed the way it was.

The Road to Kubernetes — From bare metal to Borg to Kubernetes
The Problems Kubernetes Solves — Bin packing, service discovery, self-healing, and the desired state model
Architecture from First Principles — etcd, API server, controllers, scheduler, kubelet, kube-proxy
The API Model — Resources, specs, status, reconciliation loops, labels, and CRDs
The Networking Model — Flat networking, CNI, Services, Ingress, and Network Policies
The Ecosystem — Operators, Helm, service meshes, and Kubernetes as a platform for platforms
Key Design Principles — Declarative over imperative, control loops, level-triggered vs edge-triggered
Why Kubernetes Won — The competitive landscape and the deeper architectural lesson
References and Further Reading — Foundational papers, design documents, talks, and books

Part 2: The Tooling Ecosystem — History and Evolution

How the tools around Kubernetes evolved, and why they look the way they do today.

The Container Runtime Wars — Docker to containerd to CRI-O: why dockershim was deprecated and removed
Bootstrapping a Cluster — From kube-up.sh to kubeadm: how cluster setup evolved
Package Management and GitOps — Helm v2/v3, Kustomize, ArgoCD, Flux
The Networking Stack Evolution — Flannel to Calico to Cilium: how eBPF changed everything
Kubernetes Version History — A guided tour of key releases and what they introduced

Part 3: From Theory to Practice

Connecting the principles from Part 1 to real-world usage.

Setting Up a Cluster from Scratch — What kubeadm actually does: TLS bootstrapping, static pods
Managed Kubernetes: EKS, GKE, and AKS — Cloud provider comparison and how to choose
Cloud Networking and Storage — VPC CNI, CSI drivers, and how K8s maps to cloud infrastructure
Your First Workloads — Hands-on: Deployments, Services, ConfigMaps, rolling updates
Debugging Kubernetes — The kubectl toolkit and diagnosing common failures
Production Readiness — Monitoring, logging, security basics, and backup

Part 4: Stateful Workloads

Running real applications with persistent state.

StatefulSets Deep Dive — Stable identities, ordered operations, and headless Services
Databases on Kubernetes — When to run databases on K8s, operators, and the trade-offs
Persistent Storage Patterns — volumeClaimTemplates, reclaim policies, backup, and resize
Jobs and CronJobs — Batch processing, indexed completions, and scheduling patterns

Part 5: Security Deep Dive

Understanding and implementing Kubernetes security from the ground up.

RBAC from First Principles — Roles, bindings, ServiceAccounts, and multi-tenant design
Network Policies — Default deny, namespace isolation, and egress control
Supply Chain Security — Image signing, admission policies, scanning, and SLSA
Secrets Management — Encryption at rest, Vault, External Secrets Operator, and best practices
Pod Security Standards — Privileged, Baseline, Restricted profiles and enforcement

Part 6: Scaling and Performance

Making Kubernetes handle real-world load.

Horizontal Pod Autoscaler — The scaling algorithm, custom metrics, KEDA, and tuning
Vertical Pod Autoscaler and Right-Sizing — Recommendation mode, in-place resize, and resource tuning
Node Scaling: Cluster Autoscaler and Karpenter — How nodes scale, Karpenter’s architecture, and consolidation
Resource Tuning Deep Dive — CPU throttling, memory cgroups, NUMA, and overcommitment

Part 7: Multi-Cluster and Platform Engineering

Operating Kubernetes at organizational scale.

Multi-Cluster Strategies — Federation, GitOps-driven, service mesh, and Cluster API
Building Internal Developer Platforms — Backstage, the platform stack, and reducing cognitive load
Crossplane: Infrastructure as CRDs — Managing cloud resources through Kubernetes
Multi-Tenancy — Namespace isolation, virtual clusters, and tenant boundaries

Part 8: Advanced Topics

Deep dives for infrastructure engineers.

Writing Controllers and Operators — controller-runtime, Kubebuilder, and the Reconcile pattern
The Kubernetes API Internals — Aggregation, admission webhooks, API priority and fairness
etcd Operations — Backup, restore, compaction, monitoring, and disaster recovery
GPU Workloads and AI/ML on Kubernetes — Device plugins, DRA, GPU sharing, distributed training
Running LLMs on Kubernetes — vLLM, TGI, KServe, multi-node inference, and model serving
Disaster Recovery — Cluster backup, etcd snapshots, multi-region strategies
Cost Optimization — Right-sizing, spot instances, Kubecost, and chargeback
Observability with OpenTelemetry — Metrics, logs, traces, and the OTel Collector

How to Read This

Part 1 is the intellectual foundation. Read it first.

Part 2 fills in the historical context of the tooling. Read it after Part 1.

Part 3 is hands-on. Reference it as you work through your own cluster.

Parts 4-5 cover stateful workloads and security — essential for running real production systems.

Part 6 covers scaling — read it when your workloads need to handle real load.

Part 7 is for when you’re operating multiple clusters or building a platform team.

Part 8 is deep reference material. Read chapters as needed. The GPU/ML chapters (41-42) are especially relevant for AI infrastructure teams.

If you only have time for one chapter from each part:

Part 1: Architecture from First Principles
Part 2: The Container Runtime Wars
Part 3: Debugging Kubernetes
Part 4: StatefulSets Deep Dive
Part 5: RBAC from First Principles
Part 6: Node Scaling: Cluster Autoscaler and Karpenter
Part 7: Building Internal Developer Platforms
Part 8: GPU Workloads and AI/ML on Kubernetes

Appendices

Appendix A: Glossary — Quick-reference definitions for 100+ Kubernetes terms
Appendix B: Mental Models — Visual diagrams showing how concepts in each part connect
Appendix C: Decision Trees — Flowcharts for choosing workload types, storage, networking, and tools
Appendix D: Troubleshooting Quick Reference — Error messages mapped to root causes and fixes
Appendix E: Architecture Evolution Timeline — How the Kubernetes ecosystem evolved from 2013 to today

Companion Material

install.sh — The bootstrap script we built to provision Kubernetes nodes on EC2
Colophon — How this book was made, the prompts used, and accuracy notes
