Kubernetes — a beginner's guide to container orchestration
Once you have more containers than you can manage manually, you need an orchestrator. Kubernetes is the standard — here's the architecture and the concepts you need to understand before running a single command.
The Docker series covers building and running containers. That knowledge is foundational — but it has a ceiling. At some point, you have too many containers to manage manually. They run across multiple machines, need to be restarted when they crash, scaled up under load, and updated without downtime. Doing this by hand doesn’t scale.
Kubernetes (K8s) exists to solve this. It’s the standard platform for managing containerised applications in production — open source, battle-tested across the industry, and the foundation that most cloud providers’ managed container services are built on.
This post covers the why and the architecture. Running actual workloads on Kubernetes is covered in subsequent posts.
Why Kubernetes? The three eras of deployment
Traditional deployment — applications run on physical servers. No isolation between applications, so one application can consume all resources and starve others. Scaling means buying more hardware.
Virtual machine deployment — hypervisors allow multiple VMs on one physical server, each running its own OS. Better resource utilisation, isolated environments. But each VM carries the full weight of an operating system — gigabytes of overhead, minutes to boot, and OS licensing complexity.
Container deployment — containers virtualise the OS rather than the hardware. Milliseconds to start, megabytes in size, portable across any container runtime. A single machine can run dozens of isolated container workloads.
Containers solve the packaging and portability problem. But when you have hundreds of containers across a fleet of machines, new problems emerge: which machine does each container run on? What happens when a container crashes? How do you update containers without downtime? How does traffic get distributed across container replicas?
Kubernetes provides a framework to run distributed applications resiliently. It handles scheduling, self-healing, scaling, load balancing, and rolling deployments — automatically.
What Kubernetes actually does
At its core, Kubernetes is a cluster management system. You describe your desired state (“I want 3 replicas of this container, always”) and Kubernetes continuously works to make reality match that description.
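To make "desired state" concrete, here is a minimal sketch of what such a description looks like as a Kubernetes manifest. The name web and the nginx image are placeholders for illustration; writing manifests properly is covered later in the series.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical name, for illustration only
spec:
  replicas: 3               # desired state: always keep 3 copies running
  selector:
    matchLabels:
      app: web
  template:                 # the pod template each replica is stamped from
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27 # any container image works here

If one replica dies, Kubernetes notices that only 2 of the desired 3 are running and starts a replacement. That reconciliation loop is the whole idea.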
Key capabilities:
Self-healing — if a container crashes, Kubernetes restarts it. If a node (machine) goes down, Kubernetes reschedules its containers on other nodes.
Automatic scaling — Kubernetes can scale the number of container replicas up or down based on CPU usage, memory usage, or custom metrics (see the autoscaler sketch after this list).
Rolling deployments and rollbacks — update containers with zero downtime; automatically roll back if the new version fails health checks.
Service discovery and load balancing — Kubernetes assigns DNS names to groups of containers and automatically load-balances traffic across them.
Storage orchestration — automatically mount storage volumes from local storage, cloud providers, or network filesystems.
Secret and configuration management — store sensitive values (API keys, passwords) and non-sensitive configuration separately from container images, and inject them at runtime.
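As a sketch of what automatic scaling looks like in practice, a HorizontalPodAutoscaler can target the hypothetical web Deployment from the earlier example. The thresholds below are illustrative, not recommendations.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:                  # the object whose replica count is managed
    apiVersion: apps/v1
    kind: Deployment
    name: web                      # the hypothetical Deployment sketched earlier
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%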
Kubernetes architecture
Kubernetes follows a control plane / worker node model. The control plane is the cluster’s brain — it makes decisions about the cluster. Worker nodes are where your application containers actually run.
Control Plane (master nodes)
├── etcd
├── kube-apiserver
├── kube-scheduler
└── kube-controller-manager
Worker Nodes (1…n)
├── kubelet
├── kube-proxy
└── Container runtime (containerd, CRI-O)

In production, run a minimum of three control plane nodes (for etcd quorum and high availability) and three worker nodes (to maintain redundancy if a node fails). Single-node setups exist for development (Minikube, k3s) but should never be used in production.
Control plane components
etcd
A distributed, fault-tolerant key-value store that holds the entire state of the cluster — every object, configuration, and status. If etcd loses data, the cluster state is gone. Back it up regularly.
Only the kube-apiserver communicates directly with etcd. No other component reads or writes cluster state directly.
kube-apiserver
The front door to the Kubernetes cluster. It exposes the Kubernetes API to both cluster components and the outside world (developers, CI/CD systems, other tools). Every kubectl command you run sends an API request to kube-apiserver.
The apiserver validates every request before persisting it to etcd. It scales horizontally — run multiple instances behind a load balancer for high availability.
kube-scheduler
Watches for newly created pods that have no assigned node and selects the best node for them. The decision is based on:
- Resource requirements of the pod (CPU, memory requests)
- Node resource availability
- Affinity/anti-affinity rules
- Taints and tolerations
- Hardware and software constraints
The scheduler doesn't start the pod itself: it binds the pod to the chosen node by writing the node name into the pod's spec, and the kubelet on that node picks it up.
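Here is a sketch of how those scheduling inputs appear in a pod spec. The fields are real Kubernetes API; the values (the zone label, the gpu taint) are invented for illustration.

apiVersion: v1
kind: Pod
metadata:
  name: demo                         # hypothetical pod
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:                    # what the scheduler matches against node capacity
          cpu: "500m"
          memory: "256Mi"
  affinity:
    nodeAffinity:                    # only consider nodes in this (illustrative) zone
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["eu-west-1a"]
  tolerations:                       # permit nodes tainted gpu=true:NoSchedule
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"

The scheduler filters out nodes that fail any of these constraints, scores the remainder, and binds the pod to the winner.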
kube-controller-manager
Runs a collection of control loops (controllers), each responsible for managing a specific type of Kubernetes object:
- Node Controller — monitors node health; acts when nodes become unavailable
- Job Controller — creates pods for one-off batch jobs; ensures they complete (sketched just below)
- Endpoints Controller — connects Service objects to the pods they route to
- Service Account and Token Controller — creates default service accounts in new namespaces
All controllers are compiled into a single binary and run as one process for operational simplicity.
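To give one of those controllers a concrete shape, here is a minimal Job manifest of the kind the Job Controller manages. The image and command are placeholders.

apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task          # hypothetical name
spec:
  backoffLimit: 3             # retry a failed pod up to 3 times
  template:
    spec:
      restartPolicy: Never    # Job pods must use Never or OnFailure
      containers:
        - name: task
          image: busybox:1.36
          command: ["sh", "-c", "echo done"]   # placeholder workload

The Job Controller creates the pod, watches it, and marks the Job complete once the pod exits successfully.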
Node components
kubelet
The agent that runs on every worker node. The kubelet watches the API server for pods assigned to its node, then ensures those pods are running and healthy. It talks directly to the container runtime to start, stop, and monitor containers.
Important: the kubelet only manages containers that Kubernetes created. Containers you start manually on the node with docker run are invisible to Kubernetes.
kube-proxy
A network proxy that runs on each node and implements the Kubernetes Service abstraction. It maintains network rules so that traffic to a Service IP gets forwarded to the correct pods, across all nodes in the cluster. It uses the OS-level packet filtering mechanism (iptables or IPVS on Linux) where available; otherwise it forwards the traffic itself.
Container runtime
Kubernetes doesn’t run containers itself — it delegates to a container runtime that implements the Container Runtime Interface (CRI). The most common runtimes are:
- containerd — the default in most managed Kubernetes services
- CRI-O — a lightweight runtime built specifically for Kubernetes
- Docker Engine (via dockershim) — deprecated in Kubernetes 1.20 and removed in 1.24
The cluster, in one paragraph
You write a manifest describing what you want to run (a Deployment, a Service, a ConfigMap). You send it to kube-apiserver with kubectl apply. The apiserver validates and stores it in etcd. kube-scheduler spots the new unscheduled pod, selects a node, and updates the spec. The kubelet on that node detects the pod assignment, pulls the container image, and starts the container via the container runtime. kube-proxy updates the network rules so traffic reaches the running pod. If the container crashes, the kubelet detects it and restarts it. This loop runs continuously — Kubernetes is always reconciling what you asked for with what’s actually running.
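To anchor that loop in something tangible, here is a sketch of a Service that would route traffic to the hypothetical web Deployment from the first example. The fields follow the standard Service API; the names are illustrative.

# Illustrative only; apply with: kubectl apply -f service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # matches the pod labels from the Deployment sketch
  ports:
    - port: 80          # the Service's stable port
      targetPort: 80    # the container port traffic is forwarded to

Once this object lands in etcd, the Endpoints controller links it to the matching pods, and kube-proxy on every node rewrites traffic destined for the Service IP to one of them.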
Next in the Kubernetes series: Kubernetes pods — the smallest unit of deployment