kubernetes at home was a mistake, mostly

I ran Kubernetes at home for the best part of a year. I am going to tell you it was a mistake, and then I am going to spend the rest of the post quietly admitting it was not entirely one, because honesty demands both. If you want the short version: do not put your homelab on Kubernetes to run your homelab. Put it on Kubernetes to learn Kubernetes, and be clear with yourself about which of those you are doing.

what i was actually running

The workload, such as it was: a Pi-hole, a Nextcloud instance, a couple of static sites, a Grafana, and a recipe app my partner uses roughly weekly. That is the entire enterprise. Apart from the static sites, none of it is stateless, most of it has fiddly storage requirements, and the total resource footprint would fit comfortably on a single mid-range machine with room to spare.

On top of that I built a three-node cluster. kubeadm, a control plane, worker nodes, a CNI, an ingress controller, cert-manager for TLS, MetalLB so I could have LoadBalancer services on bare metal, and persistent volumes backed by NFS off the NAS. It was, I will say in my defence, a genuinely lovely thing to look at. kubectl get pods -A returned a satisfying wall of Running.

where it went wrong

The trouble with Kubernetes at this scale is that the cluster becomes the workload. I spent far more time keeping Kubernetes healthy than keeping my actual services healthy, and that ratio never improved.

Storage was the worst of it. Stateful apps on Kubernetes are a known sharp edge, and on a homelab you get all the sharpness with none of the operational team that makes it bearable. NFS-backed persistent volumes are fine until a node reboots at an awkward moment and a pod comes up before the mount is ready, and now Nextcloud is staring at an empty data directory and you are reading volume-attachment logs at midnight. I lost an evening to a Pending pod that turned out to be a PV stuck Released because of a reclaim policy I had set and forgotten.
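
For anyone who has not met that last trap, here is a minimal sketch of the kind of volume that sets it. This is an illustration, not the manifest from my cluster: the name, NFS server address, export path, and size are all placeholders.

```yaml
# Hypothetical NFS-backed PersistentVolume.
# Name, server address, export path, and capacity are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nextcloud-data
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  # Retain keeps the data when the bound claim is deleted, but it also
  # leaves the PV in the Released phase, where a new claim cannot bind to it.
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.10
    path: /export/nextcloud
```

A Released volume still carries a reference to its old claim in spec.claimRef, so a fresh claim sits Pending next to a volume that looks perfectly usable. One common way out, once you are sure the data on it is what you expect, is to clear that field, for example with kubectl patch pv nextcloud-data -p '{"spec":{"claimRef":null}}', after which the volume should return to Available.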

Then there is the upgrade treadmill. Kubernetes moves fast, deprecates APIs with enthusiasm, and a cluster you stand up and ignore for six months is a cluster that will fight you when you finally touch it. The control plane is another set of components that can break independently of anything you care about. etcd, in particular, does not care about your weekend.

The failure modes were also wildly disproportionate to the stakes. When the recipe app went down because a node's kubelet wedged, the blast radius and the debugging effort were the same as they would be for a production outage at work, except the consequence was that we ordered a takeaway. That is a lot of YAML to protect a lasagne.

what it got right

And yet. I cannot pretend it was wasted, because two things came out of it that I value.

First, I learned Kubernetes properly, in the only way that actually sticks, which is by breaking it and fixing it on a system I owned end to end. Reading about etcd quorum is one thing. Recovering a cluster after losing a control-plane node, with your own data on the line and nobody to escalate to, teaches you what the documentation cannot. Every gnarly thing I hit at home, I have since recognised instantly at work, and that has paid for the evenings several times over.

Second, the bits of the ecosystem that are genuinely excellent are excellent even at small scale. Declarative config that I keep in Git and can reapply onto a freshly rebuilt node is a real pleasure. When I rebuild a service, I do not reconstruct it from memory and half-remembered commands. I kubectl apply a directory and it is back. cert-manager handling TLS renewals so I never think about certificates again is the kind of boring automation that justifies itself.
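
To give a flavour of what that looks like on disk, here is a sketch of an Ingress that asks cert-manager for a certificate. It assumes things not described above: an ingress class called nginx, a ClusterIssuer named letsencrypt that already exists, and a Service called recipes listening on port 80. The hostname is a placeholder.

```yaml
# Hypothetical Ingress for the recipe app.
# Assumes a ClusterIssuer named "letsencrypt" and an ingress class "nginx".
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: recipes
  annotations:
    # cert-manager watches for this annotation and issues a certificate
    # into the Secret named under spec.tls below.
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - recipes.example.com
      secretName: recipes-tls
  rules:
    - host: recipes.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: recipes
                port:
                  number: 80
```

A file like this lives in the Git repository alongside the Deployment and Service for the same app, and kubectl apply -f on the directory recreates all of them in one go.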

what i would do instead

If I were starting over for a homelab whose job is to run a homelab, I would reach for something far smaller. A single box with Docker Compose covers nearly everything I listed, fits in my head, and fails in ways I can debug whilst still half asleep. If I wanted the declarative-and-in-Git feel without the control-plane tax, I would look hard at k3s, which trims a lot of the heaviness and runs happily on modest hardware, and I would keep it to a single node unless I had a concrete reason not to.
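
As a rough sketch of the Compose version, covering just two of the services: the host ports and volume names below are my own placeholder choices, and a real setup would want pinned image tags and a reverse proxy in front.

```yaml
# compose.yaml: hypothetical single-box sketch, not a tested deployment.
# Host ports and volume names are placeholders.
services:
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      # Grafana keeps its database and plugins under /var/lib/grafana.
      - grafana-data:/var/lib/grafana

  nextcloud:
    image: nextcloud
    restart: unless-stopped
    ports:
      - "8080:80"
    volumes:
      # The official image stores the app and user data under /var/www/html.
      - nextcloud-data:/var/www/html

volumes:
  grafana-data:
  nextcloud-data:
```

The storage story here is a named volume on the local disk. There is no mount ordering to race against and no reclaim policy to forget, which is most of the appeal.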

So: was Kubernetes at home a mistake? As infrastructure for five low-stakes services, yes, comfortably. As a year-long, hands-on, occasionally infuriating education that I now lean on constantly, no, not at all. The mistake was not in running it. The mistake was in pretending the lasagne needed it. It did not. I did, and that turned out to be a perfectly good reason, just not the one I told myself at the time.