I built a four-node Pi cluster and learned almost nothing

I have wanted a physical cluster on my desk for years. Not because I need one. I run real clusters at work and they live in a datacentre where they belong. But there is something about being able to point at four little boards and say "that is the control plane" that no amount of kubectl get nodes against a cloud provider gives you. So over the long weekend I finally built one, and I am here to report that it taught me almost nothing useful. It was tremendous fun.

Four Pi 4s, the 4GB ones, because the 8GB stock had evaporated months ago and I refused to pay scalper prices. A stack of PoE HATs, a small managed switch that does PoE, and a 3D-printed rack, courtesy of a friend who let me steal an hour on his printer. The plan was k3s, because full Kubernetes on a Pi is mostly a way to discover how much memory the kubelet wants for itself.

The build went fine right up until the power. PoE seemed elegant: one cable per node, no wall-wart octopus. What the marketing does not mention is that a Pi 4 under load through a cheap PoE HAT will happily brown out the moment you ask it to do anything strenuous, like compile something or, in my case, pull four container images at once. I got intermittent SD card corruption that I chased for an embarrassingly long time before I noticed the undervoltage flag in vcgencmd get_throttled. The fix was boring. A switch with a proper power budget, and moving the OS off the SD cards onto USB SSDs.

$ vcgencmd get_throttled
throttled=0x50005

That 0x50005 is the Pi quietly telling you it has been throttled and has seen undervoltage, both now and at some point in the past. I now check it reflexively on any Pi that misbehaves.
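
If you would rather not do the hex arithmetic in your head, here is a minimal Python sketch that decodes the value. The bit meanings are the ones the Raspberry Pi documentation gives for vcgencmd get_throttled. The helper name decode_throttled is my own invention and not part of any tool.

```python
# Bit positions as documented for `vcgencmd get_throttled`.
# Bits 0-3 describe the present state; bits 16-19 record that the
# same condition has happened at some point since boot.
FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(line):
    """Turn a line such as 'throttled=0x50005' into a list of flag names.

    decode_throttled is a hypothetical helper for illustration only.
    """
    value = int(line.strip().split("=", 1)[1], 16)
    return [name for bit, name in FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    for flag in decode_throttled("throttled=0x50005"):
        print(flag)
```

Fed the output above, it lists four flags: under-voltage now, throttled now, and both of those again as "has occurred". A healthy Pi reports 0x0 and the list comes back empty.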

k3s came up in minutes once the hardware stopped lying to me. And then I sat there, looking at a working four-node cluster, and realised I had nothing to run on it. I deployed a Pi-hole. I deployed a small Prometheus and Grafana so I could watch graphs of a cluster doing nothing. I briefly considered moving some home automation onto it and then remembered that the whole appeal of home automation is that it keeps working while I am tinkering, which this would not.

Here is the nothing I learned. Distributed systems are not hard because of the wiring. The wiring is easy and faintly satisfying. They are hard because of the bits this exercise carefully avoided: real load, partial failure under that load, and the slow accretion of state you cannot afford to lose. My cluster has none of those. It is four computers agreeing about an empty workload, which is the easiest agreement in the world.

What I did get was tactile. When a node drops now, on the real systems, I picture one of those little boards going dark, and the abstraction feels less like magic. That is worth something, even if it is not on any roadmap. The cluster lives on the shelf above my desk, blinking gently, doing not very much, and I am unreasonably fond of it.