At some point my homelab stopped being a hobby and became infrastructure other people in the house depend on. When my partner says "the films aren't working", that's not a fun debugging puzzle any more, that's a service outage with a stakeholder standing in the doorway. The thing that made this manageable, more than any single piece of software, was committing to one Docker Compose file as the source of truth for the whole house.
Not literally one file in the end, but one repository, one declarative description of what should be running, that I can read top to bottom and understand. Before this I had containers started by hand, a couple of things in systemd, one thing I genuinely could not remember how I'd started, and a docker ps output that surprised me every time. The move to Compose wasn't about Docker being magic. It was about writing down what's supposed to be true so that future-me, at 11pm, with the family wanting their telly, doesn't have to reverse-engineer it.
the shape of it
Everything lives in a git repo on the server, backed up off-site. The stack is split into a few Compose files grouped by purpose, brought together so I can run them as one or in pieces.
    services:
      traefik:
        image: traefik:v2.6
        restart: unless-stopped
        ports:
          - "80:80"
          - "443:443"
        volumes:
          # Read-only socket so Traefik can discover containers and their labels.
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - ./traefik:/etc/traefik
        networks:
          - edge
      pihole:
        image: pihole/pihole:2022.02
        restart: unless-stopped
        environment:
          TZ: "Europe/London"
        volumes:
          - ./pihole/etc:/etc/pihole
          - ./pihole/dnsmasq.d:/etc/dnsmasq.d
        networks:
          - edge
    # The shared network has to be declared, or Compose rejects the reference above.
    networks:
      edge:
Traefik sits at the front and does TLS and routing by reading labels off the other containers, so adding a new service is mostly a matter of giving it the right labels and a hostname. Pi-hole does DNS for the whole house, which is wonderful right up until you reboot the box and the entire family loses internet because their DNS server is "still coming up". That happened once. It taught me to think hard about which services are load-bearing for everyone versus which are just for me.
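To make the "right labels and a hostname" part concrete, here is a minimal sketch of what a new service might look like. Everything specific in it is an assumption rather than my real config: the `whoami` demo service, the hostname `whoami.home.example`, and an entrypoint called `websecure` that would have to be defined in Traefik's static config under `./traefik`.

```yaml
services:
  whoami:
    # Stand-in demo service; any container that speaks HTTP would do.
    image: traefik/whoami:v1.10.1
    restart: unless-stopped
    labels:
      # Opt this container in (only needed if exposedByDefault is false).
      - "traefik.enable=true"
      # Hypothetical hostname; route requests for it to this container.
      - "traefik.http.routers.whoami.rule=Host(`whoami.home.example`)"
      # Assumes an entrypoint named "websecure" listening on :443.
      - "traefik.http.routers.whoami.entrypoints=websecure"
      - "traefik.http.routers.whoami.tls=true"
      # Port the app listens on inside the container.
      - "traefik.http.services.whoami.loadbalancer.server.port=80"
    networks:
      - edge

networks:
  # Same network Traefik is attached to, so it can reach the container.
  edge:
```

Note there is no `ports:` section: the service is only reachable through Traefik, which is the point of putting it on the shared network.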
the conventions that actually matter
The software is the easy part. The discipline is what keeps it alive.
- `restart: unless-stopped` on everything that matters. If the box reboots, the house comes back without me. This is non-negotiable for anything my family touches.
- Named volumes or bind mounts, never anonymous. I want to know exactly where every container's state lives, because that's the thing I have to back up. Anonymous volumes are how you discover, during a restore, that your data was somewhere you'll never find again.
- Pinned image tags, not `latest`. `latest` is how a quiet Tuesday `docker compose pull` turns into an unplanned outage when an upstream image changes behaviour. I pin versions and bump them on purpose, when I have time to watch.
- One `.env` for secrets, never in the repo. The Compose files are committed; the `.env` next to them is in `.gitignore` and backed up separately. Obvious in hindsight, easy to get wrong once.
- A comment for every non-obvious choice. Three months from now I will not remember why a particular container needs `cap_add`. The comment is a love letter to future me.
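Put together, those conventions look roughly like this on a single service. It is a sketch, not something lifted from my repo: the image `example/app:1.4.2`, the variable `APP_PASSWORD`, the paths and the reason given for `NET_ADMIN` are all made up for illustration.

```yaml
services:
  app:
    # Hypothetical image, pinned to an exact version rather than `latest`.
    image: example/app:1.4.2
    # Comes back on its own after a reboot or a power flicker.
    restart: unless-stopped
    environment:
      # Read from the untracked .env next to this file; Compose refuses
      # to start the stack if the variable is missing.
      APP_PASSWORD: ${APP_PASSWORD:?set APP_PASSWORD in .env}
    volumes:
      # Named volume: the state has a name, so it can be found and backed up.
      - app-data:/var/lib/app
      # Bind mount for config that is committed alongside the Compose file.
      - ./app/config:/etc/app:ro
    # NET_ADMIN because this (imaginary) app manages its own routes.
    cap_add:
      - NET_ADMIN

volumes:
  app-data:
```

The `${VAR:?message}` form is worth the extra characters: a missing secret becomes an error when the stack is brought up instead of an empty password at runtime.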
the load-bearing problem
The hardest thing about running the house off one box isn't technical, it's that you've created a single point of failure with emotional consequences. DNS is the sharp edge. When Pi-hole is the only resolver and it's down, nothing works and nobody believes you that "it's just DNS". I eventually added a secondary resolver so a reboot of the main box doesn't take the whole house offline, and I treat the DNS container with more care than anything else in the stack. It's the one service where downtime is everyone's problem, not just mine.
What I'd tell anyone starting this: the value isn't in the containers, it's in the act of writing down what should be running and keeping that description honest. A Compose file you trust is a thing you can reason about at 11pm. A pile of containers you started by hand over two years is not, and the house will find out which one you have at the worst possible moment.
Mine's not glamorous. It restarts when the power flickers, the films work, and I can read the whole thing in one sitting. For house infrastructure, that's the entire game.