I have a small ARM single-board computer doing a job in the house, and compiling on it is an exercise in patience. The thing has the build performance of a damp matchstick. So I don't. One of Go's genuinely lovely features is that cross-compilation is two environment variables and nothing else:
GOOS=linux GOARCH=arm64 go build -o myservice ./cmd/myservice
That's it. From my x86 laptop I get an arm64 binary, scp it over, and run it. No cross-toolchain to install, no --host triple to look up, no autotools incantation half-remembered from 2009. The Go toolchain can target every supported platform out of the box. It compiles the standard library for a new target on the first build and caches the result, so only that first build is noticeably slower.
If your board is 32-bit ARM rather than 64, you also need to pick the ARM version with GOARM:
GOOS=linux GOARCH=arm GOARM=7 go build -o myservice ./cmd/myservice
Use GOARM=7 for anything from a Raspberry Pi 2 onwards running a 32-bit OS, and GOARM=6 for the original Pi and the Pi Zero, which are ARMv6 boards. Set it too high and the binary won't run; set it too low and it runs, but slowly, because it's avoiding instructions your chip actually has.
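One way to avoid guessing is to run uname -m on the board and translate the answer. The sketch below encodes the mapping described above; the function name goTarget and the sample list are illustrative, not part of any Go tooling. One caveat: uname -m reports the kernel's architecture, which on some setups (a 64-bit kernel with a 32-bit userland) is not the one you should build for.

```go
package main

import "fmt"

// goTarget maps the machine string printed by `uname -m` on the board to
// GOARCH and GOARM values. goarm is empty when GOARM does not apply.
// ok is false for strings this sketch does not know about.
func goTarget(machine string) (goarch, goarm string, ok bool) {
	switch machine {
	case "aarch64", "arm64":
		return "arm64", "", true
	case "armv7l":
		return "arm", "7", true
	case "armv6l":
		return "arm", "6", true
	}
	return "", "", false
}

func main() {
	// Sample inputs; on a real board, pass in the output of `uname -m`.
	for _, m := range []string{"armv6l", "armv7l", "aarch64", "riscv64"} {
		goarch, goarm, ok := goTarget(m)
		if !ok {
			fmt.Printf("%s: not covered by this sketch\n", m)
			continue
		}
		env := "GOOS=linux GOARCH=" + goarch
		if goarm != "" {
			env += " GOARM=" + goarm
		}
		fmt.Printf("%s: %s\n", m, env)
	}
}
```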
the one thing that breaks it
The whole trick depends on cgo staying out of the build. The moment a package in your dependency tree needs cgo, you're no longer cross-compiling Go, you're cross-compiling C, and that needs a real cross-toolchain with all the misery that implies. The standard library's own cgo users, the net package's DNS resolver and os/user, are harmless here: both have pure-Go implementations that take over when cgo is off. Third-party packages that wrap a C library have no such fallback.
Go already disables cgo by default when cross-compiling, but I'm explicit about it anyway:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o myservice ./cmd/myservice
With cgo off, Go uses its pure-Go DNS resolver and the build stays portable. The binary is also fully static, which means it doesn't care what's installed on the target. It'll run on a stripped-down Alpine image, a Debian board, or a FROM scratch container, with no shared libraries to chase.
The one time this bit me was a dependency that quietly needed SQLite via mattn/go-sqlite3, which is cgo all the way down. There's no flag that fixes that; you either set up a proper cross-compiler or swap to a pure-Go SQLite driver. I swapped the driver, the build went back to two environment variables, and I went back to never compiling on the matchstick again.
the loop I actually use
In practice it's a single target in a Makefile so I never type the variables wrong:
deploy:
	CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o build/myservice ./cmd/myservice
	scp build/myservice pi@homebox:/opt/myservice/myservice
	ssh pi@homebox systemctl --user restart myservice
make deploy, wait a few seconds, done. The board never compiles anything. It just runs the thing I handed it, which is exactly the division of labour I want between a capable laptop and a board the size of a credit card.