Gemini has dominated my feeds for two weeks now, and I've noticed I keep skipping past the headline numbers to find the one detail nobody's leading with: there's a small version meant to run on-device.
Everyone's arguing about the benchmark tables and the over-edited demo video, which is fair enough; that's the spectacle. But the part that'll matter to me as an engineer isn't the giant model in a datacentre I'll never see inside. It's the smallest tier, the one aimed at phones and modest hardware, where the inference happens locally and nothing has to leave the device.
That's the bit I find quietly interesting, because it's the bit I can imagine actually using. A capable model I can run on hardware I own, without shipping every query off to someone else's servers, is a different proposition from a big hosted model behind an API and a meter. The big one is impressive. The small one is useful to me specifically.
It's early, and the small on-device story is still more promise than delivery; I've learned this fortnight not to trust launch-day claims. But of everything dominating tech right now, that's the thread I'll be pulling on once the noise dies down. The headline model gets the keynote. The little one that runs on my own kit is the one I'll be watching.