Late July gave us Meta's Llama 3.1, and the headline number is the 405-billion-parameter model. The discourse since has been relentless: an "open" frontier-class model, free to download, with benchmarks Meta is happy to put next to the closed labs. Every feed I have has been arguing about it for two weeks. I've been quietly more interested in a duller question than the benchmarks, which is: who can actually run this thing?
Because that's the bit the launch coverage skates over. "Open weights, download it today" is true and also a little cheeky. The 405B model at its native 16-bit precision is around 810 GB of weights, before you count anything else it needs in memory. You are not running it on the laptop you read the announcement on. You're running it on a small rack of expensive accelerators, or you're renting that rack by the hour, or you're going back to an API and the openness becomes a sort of philosophical comfort rather than a practical one.
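For a rough sense of the sizes involved, here is a back-of-envelope sketch. It counts weights only (no KV cache, activations or runtime overhead), uses the nominal parameter counts from the model names, and assumes an 80 GB accelerator purely for illustration; the function names are mine, not from any library.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

# Nominal parameter counts taken from the model names (assumption: the
# real counts differ slightly, which does not change the picture).
MODELS = {"8B": 8e9, "405B": 405e9}


def weight_gb(params: float, precision: str) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def accelerators_needed(size_gb: float, memory_gb: float = 80.0) -> int:
    """Fewest cards whose combined memory holds the weights alone.

    memory_gb defaults to 80, an assumed per-card capacity.
    """
    return math.ceil(size_gb / memory_gb)


if __name__ == "__main__":
    for name, params in MODELS.items():
        for precision in BYTES_PER_PARAM:
            size = weight_gb(params, precision)
            cards = accelerators_needed(size)
            print(f"{name:>5} {precision:>5}: {size:7.1f} GB, at least {cards} x 80 GB")
```

On these assumptions the 405B weights at 16-bit come to 810 GB, which is eleven 80 GB cards before any working memory, while the 8B at 4-bit is about 4 GB. Real deployments need headroom on top of this, so treat the card counts as a floor rather than a plan.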
I don't say that to be sour about it. The release genuinely matters, and the part that matters most is the smaller siblings. An 8B model you can run locally, that's now meaningfully better than last year's, is the thing that changes what a person can do on their own hardware. The 405B is the trophy on the shelf; the 8B is the tool in the drawer. Most of the coverage has it the wrong way round, because the big number is the better story.
There's also the licence argument, which is where "open" earns its scare quotes. This is open weights under a community licence, not open source in the way the term has meant since the late 1990s. There are conditions, there's an acceptable-use policy, there's a clause requiring a separate licence from Meta for services with more than 700 million monthly active users. Reasonable people can call that open-ish and reasonable people can object that the word is being stretched. I land on: it's a great deal more open than a closed API, it's not the GPL, and pretending those are the same thing helps nobody.
What it's changed for me, practically, this fortnight, is that I've gone back to running a local model for the small stuff. Reformatting text, drafting a commit message, summarising a wall of logs into something I can skim. None of it needs a frontier model and none of it needs to leave my machine. That's the quiet shift underneath the loud launch: the floor has come up. The thing you can run yourself, privately, for free, is now good enough for a real slice of everyday work.
So the launch dominating tech right now is real and it's a big deal, but the bit worth your attention is not the 405-billion-parameter number that's filling the timelines. It's the smaller model you can actually run, and the steady, unglamorous fact that the baseline keeps rising. I'll take that over a trophy.