Why small models matter more than you think
The 70B-vs-405B argument misses the point. What you actually want is predictable behavior on a narrow slice of reality, and smaller is where that lives.
Essays on conversational AI, open-source infrastructure, and the unglamorous work of building software that doesn't embarrass you in production.
Doing chatbots since before they were good.
You already wrote the requirements — you just called them tests. A practical argument for treating eval datasets as the contract between PM and model.
Nobody tweets about retries, timeouts, and schema drift. That's where the real margin lives if you're shipping to customers who pay you actual money.
A talk at QCon London on why writing good conversations is a UX craft with forty years of theory behind it — and why LLMs did not rewrite any of it.
If you can't describe what your bot should do in a sentence,
you probably can't ship it either.