llama.cpp was already public by March 10, 2023. Ollama-the-company may have existed earlier through YC Winter 2021, but that is not the same thing as having a public local-LLM runtime before llama.cpp. In fact, Ollama’s own v0.0.1 repo says: “Run large language models with llama.cpp” and describes itself as a “Fast inference server written in Go, powered by llama.cpp.” Ollama’s own public blog timeline then starts on August 1, 2023 with “Run Llama 2 uncensored locally,” followed by August 24, 2023 with “Run Code Llama locally.” So the public record does not really support any “they were doing local inference before llama.cpp” narrative.
And that is why the attribution issue matters. If your public product is, from day one, a packaging / UX / distribution layer on top of upstream work, then conspicuous credit is not optional. It is part of the bargain. “We made this easier for normal users” is a perfectly legitimate contribution. But presenting that contribution in a way that minimizes the upstream engine is exactly what annoys people.
The founders’ pre-LLM background also points in the same direction. Before Ollama, Jeffrey Morgan and Michael Chiang were known for Kitematic, a Docker usability tool acquired by Docker on March 13, 2015. So the pattern that fits the evidence is not “they pioneered local inference before everyone else.” It is “they had prior experience productizing infrastructure, then applied that playbook to the local-LLM wave once llama.cpp already existed.”
So my issue is not that Ollama is a wrapper. Wrappers can be useful. My issue is that they seem to have taken the social upside of open-source dependence without showing the level of visible credit, humility, and ecosystem citizenship that should come with it. The product may have solved a real UX problem, but the timeline makes it hard to treat them as if they were the originators of the underlying runtime story.
They seem very good at packaging other people’s work, and not quite good enough at sounding appropriately grateful for that fact.
I think my reaction is mostly puzzlement. I can see a sensible point or several in the article, but I was not always sure how big a point the author was trying to make.
At the narrower level, it seems to be saying that benchmarks are easier to interpret when you know what they really are. That makes sense. If a circuit is known to be a multiplier, that tells you more than if it is just called `c6288`.
That is also why I thought of Python benchmarks. In something like `pyperformance`, names such as `json_loads`, `python_startup`, or `nbody` already tell you something about the workload. So when you compare results, you have a better sense of what kind of task a system is doing well on. But so what? It is just benchmarks. They don't guarantee anything about anything anyway.
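To make the point concrete, here is a rough sketch of what such self-describing benchmark names buy you, using only the stdlib `timeit` module. The names and workloads below (`json_loads`, `int_sum`) are illustrative stand-ins modeled on pyperformance's naming convention; this is not pyperformance's actual harness.

```python
import json
import timeit

# A small payload for the JSON benchmark.
PAYLOAD = json.dumps({"users": [{"id": i, "name": f"u{i}"} for i in range(100)]})

# Hypothetical mini-suite: the name alone tells you what kind of work
# each benchmark exercises, which is the whole interpretability argument.
BENCHMARKS = {
    # Parsing JSON text into Python objects.
    "json_loads": lambda: json.loads(PAYLOAD),
    # Integer arithmetic in a tight loop.
    "int_sum": lambda: sum(range(10_000)),
}

def run_suite(number=1000):
    """Return {benchmark_name: total_seconds} for `number` iterations each."""
    return {name: timeit.timeit(fn, number=number)
            for name, fn in BENCHMARKS.items()}

if __name__ == "__main__":
    for name, secs in run_suite().items():
        print(f"{name}: {secs:.4f}s")
```

A result labeled `json_loads: 0.12s` immediately tells you which workload got faster or slower; a result labeled `bench_42: 0.12s` tells you nothing, which is exactly the `c6288` situation.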
What made it harder for me to follow was that this fairly modest point is wrapped in a lot of jokes and swipes about AI and corporate AI language. Some of that is funny, but it also made me less sure what the main point was supposed to be. Was the article really about benchmark interpretation, or was that mostly a vehicle for making a broader point about AI hype and technical understanding?
So I do think there is a real point in there. I just found it slightly hard to separate that point from the style and the jokes.
I misread “uncrewed” as “unscrewed” and for a moment this became a much stranger, better aerospace story. Not autonomous aircraft, but aircraft apparently liberated from screws. A future of pilotless aircraft is plausible enough; a future of screwless aircraft is much weirder.
Not as weird as one might think: fasteners create local stress concentrations and require holes, so designing without them would be a real improvement. It has been a goal for decades, but progress is slow! Maybe uncrewed vehicles can be iterated on more rapidly.
That diagram is rather bad at what it tries to do. These pairs are also historically and phonetically the same:
Λ Л
Δ Д
Κ К
The first Slavic alphabet was actually the https://en.wikipedia.org/wiki/Glagolitic_script , curiously created by Saint Cyril himself. But people found it too difficult, so someone at the Preslav Literary School in the First Bulgarian Empire mashed up Glagolitic, Greek, and Latin to create the new Cyrillic alphabet (probably naming it after Cyril as an apology for butchering his nice unique script).
"VMware will end support for version 8.0 of its products on October 11, 2027, just a few weeks after the end of the two-year period Tan mentioned. The Register often hears that organizations contemplating a move away from VMware, or reducing their use of the product, have circled that date on their calendars as a deadline for migration projects to alternative platforms."