There's going to be a proliferation of foundational models. The cost of those models will go to zero.
Chamath’s claim had two parts: (1) many foundational models would proliferate; (2) over the next several years, the marginal cost of using them would trend toward effectively zero, commoditizing access.
1. Proliferation of foundational models – clearly happening
Since late 2023 there has been an explosion of base-model families with released weights:
- Meta’s Llama series (2, 3, 3.x, 4 variants), Google’s Gemma (1/2/3 plus PaliGemma, MedGemma), Alibaba’s Qwen (multiple generations, dense and MoE, including Qwen3 with many parameter sizes), DeepSeek, Mistral, DBRX, and others, many under permissive licenses or open-weight terms.
- Alibaba alone reports 100+ open‑weight Qwen models with over 40 million downloads.(en.wikipedia.org)
- Google DeepMind’s Gemma line shows repeated open‑weight releases (Gemma 1–3 and variants) specifically intended as widely usable foundation models.(en.wikipedia.org)
- A 2025 study of Hugging Face’s ecosystem notes >1.8 million models hosted by June 2025, indicating massive proliferation of base and derivative LLMs.(arxiv.org)
- Domain‑specific foundational models like ORANSight‑2.0 build on 18 open LLMs, illustrating how many base models are now available to specialize.(arxiv.org)
On this dimension, reality strongly matches the prediction: there is a proliferation of foundational models from many vendors and communities.
2. Cost trending toward (but not reaching) “zero”
Multiple independent indicators show a dramatic fall in per‑token AI usage costs:
- Analyses of LLM pricing find that, for a given quality level, inference cost has fallen by roughly 10× per year, amounting to a 1,000× drop over about three years (e.g., from ~$60 to ~$0.06 per million tokens for models with similar benchmark scores); the arithmetic is sketched after this list.(thestack.technology)
- Sam Altman has publicly described a ~10× annual decline in the cost of using AI, citing a 150× reduction in token cost from GPT‑4 (early 2023) to GPT‑4o (mid‑2024).(businessinsider.com)
- Model‑as‑a‑service providers like Together.ai now sell competent open‑weight models such as Llama 3.2 3B at about $0.06 per million tokens, which is effectively near‑free for many applications.(together.ai)
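As a rough sanity check on the figures above, the sketch below works through the implied arithmetic: a sustained 10× annual decline compounds to 1,000× over three years (so ~$60 per million tokens falls to ~$0.06), and at that price even a sizeable workload costs very little. The decline rate, starting price, and monthly token volume are illustrative assumptions taken from the numbers cited above, not a pricing model.

```python
# Rough arithmetic behind the "10x per year" cost-decline claim.
# All inputs are illustrative assumptions drawn from the figures cited above.

start_price_per_m_tokens = 60.00   # ~$60 per million tokens (early-2023-era frontier pricing)
annual_decline_factor = 10         # assumed ~10x cheaper each year
years = 3

# Compounded decline: 10x per year for 3 years = 1,000x overall.
total_decline = annual_decline_factor ** years
end_price_per_m_tokens = start_price_per_m_tokens / total_decline
print(f"{total_decline:,}x drop -> ${end_price_per_m_tokens:.2f} per million tokens")
# 1,000x drop -> $0.06 per million tokens

# What that means for an application: e.g., 50 million tokens per month
# (a hypothetical workload) at the ~$0.06 per million token open-weight price point.
monthly_tokens = 50_000_000
monthly_cost = (monthly_tokens / 1_000_000) * end_price_per_m_tokens
print(f"Monthly cost at ${end_price_per_m_tokens:.2f}/M tokens: ${monthly_cost:.2f}")
# Monthly cost at $0.06/M tokens: $3.00
```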
However, costs have not literally gone to zero, and access to top frontier models (OpenAI, Anthropic, Google, etc.) still commands a clear price premium. Infrastructure scarcity and capacity crunches (e.g., AWS Bedrock’s 2025 GPU shortages) also show that the underlying compute is still economically scarce, which supports ongoing positive margins rather than pure commodity pricing.(businessinsider.com)
3. Why the verdict is “inconclusive”
The prediction explicitly referred to a horizon of “the next several years.” As of 30 November 2025, only about two years have passed since December 2023:
- The direction of change strongly supports Chamath’s view: there is a clear proliferation of foundational models and a steep trend toward very low marginal costs.
- But the end state—foundational model access being effectively commoditized, with costs approaching zero—has not yet fully materialized. Frontier models remain differentiated products with meaningful pricing power, and it is not yet clear whether the market will settle into full commoditization or an oligopoly of high‑end providers sitting atop a commoditized open‑model layer.
Because the forecast window (“several years”) has not fully elapsed and the ultimate market structure is still unsettled, the fairest grading as of late 2025 is “inconclusive”—the evidence so far leans in favor of his thesis but does not definitively confirm its final outcome.