Last updated Nov 29, 2025
Chamath @ 01:08:01
Verdict: Inconclusive
Tags: ai, markets
Over time, both open-source and closed-source AI model providers will be forced by competition and open-source alternatives to drive their per‑token API pricing effectively to (near) zero above compute cost for large enterprise customers.
I think what Aaron is saying here, let me maybe try to frame it. I think what he's saying is there'll be open-source models, there'll be closed-source models. But the price that Aaron or me or anybody else pays these model makers will effectively go to zero.
Explanation

As of late November 2025, there is strong evidence of rapid price compression and intense competition, but not enough time or data to say Chamath's long‑run structural claim has clearly succeeded or failed.

What the prediction requires

  • He isn’t just saying prices will fall; he’s saying that for large enterprises, both closed and open‑source model providers will eventually price API usage at (effectively) compute cost, with near‑zero margin per token.
  • The phrase “over time” gives no concrete horizon (e.g., 2 years vs. 10 years), so it’s a long‑term industry-structure prediction.

Where pricing actually is in late 2025

  1. Closed‑source leaders still charge non‑trivial per‑token prices:

    • OpenAI’s public pricing for major models (e.g., o3, o4‑mini, gpt‑4o‑mini) remains in the roughly $0.15–$20 per million tokens range, depending on model and tier, well above zero and with clearly positive gross margins. (platform.openai.com)
    • Anthropic Claude 4.x models (Opus, Sonnet, Haiku 4.5) list at about $0.80–$15 per million input tokens and $4–$75 per million output tokens, again indicating substantial markups over raw compute. (claudelog.com)
    • Google’s Gemini API charges around $0.15 per million input tokens (with separate, higher output‑token pricing), not zero, even if cheaper than some rivals. (ai.google.dev)
  2. Gross margins show providers are still earning more than bare compute cost:

    • Industry analyses estimate OpenAI’s model APIs running at double‑digit to ~50% gross margins, and Anthropic in a similar ~50–60% range—far from “near‑zero” margin over compute. (getmonetizely.com)
    • That implies enterprises are still paying meaningfully above underlying GPU/TPU costs, even after volume discounts.
  3. Open‑source and low‑cost competitors are putting pressure on prices:

    • DeepSeek offers extremely low API prices (on the order of $0.55 per million input tokens and $2.19 per million output tokens as of early 2025) and repeatedly cuts prices 50–75% in “price war” moves, explicitly leveraging open‑source releases to undercut Western providers. (en.wikipedia.org)
    • These moves have triggered price cuts from OpenAI and Google and pushed the whole market towards cheaper tiers and smaller, more efficient models, aligning with the direction of Chamath’s thesis (competition and open‑source driving prices down).
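To make the "near‑zero margin over compute" test concrete, here is a minimal sketch of the arithmetic implied above. The list prices are illustrative midpoints of the figures cited in this section; the blended compute‑cost figure is a placeholder assumption, not a reported number, and real per‑token costs vary widely by model size and hardware.

```python
# Sketch: implied gross margin over compute cost for per-token API pricing.
# List prices are illustrative midpoints of the late-2025 figures cited above;
# ASSUMED_COMPUTE_COST_PER_MTOK is a hypothetical placeholder, not a reported cost.

ASSUMED_COMPUTE_COST_PER_MTOK = 0.40  # assumed blended $/1M tokens of raw compute

list_prices = {
    "closed_frontier": 10.00,  # rough midpoint of the ~$0.15-$20/MTok span cited
    "closed_budget": 0.60,     # low-tier closed-source pricing, illustrative
    "deepseek_input": 0.55,    # DeepSeek input price cited above
}

def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price: (price - cost) / price."""
    return (price - cost) / price

for name, price in list_prices.items():
    m = gross_margin(price, ASSUMED_COMPUTE_COST_PER_MTOK)
    # Chamath's end state: margin effectively zero above compute cost.
    near_zero = abs(m) < 0.05
    print(f"{name}: margin {m:.0%}, near-zero over compute: {near_zero}")
```

Under these assumed numbers, even the cheapest listed tier carries a double‑digit margin over compute, which is why the section treats the prediction's end state as not yet reached.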

Why this is still "too early" rather than clearly wrong

  • The current state: prices are falling, there’s a visible race to the bottom on lower‑tier models, and open‑source/China‑based providers have made very cheap inference widely available. That supports the trend he described.
  • But the end state he claimed—per‑token enterprise pricing effectively at compute cost across both closed and open providers—has not been reached: list prices and margin estimates show meaningful markups today, and none of the major Western vendors publicly commit to “at‑cost” enterprise tokens. (platform.openai.com)
  • Because he did not specify a timeframe, and we’re only ~11 months past the December 20, 2024 episode, the market could still evolve toward his asymptotic scenario (e.g., if vendors shift to primarily charging for dedicated capacity/SLAs while metered tokens trend toward cost).

Given the open‑ended timeline and the fact that pricing is clearly moving downward but is not yet at near‑zero margin, the fairest assessment as of late 2025 is **“inconclusive (too early)”** rather than definitively right or wrong.