Last updated Nov 29, 2025

New SEC Chair, Bitcoin, xAI Supercomputer, UnitedHealth CEO murder, with Gavin Baker & Joe Lonsdale

Sat, 07 Dec 2024 00:55:00 +0000
economy, climate
Within approximately 10 years from the time of this recording (by ~2034), China will have roughly four times (4x) the electricity production capacity of the United States.
and that ultimately leads to a situation where in ten years, we're going to be looking across the water at, you know, a competitive country that has 4x the electricity production capacity of our country
Explanation

The prediction has a horizon of ~10 years from the recording date of 7 Dec 2024, i.e., until about 2034. As of today (30 Nov 2025), fewer than 2 years have elapsed, so it is too early to say whether the forecast will prove right or wrong.

Current data show that China already has larger total power-generation capacity than the U.S., though not yet 4x. Various international energy statistics (e.g., from the IEA, the U.S. EIA, and other energy-data compilers) indicate that in recent years China’s installed capacity and electricity generation have been on the order of 2x the U.S. figures, depending on the year and on whether one measures installed capacity or actual generation. However, this describes the situation now, not in 2034.

Because the forecast explicitly concerns the situation around 2034, and that date has not yet occurred, the truth value of the prediction cannot be determined at this time. Therefore the status is inconclusive (too early).
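The gap between today's roughly 2x ratio and the predicted 4x can be made concrete with a quick compound-growth calculation. The figures below are rough assumptions for illustration, not sourced data:

```python
# Hedged illustration: if China's capacity is roughly 2x the U.S. today and
# the prediction calls for 4x by ~2034, the China/U.S. ratio must double
# over roughly nine years. The implied annual growth-rate gap:

years = 9            # ~2025 to ~2034 (assumption)
ratio_now = 2.0      # assumed China/U.S. capacity ratio today
ratio_target = 4.0   # ratio the prediction implies by ~2034

# Annual growth of the ratio needed: (target/now)^(1/years) - 1
implied_gap = (ratio_target / ratio_now) ** (1 / years) - 1
print(f"China's capacity must outgrow the U.S. by ~{implied_gap:.1%}/yr")
# → roughly 8.0% per year
```

In other words, under these assumptions China's capacity growth would need to exceed U.S. growth by about eight percentage points per year, every year, for the full decade.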

ai, tech, science
The performance of Grok 3, trained on xAI's ~100,000-GPU Colossus cluster, will provide a decisive empirical test within its first training run of whether existing AI training scaling laws continue to hold or are starting to break down.
Grok 3 is a big card and will resolve this question of whether or not we're hitting a wall.
Explanation

Grok 3 was in fact trained on xAI’s Colossus supercomputer at the ~100,000–200,000‑GPU scale (far beyond the ~20–32k‑GPU clusters that previous frontier models used), providing precisely the kind of new high‑compute data point Gavin Baker and Friedberg were talking about. (rdworldonline.com)

Multiple analysts explicitly framed Grok 3 as an experiment on scaling laws. Tencent/WallstreetCN’s post‑mortem describes Grok 3’s 100k+ H100 run as showing that pre‑training scaling laws “have not hit a ceiling” and still improve performance, albeit with poor cost‑effectiveness, concluding that “Scaling Law did not hit a wall.” (longbridge.com) Technical explainers and blogs similarly note that Grok 3’s performance gains over Grok 2, given roughly 10x more compute, are largely what standard scaling‑law curves predict: clear improvement, but not a dramatic new “generation” jump, indicating continued validity of scaling rather than a breakdown. (uxtigers.com)

Ethan Mollick characterized Grok 3 as the first release known to have used an order of magnitude more training compute than GPT‑4‑class models and said it would help test whether the first (pre‑training) scaling law still holds; after the results, he wrote that Grok 3 landed right at expectations, with no need to revise the consensus that more compute still buys more capability. (ethanbholland.com)

While some online discussions still debate whether we are near a “wall,” the empirical outcome from Grok 3 is broadly read as evidence that scaling laws remain intact at this larger scale rather than having already failed, which is exactly the kind of resolution Friedberg predicted Grok 3 would provide.
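The "gains largely match what scaling-law curves predict" reasoning can be sketched with a generic compute power law. The functional form and constants below are illustrative placeholders, not xAI's (or anyone's) actual fitted values:

```python
# Illustrative only: a generic compute power law, loss(C) = a * C**(-b).
# Both `a` and the exponent `b` are made-up placeholders; the point is that
# a 10x compute increase yields a predictable, incremental loss reduction.

def loss(compute, a=10.0, b=0.05):
    """Toy power-law loss as a function of training compute."""
    return a * compute ** (-b)

c = 1.0                              # Grok 2's compute, normalized to 1
gain = 1 - loss(10 * c) / loss(c)    # fractional loss reduction from 10x compute
print(f"10x compute -> ~{gain:.1%} lower loss under this toy law")
# → ~10.9% lower loss
```

This is why a 10x-compute model can be clearly better yet not feel like a "new generation": on a power law, each multiplicative jump in compute buys a modest, predictable slice of improvement.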

ai, tech
Regardless of whether current training scaling laws continue to hold, there will be at least 10 more years of significant AI innovation driven by other axes such as inference-time compute, context window expansion, and architectural improvements.
Even if scaling laws for training break, we have another decade of innovation ahead of us.
Explanation

The prediction’s time horizon is ~10 years from the podcast release date of 7 Dec 2024. Adding 10 years gives 2034 as the endpoint, which is well beyond the current date of 30 Nov 2025. Since fewer than two years have elapsed, we cannot yet assess whether there will indeed be “another decade of innovation ahead of us,” regardless of whether training scaling laws break. Therefore, the status of this prediction is inconclusive (too early to tell).

ai, tech
If current AI scaling laws continue to hold, xAI's Grok 3 model will surpass OpenAI/Microsoft’s best publicly available frontier model and become the state-of-the-art general-purpose LLM by January or February 2025.
Grok 3 should take the lead if scaling laws hold in January or February.
Explanation

By late February 2025, there was substantial but conflicting evidence about whether Grok 3 had truly become the state‑of‑the‑art general‑purpose LLM relative to OpenAI/Microsoft’s best publicly available frontier models.

Evidence that supports the prediction

  • xAI launched Grok 3 around February 17–20, 2025, describing it as their new flagship model trained with roughly 10× the compute of Grok 2 and claiming it surpasses OpenAI on benchmarks like AIME (math) and GPQA (PhD‑level science). (es.wikipedia.org)
  • xAI’s published benchmarks and coverage in outlets like Beebom and ZeroHedge report Grok 3 outperforming GPT‑4o, Claude 3.5 Sonnet, Gemini 2.0 Pro, and DeepSeek V3 on AIME 2024, GPQA Science, and LiveCodeBench, and also beating OpenAI’s o3‑mini on some reasoning benchmarks. (beebom.com)
  • Grok 3 (under the alias “chocolate”) reached the #1 position on the LMSYS Chatbot Arena with an Elo score around 1400–1402, ahead of GPT‑4o and DeepSeek R1, which xAI and supporters framed as evidence it was the top chatbot overall. (twitter.com)
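An Arena Elo lead of the kind cited above translates into head-to-head win probability via the standard Elo expected-score formula. The 30-point gap used here is a hypothetical example, not the actual Grok 3 vs. GPT‑4o margin at the time:

```python
# Hedged illustration of what a Chatbot Arena Elo lead means in practice.
# The 30-point delta below is a hypothetical example gap, not sourced data.

def elo_win_prob(delta):
    """Standard Elo expected score for a model rated `delta` points higher."""
    return 1 / (1 + 10 ** (-delta / 400))

print(f"A 30-point Elo lead -> {elo_win_prob(30):.1%} expected win rate")
# → about 54.3%
```

This is part of why leaderboard rank alone is weak evidence of being "state of the art": even a clear #1 position can correspond to winning only slightly more than half of head-to-head matchups.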

Evidence that cuts against the prediction

  • Independent commentary after launch emphasized that Grok 3 was competitive but not clearly dominant. Ethan Mollick (Wharton) described Grok 3 as a “very solid frontier model” but not a clear leader and “not one you would stop using your current frontier model for,” adding that while it beats some OpenAI models on selected benchmarks, it does not clearly surpass OpenAI’s o3. (aol.com)
  • Gary Marcus similarly argued that Elon Musk’s promise that Grok 3 would be “the smartest AI ever” was not borne out, calling the launch “no game changer” relative to OpenAI’s best models. (aol.com)
  • Comparative write‑ups on Grok 3 vs. OpenAI’s o3 report a mixed picture: Grok 3 slightly leads on some math benchmarks (e.g., AIME 2025 under heavy consensus sampling), while o3 (and related OpenAI reasoning models) lead on others, such as Codeforces coding Elo and certain software‑engineering tasks. These articles also note concerns that xAI’s benchmark setups (e.g., very expensive consensus sampling for Grok 3) aren’t perfectly comparable to how OpenAI models are typically evaluated, making it hard to declare an overall winner. (portotheme.com)

Why this is rated ambiguous

The prediction effectively claims that by January/February 2025 Grok 3 would surpass OpenAI/Microsoft’s best publicly available frontier model and be the state‑of‑the‑art general‑purpose LLM. By late February 2025, Grok 3:

  • Was clearly a top‑tier frontier model and #1 on one prominent user‑preference leaderboard (Chatbot Arena). (twitter.com)
  • But faced credible, well‑publicized expert assessments and technical comparisons saying it did not clearly surpass OpenAI’s leading models overall.

Because “state‑of‑the‑art general‑purpose LLM” is not defined by a single universally accepted metric, and high‑quality sources disagree—some framing Grok 3 as SOTA and others explicitly saying OpenAI remained ahead or at least not clearly behind—the outcome of the prediction cannot be determined in a definitive, objective way. Hence the result is best characterized as ambiguous, rather than clearly right or clearly wrong.

ai, economy
By roughly 2026–2027, training a single top-tier frontier AI model at the cutting edge will require on the order of US$100 billion in total training cost (hardware, energy, and associated infrastructure).
Particularly if it's going to cost $100 billion to train a model in 2 or 3 years, which I think is a realistic estimate.
Explanation

The prediction is about the future ("in 2 or 3 years" from a December 7, 2024 podcast, i.e., roughly 2026–2027). As of today (November 30, 2025), those years have not yet occurred, so we cannot definitively say whether a single top-tier frontier AI model will cost on the order of US$100 billion to train.

Public reporting to date on frontier model training costs (e.g., estimates for models like GPT‑4 and successor projects) generally puts total training expenditures in the range of hundreds of millions to low single-digit billions of dollars per model—not anywhere near $100 billion yet. These figures are based on industry analyses and journalistic estimates of GPU usage, cloud spending, and datacenter investments, and even very large announced AI infrastructure projects (e.g., multi‑tens‑of‑billions datacenters) are spread across multiple years and many models, not a single training run. However, these current numbers do not resolve whether by 2026–2027 costs for a single state‑of‑the‑art model will reach ~$100B.
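The scale of the claimed jump is easy to quantify. Taking a rough ~$1B per-model figure as the starting point (an assumption based on the public estimates above, not a sourced number), reaching $100B in "2 or 3 years" implies:

```python
# Hedged sketch: annual growth rate needed to go from today's reported
# per-model training costs to the predicted ~$100B. All figures are
# assumptions for illustration, not sourced data.

cost_now = 1e9       # ~$1B per frontier model (rough 2024-era estimate)
cost_pred = 100e9    # $100B, the cost the prediction projects
years = 2.5          # midpoint of "2 or 3 years"

growth = (cost_pred / cost_now) ** (1 / years) - 1
print(f"Implies per-model training costs growing ~{growth:.0%} per year")
# → ~531% per year
```

That is, per-model costs would have to roughly sextuple every year over the window, which gives a sense of how aggressive the forecast is relative to the reported baseline.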

Because the prediction’s time window has not yet arrived and events in those years have not yet proven or disproven the projected cost level, the status is inconclusive (too early to tell).

By sometime in 2025, natural human language (e.g., English prompts and instructions) will be the primary interface used by most developers and many non-developers to create and modify software, effectively making human language the dominant programming language in practice.
I think next year, the human language will be the dominant programming language.
Explanation

As of late 2025, surveys and industry data show that natural‑language prompting is a widely used assistant, not the primary interface for most developers, so Friedberg’s claim that “human language will be the dominant programming language” in 2025 has not materialized.

Stack Overflow’s 2025 Developer Survey reports that 84% of respondents use or plan to use AI tools, but many use them only occasionally, 46% explicitly distrust AI output, and about three‑quarters still prefer asking human colleagues over AI for help, indicating that AI (and thus natural‑language prompting) is supplementary rather than central. (techradar.com) Studies of GitHub Copilot and large enterprise deployments find that roughly 20–33% of accepted code lines come from AI, and that even at Microsoft AI contributes only around 30% of code in some projects, meaning most code is still written and edited directly in conventional languages like Python and JavaScript. (arxiv.org)

While 2025 has seen the rise of “vibe coding” and rhetoric that human language is becoming a new kind of programming language, these are consistently described as emerging trends, used mainly for prototyping or specific workflows and accompanied by significant concerns around security and reliability, not as the default way the majority of professional software is produced. (rollingai.news) Some industry analyses claim that around 41% of global code in 2024 was AI‑generated and that a large share of vibe‑coding users are non‑developers, but even these optimistic estimates imply that most code is still authored through traditional programming, and they apply to subsets of teams rather than “most developers” worldwide. (secondtalent.com)

Overall, natural‑language interfaces became important and fast‑growing in 2025, but the best available evidence shows they augment rather than replace conventional programming languages, so the prediction is best judged as wrong.