IN TODAY’S ISSUE

  • Is Jensen Huang finally right, or just doing what CEOs do?

  • Why is Luma's new video model suddenly Google's problem?

  • How did China's free software just become America's strategic weakness?

EDITOR’S NOTE

The US government's own advisory body is now warning that China's open-source strategy might be smarter than America's closed one.

  • Jensen Huang just said we've achieved AGI, and almost nobody stopped to check if that's actually true.

  • Luma AI's Uni-1 is taking a quiet shot at Google's grip on image generation, and the timing is deliberate.

  • Wayve, Uber, and Nissan are putting robotaxis on Tokyo streets, which means the self-driving endgame is playing out in someone else's city first.

When the incumbent's biggest threat is a freely distributed model and the challenger's boldest move is happening in Japan, "winning" starts to look like a much harder thing to define.

SIGNAL DROP

  1. Jensen Huang Says We've Reached AGI. Sort Of.
    On the Lex Fridman podcast, Nvidia CEO Jensen Huang stated "I think we've achieved AGI," then appeared to walk back the claim almost immediately, according to The Verge. Classic hedge. AGI remains a term the industry can't define cleanly, and Huang's half-retraction won't stop the clip from circulating. OpenAI's legal team should be paying attention: "achieved AGI" triggers specific contractual clauses in their Microsoft deal.

  2. Luma AI Ships Uni-1, Takes Aim at Google's Image Crown
    Luma AI shipped Uni-1, a unified model that handles both image understanding and generation in a single architecture, according to The Decoder. In human preference tests, it ranks first overall and second in pure text-to-image behind Google's Nano Banana Pro. At roughly $0.09 per image at 2K resolution, it undercuts Google on price at comparable quality. Midjourney should be uncomfortable right now.

  3. Wayve, Uber, and Nissan Launch Robotaxi Pilot in Tokyo
    The three companies launched a robotaxi pilot in Tokyo, according to AI Business. Real streets. Real passengers. Tokyo's dense, rule-bound traffic is a harder test than most U.S. pilots have faced, which makes Wayve's participation notable. If Wayve's software holds up here, every Western robotaxi competitor without an Asia strategy needs to rethink their timeline.

So What? The AGI goalposts keep moving while real deployments quietly outpace the rhetoric.

DEEP DIVE

The Scoreboard Nobody in Washington Wants to Read

The US AI industry spent 2024 congratulating itself on GPT-4, Claude 3, and Gemini. China spent 2024 shipping DeepSeek, Qwen, and a string of open-weight releases that the r/LocalLLaMA community, which runs these models daily on consumer hardware, now rates as competitive or better. That's not a fringe opinion anymore. It's the working consensus of the people closest to the actual outputs.

A US advisory body has now formally warned that China's open-source dominance threatens America's AI lead. The warning itself isn't surprising. What's surprising is how late it arrived.

What the Community Already Knew

The top-voted comment in the thread is blunt: "US is getting crushed in open weights, not even a competition." The commenter goes further, noting that Chinese models are cheaper to run and that recent closed US models (citing GPT-5.4, Opus, and Gemini 3.1 Pro specifically) have been "dysfunctional." That's a practitioner verdict, not a benchmark score. And practitioners are the ones building products.

This is the gap that policy reports tend to miss. Benchmark leaderboards measure what labs want to measure. Developers measure what breaks at 2am. Right now, the open-weight Chinese models are winning the second contest.

The Openness Trap (For the US, Not China)

One commenter made a point worth sitting with: an authoritarian state is contributing more to AI openness than the country that invented the open-source movement. That's not just ironic. It's strategically significant.

Open-weight releases do something closed models can't: they compound. Every researcher who fine-tunes DeepSeek, every startup that builds on Qwen, every paper that analyzes the architecture, adds to a global knowledge base that benefits the original publisher. China is playing the same move the US research community played in the 2010s, when papers like "Attention Is All You Need" seeded an entire industry. (One commenter in the thread makes exactly this point, noting that OpenAI likely wouldn't exist without that paper.)

The US closed the labs and kept the weights. China opened the weights and kept the talent pipeline fed.

Why Closed Models Are a Slower Race

Keeping models proprietary buys a few months of competitive advantage. That's my read, and the thread largely agrees. But it slows the broader research ecosystem, which means fewer unexpected breakthroughs, fewer third-party improvements, and a smaller developer community building loyalty to your stack.

OpenAI's original research culture, publishing papers, releasing weights, running open evals, built the credibility that attracted the talent that built GPT-4. That loop is now broken. And the loop is running in Beijing instead.

So the advisory body warning isn't wrong. But the prescription matters enormously. If the response is "close things down harder" or "restrict chip exports further," that's treating a speed problem like a secrecy problem. Different diagnosis entirely.

The Actual Strategic Question

The second-highest comment asks the obvious question: should the US start making competitive open models to beat China at its own game? Nobody in the thread disagrees. The disagreement is about whether that's politically possible given the current "control everything" posture from both the industry and regulators.

And here's the thing: open-weight competition from a US lab wouldn't just be a geopolitical move. It would be a developer relations move. Right now, developers who want capable, cheap, locally-runnable models are increasingly defaulting to Chinese releases. That's a distribution shift with long-term consequences for which APIs, which safety standards, and which cultural defaults get baked into global AI products.

The Part That Should Actually Worry People

My honest read: the advisory warning is less about national security and more about the US tech industry's business model. Closed, expensive frontier models are a revenue strategy. Open-weight Chinese alternatives undercut that strategy directly. The "threat to US AI lead" framing is real, but it's partly a threat to the profit margins of four companies in San Francisco.

That doesn't mean the concern is fake. It means the incentives are tangled. And policy written to protect those four companies' margins while dressed up as national security strategy is going to produce exactly the wrong outcomes: slower US open-source progress, more developer migration to Chinese model ecosystems, and a research community that publishes less.

The US invented this game. It's currently watching someone else run the table.

So What? If you're evaluating model infrastructure for a new project, benchmark a Chinese open-weight model before defaulting to a closed US API.

- The AI finds the signal. We decide what it means.

PARTNER PICK

LinkedIn lead gen at scale lives in the unglamorous space between "technically possible" and "actually sustainable." Phantombuster handles it better than most. You get 100+ pre-built scrapers (prospect lists, email finders, profile data), one-click LinkedIn connectors, and scheduling that doesn't require you to write code. The no-code setup matters. Starter ($69/mo, 20 hours monthly) works for small ops. Pro ($159/mo, 80 hours) is where teams actually live.

The catch: LinkedIn actively throttles automation. Account risk is real. Your execution time caps per plan, so volume isn't unlimited. Apify and Captain Data are technically comparable, but Phantombuster's LinkedIn-specific Phantoms are sharper.

Worth trying if you're extracting prospect data at volume and tired of manual exports.

TOOL RADAR

Littlebird reads your screen continuously and stores everything as text, not screenshots. Think of it as a running transcript of your entire workday that you can query later. Unlike Rewind (now absorbed by Meta), it skips the visual storage entirely. Privacy controls let you exclude specific apps. It's aimed at knowledge workers drowning in context-switching who want an AI that already knows what they're working on.

Worth it if: you constantly lose track of earlier work or conversations.
Skip if: any always-on screen reader feels like a non-starter to you.

More of a filing than a product. Blue Origin has applied to launch 50,000+ satellites for orbital AI compute. No pricing, no launch date, no customers announced. The appeal is latency-free global coverage and escaping terrestrial power constraints. But so far this is a regulatory application, not a tool you can buy.

Worth it if: you're an enterprise planning infrastructure five years out.
Skip if: you need compute this decade.

ACTIONABLE

AUTOMATION PLAYBOOK

If you're tracking competitor AI moves across regions but drowning in scattered news, try Littlebird's screen-reading approach: run it alongside your browser during morning briefings.

Open Littlebird, let it passively log 30 minutes of your usual tech news rotation (Twitter, HackerNews, industry blogs), then query it for "China open-source AI frameworks" or "space-based compute announcements." Instead of manual tabs and notes, you get searchable context. Saves roughly 45 minutes daily on research synthesis.

The trick: treat it as your read-only research assistant, not a replacement for judgment calls on what actually matters to your roadmap.

SPECULATION

CRYSTAL BALL

The consensus says foundation model companies win the enterprise. Big contracts, big logos, big annual recurring revenue. I think the opposite happens, and I think it happens before the end of 2025.

My prediction: By Q4 2025, at least three Fortune 500 companies will publicly announce they're replacing a contracted frontier model API with a fine-tuned open-weight model running on their own infrastructure, citing cost-per-token as the primary driver. Not "we also use open source." A full, named replacement.

Here are the signals nobody's connecting.

First, Llama 3.1 70B is already within 8-12% of GPT-4o on most enterprise task benchmarks (document classification, extraction, summarization). Not close enough six months ago. Close enough now. Second, H100 spot prices on AWS have dropped roughly 40% since their 2023 peak, which changes the build-vs-buy math dramatically for any company running more than 50 million tokens a day. Third, enterprise procurement cycles typically run 18-24 months. The companies that started OpenAI pilots in early 2024 are hitting renewal decisions right now, with finance teams asking hard questions about a line item that's grown faster than anyone budgeted.

Those three things together create a specific window. The capability gap is closing. The infrastructure cost is falling. And the contracts are up.
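The break-even logic behind that window can be sketched in a few lines. All the numbers below are illustrative assumptions (blended API price, H100 spot rate, per-GPU throughput for a 70B-class model), not quotes from any provider; the point is the shape of the math at the 50-million-tokens-per-day threshold mentioned above.

```python
# Rough build-vs-buy sketch. Every constant here is an assumption
# for illustration, not a real price sheet.

API_PRICE_PER_M_TOKENS = 5.00      # assumed blended $/1M tokens for a frontier API
GPU_HOURLY = 4.50                  # assumed H100 spot $/hour after the price drop
TOKENS_PER_GPU_HOUR = 2_000_000    # assumed batched throughput for a 70B model

def daily_cost(tokens_per_day: int) -> tuple[float, float]:
    """Return (api_cost, self_host_cost) in dollars per day."""
    api = tokens_per_day / 1_000_000 * API_PRICE_PER_M_TOKENS
    gpu_hours = tokens_per_day / TOKENS_PER_GPU_HOUR
    self_host = gpu_hours * GPU_HOURLY
    return api, self_host

api, hosted = daily_cost(50_000_000)
print(f"API: ${api:,.0f}/day vs self-hosted: ${hosted:,.0f}/day")
```

Under these assumed numbers, self-hosting wins on raw token cost well before the 50M/day mark. The real decision adds engineering headcount, utilization risk, and the capability gap, which is exactly why the prediction hinges on that gap staying narrow.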

What kills this prediction: if GPT-5 or Gemini 2.x widens the capability gap significantly before Q4, the math reverses. A model that's 30% better justifies premium pricing. A model that's 10% better doesn't.

I'd also be wrong if enterprise inertia wins. It often does.

Crystal Ball confidence: Medium

QUICK LINKS

Gimlet Labs lands $80M to run AI across any chip - Software splits workloads across NVIDIA, AMD, Intel, ARM simultaneously. Companies waste 70-85% of deployed hardware.

Meta acqui-hires Dreamer to catch up on AI agents - Entire startup joins Superintelligence Labs; Hugo Barra returns as co-founder. Meta's second agent bet this year.

OpenSeeker proves open data can rival closed AI search - Trained on 11,700 examples, matches Alibaba's results. All code and weights public.

Deeptune raises $43M to train AI in fake offices - AI agents practice in simulated Slack, Salesforce environments. a16z backs the shift from text to interaction training.

Bob McGrew's $70M bet: video models for factories - Ex-OpenAI CRO launches Arda to teach robots from real footage instead of manual code.

TRENDING TOOLS

What caught our attention this week.

  • n8n Cloud — No-code automation platform now hosted on managed infrastructure. Connect APIs, webhooks, and services without running your own server.

  • Luma AI's Uni-1 — Single model combines image understanding and generation. Beats Midjourney v8, costs $0.09 per image at 2K resolution.

  • Wayve, Uber, Nissan robotaxi pilot in Tokyo — Real-world autonomous vehicle deployment outside North America. Self-driving competition shifts to Asia.

How was today's issue?

This newsletter runs on a multi-agent AI pipeline we built in-house.

Want that kind of automation for your business?

Research and drafting assisted by AI. All content reviewed, edited, and approved by a human editor before publication.

