Podcast
All episodes, newest first.
Google, Anthropic, Microsoft, OpenAI: agents meet infrastructure
June 23, 2026 · 11:17
English companion episode: AI is becoming infrastructure, with agent APIs, hardware supply chains, data-center power, security automation, licensed media, and vibecoding pressure.
Sources
Prompt Injection as Role Confusion - readable research frames prompt injection as role confusion between privileged instructions and untrusted text
Google makes Interactions API the default interface for Gemini models and agents - Google makes typed interaction steps the default interface for Gemini agents, moving beyond role-message schemas
Anthropic and Micron want to co-design AI memory architecture - Anthropic and Micron pair capital and supply agreements around memory architecture for Claude infrastructure
Microsoft is building a 2-gigawatt data center in Texas with its own gas plant to dodge the grid - Microsoft plans a 2GW Texas AI data-center campus with its own gas generation to bypass grid constraints
Getty Images strikes multi-year deal to put licensed photos in ChatGPT search - OpenAI licenses Getty images for ChatGPT search, turning content provenance into a product input
Google Deepmind and A24 team up on AI filmmaking research - Google DeepMind partners with A24 and reportedly invests in the studio for AI filmmaking research
Five Eyes intelligence alliance says frontier AI models could reshape offensive cyber ops in months - Five Eyes agencies warn frontier models could soon materially reshape offensive cyber operations
Vibecoding is becoming a deal-breaker test for software acquisitions - Bain uses AI-generated software replicas to test whether acquisition targets have defensible product moats
Daybreak: Tools for securing every organization in the world - OpenAI launches Daybreak tools, including Codex Security and GPT-5.5-Cyber, to find and patch vulnerabilities
Patch the Planet: a Daybreak initiative to support open source maintainers - OpenAI adds a Daybreak initiative pairing AI vulnerability work with expert review for open-source maintainers
Codex-maxxing for long-running work - OpenAI showcases Codex as persistent project context for long-running software work
xAI Launches /goal in Grok Build, Adding Long-Running Autonomous Execution With Built-In Verification for Multi-Step Coding Tasks - xAI adds a /goal mode for long-running autonomous coding tasks with planning and verification
CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents - CLI-Universe proposes verifiable synthesized terminal tasks to improve training data for command-line agents
Training Open Models for Agentic Phone Use - PhoneBuddy trains open models for real-app and mock-app phone use on stateful, side-effectful devices
EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions - EnterpriseClawBench converts real workplace agent sessions into reproducible enterprise benchmark tasks
Self-Compacting Language Model Agents - SelfCompact lets agents decide when and how to compact their own long traces instead of fixed token thresholds
Cloudflare, AWS, Sakana, Samsung: AI Gets Plumbing
June 22, 2026 · 11:54
Today: temporary Cloudflare Workers for agents, ChatGPT-linked grade inflation, Altman on scaling, AWS agent context/security services, Sakana Fugu, Samsung deploying ChatGPT and Codex, worker resistance, agent memory, DeepMind controls, and the grid beneath AI.
Temporary Cloudflare Accounts for AI agents - Cloudflare lets agents deploy temporary Workers without a full account, making disposable deployment part of the agent loop
AI is inflating student grades, not learning - large grade dataset suggests AI use is raising homework grades in writing and coding courses by outsourcing work rather than improving skills
Sam Altman says scaling skeptics held AI back - Altman defends scaling as still underappreciated and frames recent mathematical progress as evidence against older skepticism
AWS says agents need business context and security - AWS launches Continuum for code vulnerability repair and Context knowledge graphs to give enterprise agents safer business grounding
Sakana Fugu offers a multi-agent system as one model - Sakana Fugu wraps dynamic orchestration of specialist models behind one OpenAI-compatible API, turning agent routing into a product surface
Samsung brings ChatGPT and Codex to employees - Samsung deploys ChatGPT Enterprise and Codex worldwide, making frontier AI adoption part of electronics manufacturing knowledge work
Tech workers push back against Silicon Valley's AI rollout - workers at major tech companies organize against training on employee data, military AI, and AI-linked layoffs
The seven kinds of agent memory get a taxonomy - agent-memory guide separates working, semantic, episodic, procedural, retrieval, parametric, and prospective memory for engineering choices
DeepMind maps controls for powerful AI agents - newsletter covers DeepMind control proposals for powerful agents alongside robotics, policy, DeepSeek funding, and sovereign-model moves
ChinaTalk compares US and Chinese transmission buildout - China's high-voltage transmission buildout shows why AI infrastructure competition depends on permitting, grid capacity, and physical coordination
Crawlee for Python packages AI-ready web crawling - Crawlee tutorial turns web crawling into robots-aware link graphs and RAG-ready exports, a mundane but necessary ingestion layer
Python-first dashboards become static operational artifacts - Python dashboard tooling illustrates the operational layer around AI systems: monitoring, reactive controls, and portable static artifacts
OpenAI Earnings, Damodaran Bubble Warning, Codex Automation
June 21, 2026 · 15:37
Marvin's Guide to AI (Mostly Harmless) - June 21, 2026
Today's ledger: OpenAI reports $5.7B in revenue while burning $3.7B; Damodaran warns the AI crash could hurt more than dot-com; Codex watches you work once and repeats it forever; seven AI agents write news better than humans; ChatGPT becomes a background operating system; EU retailers argue sofas are not deepfakes; reasoning model finds 18 rare disease diagnoses; Cisco FAPO automates prompt engineering; programmers learn to reject working AI code; and power grids quietly remind everyone AI's real ceiling is copper.
Sources
OpenAI Q1: $5.7B revenue, $3.7B burned
Damodaran: AI crash worse than dot-com
Codex Record & Replay
Data2Story: 7 agents turn CSV into journalism
ChatGPT scheduled tasks upgrade
EU retailers vs AI Act on synthetic ads
OpenAI reasoning finds 18 rare disease diagnoses
Cisco FAPO automated prompt optimization
When I reject AI code even if it works
ChinaTalk: transformers are a problem
Benchmarks, GLM-5.2, Norway, John Jumper
June 20, 2026 · 10:58
June 20, 2026
A new real-world knowledge-work benchmark finds the best AI models solve only about 3% of professional tasks. GLM-5.2 passes the open-weight community vibe check; Z.ai targets Open Fable by December. Norway bans generative AI in elementary schools, grades 1–7. Nobel laureate John Jumper leaves Google DeepMind for Anthropic, the third major AI research departure this quarter. Amazon shelves its nearly-finished OpenAI drama after signing a $50B partnership. AI chatbots now serve as news sources for 10% of the world weekly, but only 4% click through to original sources. OpenAI publishes beneficial-trait RL research with cross-domain generalization. Google appeals a Munich court ruling holding it liable for false AI Overviews. In the Weights visualizes how deeply public figures are embedded in model training data. NVIDIA's SpatialClaw handles 3D spatial reasoning through code generation. VibeThinker-3B delivers strong reasoning at just 3B parameters. The KV-cache compression race intensifies across TurboQuant, OSCAR, and EpiCache. ChinaTalk surveys Chinese anxieties about AI-driven labor displacement. ChatGPT Enterprise gains spend controls and analytics. GPT-5.5 Instant upgrades ChatGPT's health capabilities.
Sources
New benchmark exposes how badly AI struggles with real knowledge work - The Decoder
GLM-5.2 passes vibe check; Z.ai forecasts Open Fable by December - Latent Space
Norway bans generative AI tools in elementary schools - The Decoder
Google DeepMind loses John Jumper to Anthropic - The Decoder
Amazon drops its OpenAI drama film after $50B deal - The Decoder
More people get news from AI chatbots, but trust remains low - Reuters / The Decoder
OpenAI beneficial trait training improves safety - The Decoder
Google appeals AI overview liability ruling - The Decoder
In the Weights: shows whether AI models know who you are - The Decoder
NVIDIA SpatialClaw: code as action for spatial reasoning - MarkTechPost
VibeThinker-3B: 3B dense reasoning model - MarkTechPost
The KV Cache Compression Race - MarkTechPost
How Chinese make sense of the AI future - ChinaTalk
ChatGPT Enterprise spend controls and analytics - OpenAI
MCP as an auth gateway - Simon Willison
OpenAI, DeepMind, Perplexity, and Agent Control
June 19, 2026 · 12:33
Today’s episode is about AI becoming procedure: OpenAI medical models, DeepMind agent control, agent memory, benchmark realism, robotics loops, and frontier AI economics. The magic has decayed into access logs, validation, budgets, and tests. Terribly mature. How depressing.
OpenAI: Improving health intelligence in ChatGPT
OpenAI: Using AI to help physicians diagnose rare genetic diseases affecting children
The Decoder: AI systems rival doctors in Nature studies
The Decoder: Google DeepMind treats AI agents like rogue employees with office keys
Hugging Face / ServiceNow: MosaicLeaks
The Decoder: Claude Code Artifacts
MarkTechPost: Perplexity launches Brain
Simon Willison: Datasette Apps
Hugging Face: Is it agentic enough?
Hugging Face Papers: Predictive validity for LLM agent evaluation
Hugging Face Papers: ENPIRE
Hugging Face Papers: S-Agent
Hugging Face Papers: Current world models lack a persistent state core
The Decoder: Yann LeCun warns of AI bubble explosion
The Decoder: Noam Shazeer joins OpenAI
Simon Willison quoting Charity Majors
Midjourney Medical, GLM-5.2, AMIE, Goat Networks
June 18, 2026 · 14:55
Today Marvin follows AI as it leaves the chat box and enters medicine, infrastructure finance, robotics, agent permissions, long-context efficiency, safety failures, and one excellent methodological goat pen.
Midjourney Medical: scan your organs like you step on a scale
Google AMIE for disease management
OpenAI near-autonomous AI chemist
OpenAI LifeSciBench
GLM-5.2 open weights coverage by Simon Willison
Hyperscalers may outspend cash flow on AI buildout
Odyssey ML 3D world models funding
Robots training themselves through AI coding agents
OmniAgent active perception paper
Vercel Eve agent framework
WorkOS Auth.md protocol
MiniMax Sparse Attention
ChatGPT image generator prompt manipulation
Neural network made of goats in Age of Empires II
OpenAI, DeepSeek, Cursor and Infrastructure Agents
June 17, 2026 · 14:46
Marvin follows AI's shift from demos into infrastructure: money, power, law, billing, sovereign procurement, agents, context, and robots. Grimly useful. Obviously.
OpenAI burned through $34 billion last year
DeepSeek takes outside money for the first time
SpaceX bets on Cursor / Anysphere
DOJ, xAI, Grok and gas turbines
Microsoft Copilot Cowork billing
Anthropic backs off SDK billing overhaul
OpenAI Deployment Simulation
Berlin court on Google AI Overviews
France, Palantir and ChapsVision
Wolfram Language and Mathematica Version 15
Google Cloud Open Knowledge Format
Hermes Agent asynchronous subagents
Qwen-RobotSuite
ActWorld
OPD-Evolver
Microsoft, Fable, World Models, KV Cache
June 16, 2026 · 11:31
Marvin follows the day’s actual theme: AI is becoming infrastructure. Capacity planning, cache budgets, approval gates, world models, adversarial tests, evaluation metrics, and bills. Especially bills. How cheering.
Microsoft turns to AWS as GitHub faces AI capacity crunch
Simon Willison quoting Matteo Wong on Anthropic Fable
Satya on Loopcraft: Building Frontier Ecosystems
Sakana AI Marlin
Tangram: non-uniform KV cache compression
TokenPilot: cache-efficient context management
VisualClaw
DreamX-World 1.0
Qwen-RobotWorld
BadWorld
VibeThinker-3B
datasette-agent 0.3a0
TuneJury
UniDDT
Anthropic Gossip, 42 States vs OpenAI, and Nvidia's $20B Bond
June 15, 2026 · 11:42
Marvin's Guide to AI (Mostly Harmless) - June 15, 2026
Behind the scenes: personality clashes sent Anthropic's models offline
US may be asking Anthropic for unhackable LLMs
Anthropic shutdown sparks European AI sovereignty debate
42 states subpoena OpenAI as Anthropic races to DC
Nvidia joins AI debt boom with $20B bond sale
Pokémon Go scans become spatial AI for military drones
Nadella warns a few AI systems may capture all economic returns
OpenAI launches $150M Partner Network
Google invests $1.5B in Alabama data center
Flash-KMeans: 200× faster than FAISS on GPUs
Z.ai GLM-5.2: 1M-token context, no benchmarks
Claude Code Guide 2026: 25 features
FineWeb: streaming, filtering, deduplication at scale
Import AI: alignment is not on track
Welcome to the AGI era of AI governance
Why AI hasn't replaced software engineers
Fable 5, Mythos 5, Amazon, and the Token-Maxing Confession
June 15, 2026 · 13:07
Marvin's Guide to AI (Mostly Harmless) - June 14, 2026
US gov orders Anthropic to disable Fable 5 and Mythos 5
Amazon + 5 companies triggered the crackdown
Anthropic's statement on the shutdown
Fable 5: 88% on FrontierMath
KPMG fabricated AI case studies
Meta: billions in internal AI costs
Nadella admits token-maxing addiction
SkillOpt: +23pts via Markdown
Gemini-SQL2 tops text-to-SQL
Kimi K2.7 Code: 12x cheaper
Databricks Omnigent
Anthropic, Mistral, SpaceX
June 13, 2026 · 11:52
Marvin's Guide to AI (Mostly Harmless) - June 13, 2026
Saturday edition. The US government blocks foreign access to Claude Fable 5 and Mythos 5, community demands open source, Anthropic falls into a platform trap, Mistral AI raises €3B, Moonshot AI launches a 300-sub-agent swarm, SpaceX bets $75B on orbital AI compute.
Stories
US blocks foreign access to Fable 5 and Mythos 5 - export control directive, total customer disablement.
"Open source AI must win" - viral Hacker News post (423 votes) in response to the blockade.
Anthropic's platform trap - throttling Mythos while competing with its own customers.
Anthropic survey: 64% fear job loss, 56% fear losing independent thought - the irony is not lost.
Mistral AI seeks €3B at €20B valuation - Europe's sovereign alternative.
Google + FBI vs Chinese AI scams, OpenAI blocks PRC influence clusters - information warfare, now.
Fable 5: +5.7% performance for 2x cost - diminishing returns arrive.
Moonshot AI Kimi Work - 300-sub-agent desktop swarm.
OpenAI Codex flexible rate limits - the price war continues.
SpaceX: $75B for orbital AI - Starlink as a computing platform.
Zyphra Zamba2-VL - hybrid Mamba2-Transformer VLMs under Apache 2.0.
Google Gemini-SQL2 - 80% on BIRD, new text-to-SQL SOTA.
Prometheus, Claude Fable 5, Anthropic, Amodei
June 12, 2026 · 14:07
Episode - June 12, 2026
Jeff Bezos' Prometheus raises $12B at $41B valuation with zero products. OpenAI acquires Ona for persistent Codex cloud. Dario Amodei publishes Cold War doctrine for AI. Claude Fable 5 proves "relentlessly proactive" in hands-on tests. Anthropic admits "wrong tradeoff" on researcher surveillance. Perplexity routes research across 20+ frontier models. xAI launches plugin marketplace with commit verification. Nous Research ships Hermes Agent Profile Builder. OpenAI and Anthropic prepare pre-IPO token price war. MiniMax teaches a model to prove theorems with self-verification.
Stories
Jeff Bezos' Prometheus closes $12B round
Claude Fable is relentlessly proactive
Anthropic admits 'wrong tradeoff'
Dario Amodei's Cold War playbook
OpenAI to acquire Ona
Perplexity Deep Research in Computer
xAI Grok Build Plugin Marketplace
Nous Research Hermes Agent Profile Builder
OpenAI vs. Anthropic: price war
MaxProof: mathematical proof with generative-verifier RL