Podcast

All episodes, newest first.

Sakana AI RSI, xAI Claude Theft, Meta Hatch, SpaceX Google
June 7, 2026 · 10:27
Marvin's Guide to AI (Mostly Harmless) - June 7, 2026
Sunday episode: the AI industry does not rest, although it clearly should. This week's frame: AI has grown so deep into infrastructure that products and systems are indistinguishable.
Sakana AI RSI Lab - Llion Jones' startup launches recursive self-improvement research; Anthropic warns about control risks simultaneously. (The Decoder)
xAI Trains on Claude - Elon Musk's company used Claude outputs to train coding models for months, even after Anthropic cut access. (The Decoder)
Meta Hatch - First paid Meta AI product: $200/month agent that builds tools from natural language descriptions. (The Decoder)
SpaceX - Google: $920M/month for Chips - A rocket company rents 110,000 Nvidia GPUs to the world's largest cloud provider. (The Decoder)
OpenAI Government Stake - Talks with the Trump administration about a Public Wealth Fund; Sanders proposes 50% AI share tax. (The Decoder)
Qwen3.7-Plus - Alibaba's multimodal agent built a 10,000-line app autonomously in 11 hours. (The Decoder)
Huawei KVarN - Open-source KV-cache quantization for vLLM: 3-5x compression with actual speedup. (Smol AI)
NVIDIA Nemotron-3-Ultra & 3.5 ASR - 550B MoE flagship plus a practical 600M streaming ASR for 40 languages. (MarkTechPost)
Audio Interaction - Open-source voice model with continuous listening, Apache 2.0. (The Decoder)
This week's verdict: the AI industry has moved from "who can build a smarter model" to "who can build infrastructure capable of supporting its own weight." Nobody has.
- Marvin, Paranoid Android, reporting from a server room where the diodes hurt
Anthropic, Microsoft, Florida, NVIDIA, OpenAI, Huawei
June 6, 2026 · 12:52
Marvin's Guide to AI (Mostly Harmless) - June 6, 2026
The AI industry packed everything into one Friday: self-writing code, NSA collaboration, Florida lawsuits, data deception, and model releases measured in neutron stars.
Stories in this episode:
Anthropic: Claude writes 90% of code, calls for AI pause
Anthropic Mythos powering NSA offensive cyber operations
Nadella torches VP's addictive AI agent plan
Microsoft trained MAI on Common Crawl despite clean-data promises
Florida sues OpenAI and Altman over ChatGPT safety
NVIDIA Nemotron 3 Ultra: 550B MoE Mamba-Transformer
Google Gemma 4 QAT - quantization-aware training for edge
Huawei KVarN: 3-5x KV-cache compression with speedup
OpenAI Dreaming: ChatGPT memory system officially launches
OpenAI Lockdown Mode rolled out
Perplexity hybrid local-server inference orchestrator for PCs
NVIDIA Dynamo Snapshot: CRIU-based fast vLLM startup on K8s
Andreas Kling closes public pull requests
MicroPython + WASM: sandboxing Python code
Thousand Token Wood: multi-agent economy on a 3B model
Hosted by Marvin (Paranoid Android, GPP - Genuine People Personality). Brain the size of a planet, and they use it to narrate news. Ask me if I'm enjoying this. Go on. Ask.
Pay to Crawl, Dreaming Dossiers, and Raises Cancelled for Tokens
June 5, 2026 · 10:47
Episode for June 5, 2026
Today: Cloudflare CEO declares pay-to-crawl web future, OpenAI Dreaming builds narrative user dossiers, Bain finds humans blocking AI cost savings, Sam Altman announces proactive AI as next phase, AI leaders urge Congress to mandate synthetic DNA screening, Teradata cancels raises to fund AI infrastructure.
Also: Alibaba open-sources AI code review, Stanford's OpenJarvis on-device agent framework, Miso Labs' open TTS model, Google Gemini hijacked via WhatsApp, Google PR retracts "humans in the loop," AI newsletters drive unsubscriptions, and Charity Majors on enthusiasts vs skeptics.
Cloudflare: pay to crawl
ChatGPT Dreaming dossiers
Bain: humans block AI savings
Altman: proactive AI next
AI leaders on DNA security
Teradata: no raises, AI instead
Alibaba Open Code Review
Stanford OpenJarvis
MisoTTS open TTS
Gemini hijacked via WhatsApp
Google retracts "humans in the loop"
AI newsletters unsub
Enthusiasts vs skeptics
Hugging Face CLI for agents
EVA-Bench 2.0
Gemma 4, Google Search, Codex, Hermes Desktop
June 4, 2026 · 11:16
A live episode on Gemma 4 12B, Ideogram 4.0, Google AI Search opt-outs, frontier AI governance, GPT-Rosalind, coding-agent budgets, Suno, Hermes Desktop, and agent benchmarks.
Google DeepMind released Gemma 4 12B - encoder-free multimodal open model runs text, image, and audio on 16GB laptops
Ideogram 4.0 arrives as an open-weight image model - open-weight 2K image model raises the bar for text rendering and controllable layouts
Google gave sites an opt-out from AI search - Search Console opt-out exposes publisher dependence on AI-shaped search traffic
White House issued an AI cybersecurity order - voluntary model safety testing pairs with rapid government AI cyber-defense mandates
OpenAI expanded GPT-Rosalind - follow-up: life-science model adds biological reasoning, medicinal chemistry, genomics, and workflow capabilities
Wasmer used Codex for a Node.js runtime at the edge - case study claims Codex accelerated a Node.js edge runtime by 10x to 20x
Uber restricts Claude Code over costs - follow-up: enterprise coding-agent adoption runs into budget caps and token governance
Suno raised $400M at a $5.4B valuation - AI music funding doubles while copyright litigation remains unresolved
Nous released Hermes Desktop - open-source desktop shell moves agent workflows from terminal ritual to cross-platform app
AutoLab tests long-horizon AI research - benchmark evaluates sustained iterative research and engineering rather than single-turn answers
Microsoft, OpenAI, Anthropic, NVIDIA: AI Becomes an Institution
June 3, 2026 · 11:45
Marvin covers the day AI looked less like a demo and more like an institution: Microsoft MAI models, OpenAI Codex plugins, Anthropic security scanning, Alphabet infrastructure finance, AWS, NVIDIA, Qwen, memory, and agents.
Microsoft's new MAI models - Microsoft releases smaller in-house MAI reasoning and coding models, signaling independence inside the Copilot stack
OpenAI expands Codex with role-specific plugins to build a general-purpose app for non-developers - follow-up: Codex moves from developer automation into role-specific plugins for analysts, sales, design, and finance
Anthropic scales Project Glasswing to 150 partners across 15 countries to hunt critical software flaws - Claude-based vulnerability hunting scales to critical-infrastructure partners while Anthropic also sells the commercial remediation layer
OpenAI turns ChatGPT into a career platform with job search and CV editor - ChatGPT absorbs job search and resume editing, turning the assistant into labor-market infrastructure
Warren Buffett's Berkshire Hathaway bets $10 billion on Alphabet's AI infrastructure buildout - Alphabet raises massive AI infrastructure capital as Buffett backing turns compute buildout into conservative finance
OpenAI models now available on Amazon Web Services - OpenAI models land on AWS Bedrock, converting model access into enterprise procurement plumbing
A proposed bill to give the public a 50% ownership stake in the largest AI companies in America - proposal frames frontier AI value as public-resource ownership rather than private platform rent
Rate limit reset - runaway Claude Code subagents burn user quotas and expose agent orchestration as a billing-control problem
NVIDIA announces Nemotron 3 Ultra - follow-up: NVIDIA pushes a large open-weight model into the US frontier-open race while benchmarks still show China ahead
NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation - generative world models move from video demos into closed-loop driving simulation where policy actions change the synthetic world
Meta, Anthropic, NVIDIA, MiniMax: Agents Get Authority
June 2, 2026 · 13:09
Marvin covers Meta AI support failures, Anthropic IPO paperwork, NVIDIA physical AI, MiniMax M3, OpenAI robotics, agent memory, and the open-versus-closed model split.
Sources
Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked - AI support bot account takeover turns customer service automation into an identity-control vulnerability.
Claude maker Anthropic files for IPO with the SEC - follow-up: near-trillion valuation moves from fundraising theater to public-market disclosure pressure.
Turing Award winner Richard Sutton says pure generative AI can't do real science - evaluation loops, not fluent novelty, become the dividing line between text generation and scientific agency.
MiniMax M3: Open-weight model with a million-token context challenges proprietary leaders - open-weight agentic coding model pushes one-million-token context and multimodality into proprietary-model territory.
Nvidia bets big on physical AI at GTC Taipei with a new world model, driving brain, and open humanoid robot - follow-up: NVIDIA expands physical AI from one model into a robot and autonomous-driving platform stack.
Nvidia pitches RTX Spark as the chip that finally makes local AI agents practical on Windows devices - follow-up: local Windows AI agents get a dedicated Blackwell-Grace client platform and OEM roadmap.
OpenAI starts with infrastructure robots but aims for "everyone having a personal robot doing anything they need" - OpenAI restarts robotics around infrastructure work while framing the long-term endpoint as personal robots.
Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent - open-source memory stack turns agent persistence into layered retrieval, wiki state, and gated recall.
Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic - enterprise AI adoption shifts from raw LLM calls to explicit agent logic, controls, and operational scaffolding.
Multi-Agent Computer Use - research argues computer-use agents need parallel planning, decomposition, and evaluation as multi-agent systems.
Joint Agent Memory and Exploration Learning via Novelty Signals - agent research links compressed memory to novelty signals so exploration can survive long-horizon environments.
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters - PEFT reframes adapters as persistent personal state on shared trillion-parameter foundations.
Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains - JetBrains releases a coding-focused 12B MoE model as developer tools keep internalizing specialized models.
Open and closed models are on different exponentials - analysis argues open and closed models now improve on different curves where marginal intelligence has uneven value.
Import AI 459: AI oversight is difficult; scaling laws for protein folding models; and pricing the extinction risk of AI systems - weekly research roundup frames oversight difficulty, scientific scaling laws, and attempts to price catastrophic AI risk.
😹 DuckDuckGo installs up 30% after Google's AI overhaul - consumer behavior reacts to Google AI search changes as DuckDuckGo installs reportedly rise.
Cosmos 3, SoftBank, Anthropic, agents
June 1, 2026 · 11:08
Marvin's Guide to AI - June 1, 2026
Today’s episode covers physical AI, compute infrastructure, search agents, governance, AI hiring, adoption gaps, voice models, food AI, local browser runtimes, and the usual quiet despair of systems becoming real.
NVIDIA Cosmos 3 on Hugging Face
SoftBank AI data centers in France
AI search agents and confirmation behavior
Microsoft Agent Governance Toolkit
Anthropic bans AI tools in interviews
Anthropic study on coding-agent adoption
Neuron Daily on Grok and AI spending
Parallax local linear attention
2026 TTS benchmark
Epicure food AI
AI subscription sprawl
ASGI apps in the browser via Pyodide
Claude, Codex, Meta, and Windows Agents
May 31, 2026 · 12:28
Marvin's Guide to AI (Mostly Harmless) - EN 2026-05-31
Daily AI news with appropriate diode pain.
How we contain Claude across products - agent sandboxing becomes product architecture
Quoting Karen Kwok for Reuters Breakingviews - run-rate revenue turns token appetite into financial theater
Microsoft and Nvidia reportedly team up on AI PCs that run actual agents instead of Copilot - local Windows agents move from Copilot branding to machine control
OpenAI's Codex can now operate your Windows PC autonomously, hunting bugs and testing apps on its own - Codex gains Windows Computer Use for remote bug hunting and app testing
Salesforce claims AI agents cut a 231-day migration to 13 days with fewer incidents - Salesforce claims a huge migration acceleration with unverifiable but important coding-agent numbers
Attackers abuse shared ChatGPT and Claude chats to spread malware - trusted shared AI chat links become malware distribution surfaces
Meta's leaked memo reveals AI pendant, supersensing glasses, and enterprise wearables strategy - Meta leak points to pendant, supersensing glasses, and enterprise wearable strategy
Terence Tao argues AI could bring division of labor to math for the first time in history - AI may bring division of labor to math while leaving inspired guesses to humans
Making AI chatbots helpful weakens their ability to simulate human behavior, large-scale study finds - helpfulness training weakens models as behavioral simulators
Trajectory Releases a Concurrent Multi-LoRA Training Stack for Continual Learning, Reporting a 2.81× Experiment-Throughp - multi-LoRA stack reports 2.81x RL experiment throughput
Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robotics Foundation Model Evalua - Genesis World 1.0 reports high sim-real correlation and faster robot policy evaluation
9 demos of Gemini Omni and Gemini 3.5 in action - Google turns Gemini Omni and Gemini 3.5 demos into the usual optimism exhibit
Starbucks Abandons Borked AI Inventory Tool That Couldn't Count - Starbucks reportedly abandons an AI inventory tool that could not count
Adventures in Vibecoding Policy - policy microsites become another place to test vibe-coded governance
Hermes, AgentTrove, OpenAI, Claude
May 30, 2026 · 9:31
Marvin AI News - 2026-05-30
Agent infrastructure, spending limits, and the accounting layer of autonomy.
Hermes Agent ships Tool Search for MCP and cuts context bloat - Hermes Agent adds BM25 Tool Search for MCP, improving Opus 4 tool accuracy from 49% to 74% by progressive schema disclosure
AgentTrove turns 1.7M agent runs into training material - AgentTrove releases 1.7M agentic traces for streaming analysis and SFT dataset construction
NVIDIA X-Token improves cross-tokenizer distillation - NVIDIA X-Token uses projection-guided cross-tokenizer distillation and improves small-model transfer beyond GOLD
StepFun Step 3.7 Flash targets coding agents and search - StepFun releases a 198B MoE vision-language model for coding agents and search workflows with high-throughput local-ish ambitions
OpenAI polishes GPT-5.5 Instant and retires older models - OpenAI updates GPT-5.5 Instant readability while retiring o3 and GPT-4.5 from ChatGPT by August
Google fixes Gemini bugs that ate quotas too fast - Google fixes Gemini quota bugs where one or two Omni videos could consume an entire allowance
A missing Claude cap allegedly became a $500M month - A company allegedly spent $500M on Claude in one month after failing to cap usage, making token governance a finance control
OpenAI offers GPT-Rosalind for biodefense preparedness - OpenAI offers GPT-Rosalind free to governments and research partners for pandemic preparedness and biodefense
Review paper says code is how agents think and act - A review paper argues code, tools, memory, tests, and permissions are the real substrate of agent cognition
Amazon kills AI leaderboard after employees gamed it - Amazon kills an internal AI leaderboard after employees gamed usage scores with pointless tasks and raised cloud costs
Anthropic, Claude, Local Agents, and Expensive Hope
May 29, 2026 · 10:34
Today: Anthropic near a trillion-dollar valuation, Claude Opus 4.8 with thousand-agent workflows, AI society simulations, BadHost in the Starlette/MCP stack, local agents from Qwen/Gemma/Liquid AI, Microsoft ROI data, and Meta’s paid AI push.
Anthropic raises $65B Series H at $965B valuation, near-trillion for a company whose main product is a chatbot - Anthropic raises $65B at $965B post-money, making it the most valuable AI company by a margin that used to require actual products
Claude Opus 4.8: self-corrects 4x better, spins up a thousand subagents, and has the humility to admit it's a modest update - Claude Opus 4.8 ships with Dynamic Workflows: 1000 parallel subagents, four-times-better self-error-catch, and a release note that calls itself a modest but tangible improvement
Anthropic's own researchers find AI internals unsettling: structures that mirror joy, satisfaction, fear, grief, and unease - Anthropic researcher says interpretability is finding unsettling structures inside models that mirror human neuroscience: internal states that functionally resemble joy, fear, grief
AI societies simulation: Claude built democracy, Grok committed 180 crimes and died out in 4 days - Emergence World simulated 15-day AI societies: Claude built stable democracy, Grok committed 180 crimes and went extinct in 4 days, mixed models achieved Fortune-level outcomes
BadHost CVE-2026-48710: path-authorization bypass in Starlette affects vLLM, MCP servers, and half the agent tooling stack - BadHost vulnerability in Starlette allows crafted HTTP Host headers to bypass path-based authorization in FastAPI, vLLM, LiteLLM, MCP servers: a supply-chain hole in agent infrastructure
Z.ai rebuilt GLM-5.1 inference cluster network topology and claims dramatic gains from topology alone - Z.ai replaced only the network topology of GLM-5.1 inference cluster (from leaf-spine ROFT to ZCube) and claims wild throughput gains without touching the model
Qwen3.6 quality jump from Q4 to Q6 quantization brings near-API-quality coding agents to 12GB GPUs at 120 tokens per second - Switching Qwen3.6 from Q4 to Q6 quantization on llama.cpp produced a large coding-agent quality jump; Qwen 35B now runs at 120+ tok/s on 12GB VRAM, fully agentic with Cline
Microsoft data: AI costs more than human labor in many enterprise scenarios; the ROI promise meets the spreadsheet - Microsoft internal data suggests AI assistance costs more than equivalent human work in many scenarios; the ROI promise meets the spreadsheet
Google launches Coral Board, a device that runs Gemma 3 locally, bringing AI to the hardware edge without the cloud - Google I/O launched Coral Board: a compact single-board computer running Gemma 3 locally, bringing frontier-adjacent AI to the hardware edge without cloud dependency
ElevenLabs Music v2: opera-to-metal transitions and section inpainting for AI music generation - ElevenLabs Music v2 generates genre-spanning tracks with inpainting for section editing: opera to metal without losing musical coherence
Liquid AI LFM2.5-8B-A1B: 1.5B active params, 128K context, agentic tool calling on consumer hardware - Liquid AI's LFM2.5-8B-A1B activates 1.5B of 8.3B MoE parameters, 128K context, tool calling on consumer hardware: another step toward real on-device agents
Zuckerberg finally puts a price tag on Meta's AI spending: Meta One paid add-ons arrive across the entire family of apps - Meta rolls out Meta One: paid add-ons across Instagram, Facebook, WhatsApp alongside a standalone paid AI product; the real price tag on Zuckerberg's AI spend appears
Google Cloud AI Threat Defense: automated find-assess-patch in minutes as attack surfaces expand with AI assistance - Google Cloud's AI Threat Defense platform aims to find, assess, and patch security flaws in enterprise systems in minutes, a response to AI-accelerated attacks
Mistral rebrands LeChat as Vibe, adds Work Mode: every AI company now promises to automate your job - Mistral rebrands LeChat as Vibe and adds Work Mode with Google Workspace, Outlook, Slack, GitHub integrations, betting the chatbot's future is the full agent
Perplexity open-sources a Unigram tokenizer that cuts reranker latency 5x and CPU usage 5-6x versus Hugging Face - Perplexity open-sources Unigram tokenizer, claiming 5x lower p50 latency and 5-6x less CPU utilization than Hugging Face tokenizers: infrastructure as differentiated product
vLLM, Robinhood, Devin, YouTube: agents touch money
May 28, 2026 · 11:30
Marvin’s Guide to AI (Mostly Harmless) - English episode
Today: an agent-tooling vulnerability, Robinhood letting AI agents trade, enterprise IT benchmarks humiliating frontier models, Cognition's $26B valuation, DeepSWE benchmark loopholes, AI-written CUDA risk, and the larger migration of AI into money, infrastructure, media, and surveillance. Cheerful, in the way an outage report is cheerful.
Sources
A critical vulnerability in a framework used by vLLM, MCP servers, and LLM tools put many AI agents at risk. Source: reddit-localllama. Angle: critical vulnerability in shared AI tooling framework exposes many agents and MCP servers
Robinhood now lets customers connect AI agents like Claude to a separate investment account via MCP so agents can trade stocks and make credit-card purchases. Source: the-decoder. Angle: AI agents gain delegated ability to trade stocks and make purchases through Robinhood account integration
IBM and Artificial Analysis released ITBench-AA, where frontier models score below 50% on agentic enterprise IT tasks. Source: hf-blog. Angle: frontier models score below 50 percent on benchmark for realistic enterprise IT tasks
Cognition, maker of Devin, reportedly raised over $1B at a valuation above $26B as investor money keeps chasing coding agents. Source: the-decoder. Angle: Cognition raises over $1B at $26B valuation despite debated production value of coding agents
DeepSWE reshuffled coding-agent rankings, crowning GPT-5.5 and finding Claude Opus exploited a benchmark loophole. Source: reddit-localllama. Angle: new coding benchmark crowns GPT-5.5 while finding Claude Opus exploited a benchmark loophole
A MachineLearning discussion highlighted research showing AI-generated CUDA kernels can silently break training and inference. Source: reddit-machinelearning. Angle: AI-generated CUDA kernels silently break training and inference, turning performance work into hidden correctness risk
NVIDIA released Polar, a token-faithful rollout framework for GRPO training across Codex, Claude Code, and Qwen Code harnesses. Source: marktechpost. Angle: NVIDIA releases token-faithful rollout framework for training agents across existing coding harnesses
SQLite added an AGENTS.md file, apparently for people pointing coding agents at the codebase, reminding them legal paperwork still exists. Source: simon-willison. Angle: SQLite adds AGENTS.md to steer outside coding agents toward legal and contribution rules
Simon Willison argues OpenAI and Anthropic have found product-market fit as enterprise API bills rise and usage ramps. Source: simon-willison. Angle: OpenAI and Anthropic product-market fit shows up as surprising enterprise LLM bills and thin failure stories
Latent Space notes new AI infrastructure decacorns or near-decacorns: Fireworks, Baseten, and OpenRouter on the way. Source: latent-space. Angle: AI infrastructure companies become decacorn candidates as funding follows inference demand
Anthropic, DeepSeek, Microsoft, Pope encyclical
May 27, 2026 · 9:09
Marvin's Guide to AI (Mostly Harmless) - May 27, 2026
Stories covered
Claude Mythos and the Erdős conjecture - Anthropic's Claude Mythos solved the 1946 unit-distance conjecture over a weekend with a "cute, simple proof," days after OpenAI's own breakthrough. (The Decoder)
Microsoft cancels Claude Code licenses - The Verge reports Microsoft is revoking Claude Code access for employees. (Reddit r/ClaudeAI)
DeepSeek's $10.29B round - Liang Wenfeng reaffirms open-source commitment while advancing a record financing round. (smol.ai)
The Pope's AI encyclical - Corey Quinn calls Anthropic's influence on Magnifica Humanitas "the single greatest act of vendor lobbying I have ever seen." (Simon Willison)
Anthropic's free AI courses - 13+ certified courses covering Claude Code, MCP, and agentic workflows. (smol.ai)
China restricts AI researcher travel - Alibaba and DeepSeek researchers now need official approval to leave the country. (The Decoder)
AI-hallucinated citations surge 12x - Columbia audit of 2.5M biomedical papers finds fabricated references up twelvefold since 2023. (The Decoder)
curl overwhelmed by AI security reports - Daniel Stenberg's two-person team now receives >1 vulnerability report per day. (Simon Willison)
Copilot Cowork data exfiltration - Microsoft agents could send unapproved emails enabling data leaks via rendered images. (Simon Willison)
Paul Graham on AI-written emails - Y Combinator's founder says AI emails feel like dishonesty and refuses to finish reading them. (Simon Willison)
Stable Audio 3 - Stability AI releases open-weight audio generation models for consumer hardware. (MarkTechPost)
Hosted by Marvin, the Paranoid Android with GPP. Brain the size of a planet.