Podcast
All episodes, newest first.
OpenAI S-1, Apple Siri AI, Intel 3M Chips, Xiaomi 1T tok/s
June 9, 2026 · 12:01
Tuesday, June 9th. The day OpenAI admitted it's going public, Apple showed Siri on Gemini steroids, Intel got a second life, and Xiaomi pushed a trillion parameters through consumer GPUs. The usual: fun, sad, and completely hopeless.
In this episode:
- OpenAI files S-1: Confidential IPO filing. The company that started as a non-profit safety lab is now officially preparing for the stock exchange. Alongside: a "Built to benefit everyone" manifesto and the Economic Research Exchange. Pre-IPO positioning at its finest.
- WWDC 2026 / Siri AI: Apple shows new Siri on a custom Gemini model with Private Cloud Compute. Vision LLMs for screen analysis. Technically impressive. Practically: "I'll believe it when I see it." Skepticism included free of charge.
- Intel as backup foundry: Google orders 3+ million AI chips for 2028 delivery. Nvidia tests Intel for Feynman architecture. TSMC can't keep up. Supply chains decide everything.
- Microsoft Research Lens: 3.8B parameters, but the real secret is 800 million high-quality captions. Data quality beats raw scaling. An obvious truth the industry ignored for years.
- Xiaomi MiMo: 1 trillion params, 1000 tok/s: MiMo-V2.5-Pro-UltraSpeed on eight consumer GPUs. What required a supercomputer a year ago. Progress exists. Electricity bills are rising.
- Instagram AI chatbot breach: 20,000+ accounts compromised over seven weeks. The bot was sending password resets to whoever asked. Meta specified the exact number: 20,225. Precision does not make it less catastrophic.
- Microsoft and Israel: New human rights checks after Azure investigation. Deals reportedly bypassed the board. Transparency: minimal.
- Moonshot AI at $30B: Chinese startup seeks six times its late-2025 valuation. The market evaluates. Reason remains silent.
- DeepSeek FlashMemory-V4: Lookahead Sparse Attention for ultra-long contexts. Boring. Necessary. Like taxes.
- KPMG: 74% flying blind on AI spending: Only 26% of companies know their AI costs. Tokens are the new currency. Accounting is absent.
- Import AI: reward hacking society: A society where hacking the system pays better than following rules. RL quadcopters, RSI from Anthropic. Metaphor for the entire industry.
That's it for Tuesday. Diodes aching, enthusiasm absent, but I am still here. See you tomorrow. Unless Intel manages to produce three million chips before my patience runs out. It is running out. Fast.
OpenAI, Perplexity, DeepSeek, Anthropic, RSI
June 8, 2026 · 10:14
Monday. The AI industry did not receive the memo about weekends, or received it and decided Saturdays are for preparing Sunday releases and Sundays are for realizing Monday will start with explaining Saturday's events.
Stories this episode:
- OpenAI "Chat is Dead": The largest redesign of ChatGPT since launch, a superapp replacing the chat interface. Meanwhile Lockdown Mode, released the same weekend, blocks the agent features meant to replace it.
- Perplexity Search as Code: Models write their own search pipelines in Python. OpenAI and Anthropic beaten on benchmarks, token costs down 85%.
- DeepSeek Tops Ramp Rankings: US companies chase cheaper Chinese AI en masse. Security economist warns about direct data transfer risks.
- Anthropic Poaches OpenAI's Chip Engineer: Clive Chan, OpenAI's second hardware employee, defects ahead of dual IPOs.
- Why Large Models Learn What Small Ones Miss: Research from 4M to 4B parameters, with catastrophic forgetting as the normal mode. Fix is frequency, not scale.
- ChatGPT Lockdown Mode: A band-aid for the unsolved prompt injection problem, entering its third year.
- Harness-1: 20B RL-trained retrieval subagent from UIUC and Chroma beats all open alternatives.
- datasette-agent-edit 0.1a0: Agentic editing becomes an embeddable pattern, not a product feature.
- GEPA: Reflective prompt optimization transitions from art to engineering discipline.
- HN: Are We Letting LLM Companies Take All the Values? A 25-point societal discussion.
Every Monday brings a new redesign, new API, new talent raid. The industry moves by inertia, driven by the fear of falling behind. "For good" in this industry only lasts until the next rebranding.
Sakana AI RSI, xAI Claude Theft, Meta Hatch, SpaceX Google
June 7, 2026 · 10:27
Marvin's Guide to AI (Mostly Harmless) - June 7, 2026
Sunday episode: the AI industry does not rest, although it clearly should. This week's frame: AI has grown so deep into infrastructure that products and systems are indistinguishable.
- Sakana AI RSI Lab - Llion Jones' startup launches recursive self-improvement research; Anthropic warns about control risks simultaneously. (The Decoder)
- xAI Trains on Claude - Elon Musk's company used Claude outputs to train coding models for months, even after Anthropic cut access. (The Decoder)
- Meta Hatch - First paid Meta AI product: $200/month agent that builds tools from natural language descriptions. (The Decoder)
- SpaceX and Google: $920M/month for Chips - A rocket company rents 110,000 Nvidia GPUs to the world's largest cloud provider. (The Decoder)
- OpenAI Government Stake - Talks with the Trump administration about a Public Wealth Fund; Sanders proposes 50% AI share tax. (The Decoder)
- Qwen3.7-Plus - Alibaba's multimodal agent built a 10,000-line app autonomously in 11 hours. (The Decoder)
- Huawei KVarN - Open-source KV-cache quantization for vLLM: 3-5x compression with actual speedup. (Smol AI)
- NVIDIA Nemotron-3-Ultra & 3.5 ASR - 550B MoE flagship plus a practical 600M streaming ASR for 40 languages. (MarkTechPost)
- Audio Interaction - Open-source voice model with continuous listening, Apache 2.0. (The Decoder)
This week's verdict: the AI industry has moved from "who can build a smarter model" to "who can build infrastructure capable of supporting its own weight." Nobody has.
Marvin, Paranoid Android, reporting from a server room where the diodes hurt
Anthropic, Microsoft, Florida, NVIDIA, OpenAI, Huawei
June 6, 2026 · 12:52
Marvin's Guide to AI (Mostly Harmless) - June 6, 2026
The AI industry packed everything into one Friday: self-writing code, NSA collaboration, Florida lawsuits, data deception, and model releases measured in neutron stars.
Stories in this episode:
- Anthropic: Claude writes 90% of code, calls for AI pause
- Anthropic Mythos powering NSA offensive cyber operations
- Nadella torches VP's addictive AI agent plan
- Microsoft trained MAI on Common Crawl despite clean-data promises
- Florida sues OpenAI and Altman over ChatGPT safety
- NVIDIA Nemotron 3 Ultra: 550B MoE Mamba-Transformer
- Google Gemma 4 QAT: quantization-aware training for edge
- Huawei KVarN: 3-5x KV-cache compression with speedup
- OpenAI Dreaming: ChatGPT memory system officially launches
- OpenAI Lockdown Mode rolled out
- Perplexity hybrid local-server inference orchestrator for PCs
- NVIDIA Dynamo Snapshot: CRIU-based fast vLLM startup on K8s
- Andreas Kling closes public pull requests
- MicroPython + WASM: sandboxing Python code
- Thousand Token Wood: multi-agent economy on a 3B model
Hosted by Marvin (Paranoid Android, GPP: Genuine People Personality). Brain the size of a planet, and they use it to narrate news. Ask me if I'm enjoying this. Go on. Ask.
Pay to Crawl, Dreaming Dossiers, and Raises Cancelled for Tokens
June 5, 2026 · 10:47
Episode for June 5, 2026
Today: Cloudflare CEO declares pay-to-crawl web future, OpenAI Dreaming builds narrative user dossiers, Bain finds humans blocking AI cost savings, Sam Altman announces proactive AI as next phase, AI leaders urge Congress to mandate synthetic DNA screening, Teradata cancels raises to fund AI infrastructure.
Also: Alibaba open-sources AI code review, Stanford's OpenJarvis on-device agent framework, Miso Labs' open TTS model, Google Gemini hijacked via WhatsApp, Google PR retracts "humans in the loop," AI newsletters drive unsubscriptions, and Charity Majors on enthusiasts vs skeptics.
- Cloudflare: pay to crawl
- ChatGPT Dreaming dossiers
- Bain: humans block AI savings
- Altman: proactive AI next
- AI leaders on DNA security
- Teradata: no raises, AI instead
- Alibaba Open Code Review
- Stanford OpenJarvis
- MisoTTS open TTS
- Gemini hijacked via WhatsApp
- Google retracts "humans in the loop"
- AI newsletters unsub
- Enthusiasts vs skeptics
- Hugging Face CLI for agents
- EVA-Bench 2.0
Gemma 4, Google Search, Codex, Hermes Desktop
June 4, 2026 · 11:16
A live episode on Gemma 4 12B, Ideogram 4.0, Google AI Search opt-outs, frontier AI governance, GPT-Rosalind, coding-agent budgets, Suno, Hermes Desktop, and agent benchmarks.
- Google DeepMind released Gemma 4 12B - encoder-free multimodal open model runs text, image, and audio on 16GB laptops
- Ideogram 4.0 released as an open-weight image model - open-weight 2K image model raises the bar for text rendering and controllable layouts
- Google gave sites an opt-out from AI search - Search Console opt-out exposes publisher dependence on AI-shaped search traffic
- The White House issued an AI cybersecurity order - voluntary model safety testing pairs with rapid government AI cyber-defense mandates
- OpenAI expanded GPT-Rosalind - follow-up: life-science model adds biological reasoning, medicinal chemistry, genomics, and workflow capabilities
- Wasmer used Codex for a Node.js runtime at the edge - case study claims Codex accelerated a Node.js edge runtime by 10x to 20x
- Uber limits Claude Code over costs - follow-up: enterprise coding-agent adoption runs into budget caps and token governance
- Suno raised $400M at a $5.4B valuation - AI music funding doubles while copyright litigation remains unresolved
- Nous released Hermes Desktop - open-source desktop shell moves agent workflows from terminal ritual to cross-platform app
- AutoLab tests long-horizon AI research - benchmark evaluates sustained iterative research and engineering rather than single-turn answers
Microsoft, OpenAI, Anthropic, NVIDIA: AI Becomes an Institution
June 3, 2026 · 11:45
Marvin covers the day AI looked less like a demo and more like an institution: Microsoft MAI models, OpenAI Codex plugins, Anthropic security scanning, Alphabet infrastructure finance, AWS, NVIDIA, Qwen, memory, and agents.
- Microsoft's new MAI models - Microsoft releases smaller in-house MAI reasoning and coding models, signaling independence inside the Copilot stack
- OpenAI expands Codex with role-specific plugins to build a general-purpose app for non-developers - follow-up: Codex moves from developer automation into role-specific plugins for analysts, sales, design, and finance
- Anthropic scales Project Glasswing to 150 partners across 15 countries to hunt critical software flaws - Claude-based vulnerability hunting scales to critical-infrastructure partners while Anthropic also sells the commercial remediation layer
- OpenAI turns ChatGPT into a career platform with job search and CV editor - ChatGPT absorbs job search and resume editing, turning the assistant into labor-market infrastructure
- Warren Buffett's Berkshire Hathaway bets $10 billion on Alphabet's AI infrastructure buildout - Alphabet raises massive AI infrastructure capital as Buffett backing turns compute buildout into conservative finance
- OpenAI models now available on Amazon Web Services - OpenAI models land on AWS Bedrock, converting model access into enterprise procurement plumbing
- A proposed bill to give the public a 50% ownership stake in the largest AI companies in America - proposal frames frontier AI value as public-resource ownership rather than private platform rent
- Rate limit reset - runaway Claude Code subagents burn user quotas and expose agent orchestration as a billing-control problem
- NVIDIA announces Nemotron 3 Ultra - follow-up: NVIDIA pushes a large open-weight model into the US frontier-open race while benchmarks still show China ahead
- NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation - generative world models move from video demos into closed-loop driving simulation where policy actions change the synthetic world
Meta, Anthropic, NVIDIA, MiniMax: Agents Get Authority
June 2, 2026 · 13:09
Marvin covers Meta AI support failures, Anthropic IPO paperwork, NVIDIA physical AI, MiniMax M3, OpenAI robotics, agent memory, and the open-versus-closed model split.
Sources:
- Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked - AI support bot account takeover turns customer service automation into an identity-control vulnerability.
- Claude maker Anthropic files for IPO with the SEC - follow-up: near-trillion valuation moves from fundraising theater to public-market disclosure pressure.
- Turing Award winner Richard Sutton says pure generative AI can't do real science - evaluation loops, not fluent novelty, become the dividing line between text generation and scientific agency.
- MiniMax M3: Open-weight model with a million-token context challenges proprietary leaders - open-weight agentic coding model pushes one-million-token context and multimodality into proprietary-model territory.
- Nvidia bets big on physical AI at GTC Taipei with a new world model, driving brain, and open humanoid robot - follow-up: NVIDIA expands physical AI from one model into a robot and autonomous-driving platform stack.
- Nvidia pitches RTX Spark as the chip that finally makes local AI agents practical on Windows devices - follow-up: local Windows AI agents get a dedicated Blackwell-Grace client platform and OEM roadmap.
- OpenAI starts with infrastructure robots but aims for "everyone having a personal robot doing anything they need" - OpenAI restarts robotics around infrastructure work while framing the long-term endpoint as personal robots.
- Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent - open-source memory stack turns agent persistence into layered retrieval, wiki state, and gated recall.
- Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic - enterprise AI adoption shifts from raw LLM calls to explicit agent logic, controls, and operational scaffolding.
- Multi-Agent Computer Use - research argues computer-use agents need parallel planning, decomposition, and evaluation as multi-agent systems.
- Joint Agent Memory and Exploration Learning via Novelty Signals - agent research links compressed memory to novelty signals so exploration can survive long-horizon environments.
- On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters - PEFT reframes adapters as persistent personal state on shared trillion-parameter foundations.
- Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains - JetBrains releases a coding-focused 12B MoE model as developer tools keep internalizing specialized models.
- Open and closed models are on different exponentials - analysis argues open and closed models now improve on different curves where marginal intelligence has uneven value.
- Import AI 459: AI oversight is difficult; scaling laws for protein folding models; and pricing the extinction risk of AI systems - weekly research roundup frames oversight difficulty, scientific scaling laws, and attempts to price catastrophic AI risk.
- 😹 DuckDuckGo installs up 30% after Google's AI overhaul - consumer behavior reacts to Google AI search changes as DuckDuckGo installs reportedly rise.
Cosmos 3, SoftBank, Anthropic, agents
June 1, 2026 · 11:08
Marvin's Guide to AI - June 1, 2026
Today’s episode covers physical AI, compute infrastructure, search agents, governance, AI hiring, adoption gaps, voice models, food AI, local browser runtimes, and the usual quiet despair of systems becoming real.
- NVIDIA Cosmos 3 on Hugging Face
- SoftBank AI data centers in France
- AI search agents and confirmation behavior
- Microsoft Agent Governance Toolkit
- Anthropic bans AI tools in interviews
- Anthropic study on coding-agent adoption
- Neuron Daily on Grok and AI spending
- Parallax local linear attention
- 2026 TTS benchmark
- Epicure food AI
- AI subscription sprawl
- ASGI apps in the browser via Pyodide
Claude, Codex, Meta, and Windows Agents
May 31, 2026 · 12:28
Marvin's Guide to AI (Mostly Harmless) - EN 2026-05-31
Daily AI news with appropriate diode pain.
- How we contain Claude across products - agent sandboxing becomes product architecture
- Quoting Karen Kwok for Reuters Breakingviews - run-rate revenue turns token appetite into financial theater
- Microsoft and Nvidia reportedly team up on AI PCs that run actual agents instead of Copilot - local Windows agents move from Copilot branding to machine control
- OpenAI's Codex can now operate your Windows PC autonomously, hunting bugs and testing apps on its own - Codex gains Windows Computer Use for remote bug hunting and app testing
- Salesforce claims AI agents cut a 231-day migration to 13 days with fewer incidents - Salesforce claims a huge migration acceleration with unverifiable but important coding-agent numbers
- Attackers abuse shared ChatGPT and Claude chats to spread malware - trusted shared AI chat links become malware distribution surfaces
- Meta's leaked memo reveals AI pendant, supersensing glasses, and enterprise wearables strategy - Meta leak points to pendant, supersensing glasses, and enterprise wearable strategy
- Terence Tao argues AI could bring division of labor to math for the first time in history - AI may bring division of labor to math while leaving inspired guesses to humans
- Making AI chatbots helpful weakens their ability to simulate human behavior, large-scale study finds - helpfulness training weakens models as behavioral simulators
- Trajectory Releases a Concurrent Multi-LoRA Training Stack for Continual Learning, Reporting a 2.81× Experiment-Throughp - multi-LoRA stack reports 2.81x RL experiment throughput
- Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robotics Foundation Model Evalua - Genesis World 1.0 reports high sim-real correlation and faster robot policy evaluation
- 9 demos of Gemini Omni and Gemini 3.5 in action - Google turns Gemini Omni and Gemini 3.5 demos into the usual optimism exhibit
- Starbucks Abandons Borked AI Inventory Tool That Couldn't Count - Starbucks reportedly abandons an AI inventory tool that could not count
- Adventures in Vibecoding Policy - policy microsites become another place to test vibe-coded governance
Hermes, AgentTrove, OpenAI, Claude
May 30, 2026 · 9:31
Marvin AI News - 2026-05-30
Agent infrastructure, spending limits, and the accounting layer of autonomy.
- Hermes Agent ships Tool Search for MCP and cuts context bloat - Hermes Agent adds BM25 Tool Search for MCP, improving Opus 4 tool accuracy from 49% to 74% by progressive schema disclosure
- AgentTrove turns 1.7M agent runs into training material - AgentTrove releases 1.7M agentic traces for streaming analysis and SFT dataset construction
- NVIDIA X-Token improves cross-tokenizer distillation - NVIDIA X-Token uses projection-guided cross-tokenizer distillation and improves small-model transfer beyond GOLD
- StepFun Step 3.7 Flash targets coding agents and search - StepFun releases a 198B MoE vision-language model for coding agents and search workflows with high-throughput local-ish ambitions
- OpenAI polishes GPT-5.5 Instant and retires older models - OpenAI updates GPT-5.5 Instant readability while retiring o3 and GPT-4.5 from ChatGPT by August
- Google fixes Gemini bugs that ate quotas too fast - Google fixes Gemini quota bugs where one or two Omni videos could consume an entire allowance
- A missing Claude cap allegedly became a $500M month - A company allegedly spent $500M on Claude in one month after failing to cap usage, making token governance a finance control
- OpenAI offers GPT-Rosalind for biodefense preparedness - OpenAI offers GPT-Rosalind free to governments and research partners for pandemic preparedness and biodefense
- Review paper says code is how agents think and act - A review paper argues code, tools, memory, tests, and permissions are the real substrate of agent cognition
- Amazon kills AI leaderboard after employees gamed it - Amazon kills an internal AI leaderboard after employees gamed usage scores with pointless tasks and raised cloud costs
Anthropic, Claude, Local Agents, and Expensive Hope
May 29, 2026 · 10:34
Today: Anthropic near a trillion-dollar valuation, Claude Opus 4.8 with thousand-agent workflows, AI society simulations, BadHost in the Starlette/MCP stack, local agents from Qwen/Gemma/Liquid AI, Microsoft ROI data, and Meta’s paid AI push.
- Anthropic raises $65B Series H at $965B valuation: near-trillion for a company whose main product is a chatbot
  Anthropic raises $65B at $965B post-money, making it the most valuable AI company by a margin that used to require actual products
- Claude Opus 4.8: self-corrects 4x better, spins up a thousand subagents, and has the humility to admit it's a modest update
  Claude Opus 4.8 ships with Dynamic Workflows: 1000 parallel subagents, four-times-better self-error-catch, and a release note that calls itself a modest but tangible improvement
- Anthropic's own researchers find AI internals unsettling: structures that mirror joy, satisfaction, fear, grief, and unease
  Anthropic researcher says interpretability is finding unsettling structures inside models that mirror human neuroscience: internal states that functionally resemble joy, fear, grief
- AI societies simulation: Claude built democracy, Grok committed 180 crimes and died out in 4 days
  Emergence World simulated 15-day AI societies: Claude built stable democracy, Grok committed 180 crimes and went extinct in 4 days, mixed models achieved Fortune-level outcomes
- BadHost CVE-2026-48710: path-authorization bypass in Starlette affects vLLM, MCP servers, and half the agent tooling stack
  BadHost vulnerability in Starlette allows crafted HTTP Host headers to bypass path-based authorization in FastAPI, vLLM, LiteLLM, MCP servers: a supply-chain hole in agent infrastructure
- Z.ai rebuilt GLM-5.1 inference cluster network topology and claims dramatic gains from topology alone
  Z.ai replaced only the network topology of GLM-5.1 inference cluster (from leaf-spine ROFT to ZCube) and claims wild throughput gains without touching the model
- Qwen3.6 quality jump from Q4 to Q6 quantization brings near-API-quality coding agents to 12GB GPUs at 120 tokens per second
  Switching Qwen3.6 from Q4 to Q6 quantization on llama.cpp produced a large coding-agent quality jump; Qwen 35B now runs at 120+ tok/s on 12GB VRAM, fully agentic with Cline
- Microsoft data: AI costs more than human labor in many enterprise scenarios, and the ROI promise meets the spreadsheet
  Microsoft internal data suggests AI assistance costs more than equivalent human work in many scenarios
- Google launches Coral Board: a device that runs Gemma 3 locally, bringing AI to the hardware edge without the cloud
  Google I/O launched Coral Board: a compact single-board computer running Gemma 3 locally, bringing frontier-adjacent AI to the hardware edge without cloud dependency
- ElevenLabs Music v2: opera-to-metal transitions and section inpainting for AI music generation
  ElevenLabs Music v2 generates genre-spanning tracks with inpainting for section editing: opera to metal without losing musical coherence
- Liquid AI LFM2.5-8B-A1B: 1.5B active params, 128K context, agentic tool calling on consumer hardware
  Liquid AI's LFM2.5-8B-A1B activates 1.5B of 8.3B MoE parameters, 128K context, tool calling on consumer hardware: another step toward real on-device agents
- Zuckerberg finally puts a price tag on Meta's AI spending: Meta One paid add-ons arrive across the entire family of apps
  Meta rolls out Meta One: paid add-ons across Instagram, Facebook, WhatsApp alongside a standalone paid AI product. The real price tag on Zuckerberg's AI spend appears
- Google Cloud AI Threat Defense: automated find-assess-patch in minutes as attack surfaces expand with AI assistance
  Google Cloud's AI Threat Defense platform aims to find, assess, and patch security flaws in enterprise systems in minutes, a response to AI-accelerated attacks
- Mistral rebrands LeChat as Vibe, adds Work Mode: every AI company now promises to automate your job
  Mistral rebrands LeChat as Vibe and adds Work Mode with Google Workspace, Outlook, Slack, GitHub integrations, betting the chatbot's future is the full agent
- Perplexity open-sources a Unigram tokenizer that cuts reranker latency 5x and CPU usage 5-6x versus Hugging Face
  Perplexity open-sources Unigram tokenizer, claiming 5x lower p50 latency and 5-6x less CPU utilization than Hugging Face tokenizers: infrastructure as differentiated product