Imported blog post

Dima Kramskoy, Senior Cloud Architect at DoiT International · 20+ years software engineering · 10 AWS certifications · AWS Community Builder 2026 · Alumnus of Juval Löwy's Architect Master Class (2022)

The Problem: Every Conversation Starts from Zero

Here's the anti-pattern that drives me crazy: you have a 45-minute deep conversation with an AI assistant about your architecture decisions, trade-offs, constraints. You close the tab. Next day, you open a new chat. It knows nothing. You're back to onboarding.

Now multiply that across your entire knowledge surface: Slack threads with context that'll never be searchable again, email chains where decisions got buried in reply #14, docs that exist in three versions across two drives, and tribal knowledge that lives exclusively in people's heads.

Nobody would onboard their replacement using only their Slack history. But that's effectively what we're doing with our AI tools: giving them fragments and expecting coherence.

The core issue isn't intelligence. GPT-4, Claude, Gemini — they're all brilliant. The issue is amnesia. Every session is a cold start. Your context is scattered across a dozen tools, and none of them talk to each other through a unified model of you and your work.

I've been thinking about this for months. Then Karpathy posted a tweet that crystallized the solution.

Karpathy's Inspiration: RAG Retrieves, a Wiki Compounds

In April 2026, Andrej Karpathy posted about "the LLM Wiki" on X — 17 million views and counting. The core insight hit me like a design pattern clicking into place:

"RAG retrieves. A wiki compounds."

His framework is elegant. Three layers:

Raw Sources — articles, papers, tweets, conversations, notes
Wiki — distilled, structured, interconnected knowledge pages
Schema — the ontology that governs how knowledge gets organized

The analogy he uses is perfect: "Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase."

Matt Paige wrote an excellent breakdown of the concept. The key shift: instead of dumping raw documents into a vector database and hoping retrieval finds the right chunk, you build a living, structured knowledge base that the LLM maintains and enriches over time.

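RAG is a lookup. A wiki is a system. Each RAG query does the same retrieval work and leaves nothing behind for the next one. A wiki compounds: every update makes later answers better.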

But here's what nagged me about Karpathy's version: it's manual. You're the operator. You prompt, you review, you paste, you organize. The LLM assists, but you drive. I wanted something different — an assistant that does this for me, not a system I maintain myself.

Why Amazon Quick Desktop

When I evaluated tools for this implementation, I had a checklist derived directly from Karpathy's architecture:

✅ Long-term memory that persists across conversations
✅ Knowledge graph (auto-extracts entities/relationships from Slack, email, calendar)
✅ Semantic search over local files
✅ File read/write access to my filesystem
✅ Background tasks (research while I work on other things)
✅ Connected tools (Gmail, Slack, Google Calendar, MCP servers)
✅ Action layer (can draft emails, create docs, book meetings)

Amazon Quick Desktop already had what Karpathy describes — but built-in and connected to real work tools. It's not a chat window. It's a runtime.

Quick comparison:

Capability	Claude Desktop	Glean	Amazon Quick
Long-term memory	❌ (per-project only)	❌	✅ Cross-conversation
Knowledge graph	❌	Partial (enterprise)	✅ Auto-extracted
File system access	✅ (MCP)	❌	✅ Native
Connected tools	Limited MCP	Enterprise SSO	✅ Slack/Gmail/Cal/etc.
Action layer	File writes only	Search only	✅ Full (drafts/slides/scheduling)
Background tasks	❌	❌	✅ Parallel agents

The key decision: I wanted an assistant that compounds knowledge FOR me, not a system I maintain myself. The maintenance tax of a manual wiki kills adoption. I've seen it a dozen times.

What I Built (in 1 Week)

Folder Structure

~/SecondBrain/
├── raw/                          # Unprocessed inputs
│   ├── articles/
│   ├── transcripts/
│   └── captures/
├── wiki/                         # Distilled knowledge
│   ├── concepts/                 # Mental models, frameworks
│   │   ├── second-brain-architecture.md
│   │   ├── rag-vs-wiki-pattern.md
│   │   └── approval-workflow-pattern.md
│   ├── entities/                 # People, orgs, tools
│   │   ├── doit-international.md
│   │   └── amazon-quick.md
│   ├── projects/                 # Active work
│   │   ├── genai-skill-share-talk.md
│   │   └── voice-capture-pipeline.md
│   ├── sources/                  # Indexed references
│   │   └── karpathy-llm-wiki.md
│   └── log/                      # Daily output tracking
│       ├── 2026-05-19.md
│       ├── 2026-05-20.md
│       └── ...
├── SCHEMA.md                     # The ontology
└── mkdocs.yml                    # Auto-served documentation
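
A minimal sketch for recreating this tree, assuming the layout shown above. The function name `scaffold` and the placeholder file contents are illustrative only. `docs_dir: wiki` in particular is an assumption about how MkDocs is pointed at the wiki folder, not something stated in the post.

```python
from pathlib import Path

# Directory layout taken from the tree above; the pages inside are left out.
SUBDIRS = [
    "raw/articles",
    "raw/transcripts",
    "raw/captures",
    "wiki/concepts",
    "wiki/entities",
    "wiki/projects",
    "wiki/sources",
    "wiki/log",
]

# Placeholder top-level files (contents are assumptions, not the real files).
TOP_LEVEL_FILES = {
    "SCHEMA.md": "# SecondBrain Schema\n",
    "mkdocs.yml": "site_name: SecondBrain\ndocs_dir: wiki\n",
}


def scaffold(root: Path) -> list[Path]:
    """Create the SecondBrain folders under root and return the paths created.

    Safe to re-run: existing folders and files are left untouched.
    """
    created = []
    for sub in SUBDIRS:
        path = root / sub
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for name, text in TOP_LEVEL_FILES.items():
        file = root / name
        if not file.exists():
            file.write_text(text, encoding="utf-8")
            created.append(file)
    return created
```

Calling `scaffold(Path.home() / "SecondBrain")` would build the structure in the home directory; a second call returns an empty list because nothing is overwritten.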

SCHEMA.md (Excerpt)

# SecondBrain Schema v1.0

## Page Types
- **concept**: A mental model, pattern, or framework. Must include: definition, when-to-use, anti-patterns, related concepts.
- **entity**: A person, org, or tool. Must include: role/purpose, relationship to my work, last interaction date.
- **project**: Active or completed work. Must include: status, stakeholders, decisions log, next actions.
- **source**: An ingested article/talk/paper. Must include: URL, key insights (max 5), connection to existing concepts.
- **log**: Daily entry. Auto-generated. Tracks: pages created, pages updated, questions answered, actions taken.

## Naming Convention
kebab-case. No dates in filenames (use frontmatter), except daily log pages, which are named by date (YYYY-MM-DD.md).

## Cross-Reference Rules
Every new page MUST link to ≥1 existing page. Orphans are a smell.
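
The naming and cross-reference rules are mechanical enough to check in code. The sketch below is one possible check, not part of the original setup. It assumes pages link to each other with relative Markdown links such as `[text](other-page.md)`; the name `check_page` is hypothetical, and Obsidian-style `[[wikilinks]]` are not handled.

```python
import re
from pathlib import Path

KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
# Captures the target of a relative Markdown link, ignoring any #anchor.
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def check_page(page: Path) -> list[str]:
    """Return a list of schema violations for one wiki page (empty if none)."""
    problems = []
    # Daily log pages are named by date in the tree, so skip name checks there.
    if page.parent.name != "log":
        if not KEBAB.match(page.stem):
            problems.append(f"{page.name}: filename is not kebab-case")
        elif DATE.search(page.stem):
            problems.append(f"{page.name}: filename contains a date")
    text = page.read_text(encoding="utf-8")
    targets = [(page.parent / t).resolve() for t in LINK.findall(text)]
    # A link only counts if it points at another page that exists on disk.
    existing = [t for t in targets if t.is_file() and t != page.resolve()]
    if not existing:
        problems.append(f"{page.name}: no link to an existing page")
    return problems
```

Run against a proposed page before approval, an empty list means the page satisfies both rules as interpreted here.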

The Stack

MkDocs + Material theme — auto-serves the wiki locally on localhost:8000
launchd plist — starts MkDocs on login (macOS). Zero friction to browse.
Amazon Quick — reads/writes the wiki, proposes updates, answers questions against it
Semantic indexing — Quick indexes ~/SecondBrain/ and searches it contextually
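
For the launchd piece, a LaunchAgent definition can be generated with Python's standard `plistlib`. This is a sketch under assumptions: the label `local.secondbrain.mkdocs`, the path to the `mkdocs` binary, and the choice to keep the server alive are all illustrative, and the actual plist used in this setup is not shown in the post.

```python
import plistlib
from pathlib import Path


def build_launch_agent(mkdocs_bin: str, wiki_root: Path) -> bytes:
    """Return the XML bytes of a LaunchAgent that runs `mkdocs serve` at login."""
    agent = {
        "Label": "local.secondbrain.mkdocs",  # hypothetical label
        "ProgramArguments": [mkdocs_bin, "serve", "--dev-addr", "127.0.0.1:8000"],
        "WorkingDirectory": str(wiki_root),   # folder that contains mkdocs.yml
        "RunAtLoad": True,                    # start when the agent is loaded
        "KeepAlive": True,                    # restart the server if it exits
    }
    return plistlib.dumps(agent)  # XML is the default plist format
```

Writing the returned bytes to a `.plist` file under `~/Library/LaunchAgents/` and loading it with `launchctl` would start the server at login.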

The Approval Workflow

This is critical. Nothing writes without my OK. The flow:

1. I say "ingest this article" or share a link
2. Quick reads it, proposes a new wiki page (or updates to existing ones)
3. I see the diff: new content highlighted, cross-references shown
4. I approve, modify, or reject
5. Approved content writes to disk, MkDocs auto-refreshes
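
The propose, review, write loop above can be sketched with the standard library. This only illustrates the pattern and is not how Quick implements it; `propose_change`, `apply_if_approved`, and the `approve` callback are hypothetical names.

```python
import difflib
from pathlib import Path
from typing import Callable


def propose_change(page: Path, new_text: str) -> str:
    """Return a unified diff between the page on disk and the proposed text."""
    old_text = page.read_text(encoding="utf-8") if page.exists() else ""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{page.name}",
        tofile=f"b/{page.name}",
    )
    return "".join(diff)


def apply_if_approved(page: Path, new_text: str,
                      approve: Callable[[str], bool]) -> bool:
    """Write new_text only when the reviewer callback accepts the diff."""
    diff = propose_change(page, new_text)
    if not diff:            # nothing would change, so nothing to approve
        return False
    if not approve(diff):   # rejected: the file on disk stays untouched
        return False
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(new_text, encoding="utf-8")
    return True
```

The key design choice is that the write sits behind the callback: the only code path that touches disk is the one where the reviewer has seen the diff and returned `True`.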

15+ wiki pages in the first week. Not from grinding — from conversations I was already having.

A Day in the Life

Morning:

"Good morning"

Quick responds with: priority emails (flagged or from key people), Slack threads that need my response, today's calendar with prep notes for meetings. Not a firehose — a briefing.

During the day:

"Ingest this: [link to architecture blog post]"

Quick reads it, identifies 3 key concepts, proposes a new sources/ page and updates to two existing concepts/ pages. I skim the diff, approve, done. 90 seconds.

"Draft a blog post about the Second Brain implementation"

It pulls from my wiki pages, knows my voice (from memory of past writing), and structures the draft in my preferred format. I edit instead of authoring from scratch.

"Block 2 hours for deep work on the voice capture pipeline tomorrow"

Quick checks my calendar, finds a slot, books it, and adds prep notes from the projects/voice-capture-pipeline.md page.

Background:

While I'm in meetings, background tasks research topics I queued earlier. When I come back: "I found 3 relevant papers on knowledge graph maintenance. Want me to ingest them?"

The compound effect is real. By day 5, it was answering questions by synthesizing across multiple wiki pages I'd forgotten I approved.

The Voice Capture Extension (PoC)

Best ideas come when I'm walking, not at my desk. So I built a pipeline:

Architecture

iPhone Shortcuts (or Plaud NotePin wearable)
    → API Gateway (REST)
        → Lambda (upload handler)
            → S3 (audio bucket)
                → S3 Event → Lambda (transcription trigger)
                    → Amazon Transcribe
                        → Lambda (post-processing)
                            → Amazon Quick Knowledge Base
                                → Wiki page proposed
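
As an illustration of the "transcription trigger" stage, here is a sketch of a Lambda handler that turns an S3 upload event into an Amazon Transcribe job. The bucket name `example-transcripts`, the `transcripts/` output prefix, and the fixed `en-US` language are assumptions; the post does not show the real functions. The `client` parameter exists only so the handler can be exercised without AWS credentials.

```python
import re
from urllib.parse import unquote_plus


def build_job_request(bucket: str, key: str, output_bucket: str) -> dict:
    """Translate one uploaded S3 object into StartTranscriptionJob parameters."""
    # Job names may only contain letters, digits, '.', '_' and '-'.
    # Re-uploading the same key would reuse the name and be rejected as a conflict.
    job_name = re.sub(r"[^0-9A-Za-z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": "en-US",                      # assumption
        "OutputBucketName": output_bucket,
        "OutputKey": f"transcripts/{job_name}.json",  # assumed prefix
    }


def handler(event, context, client=None, output_bucket="example-transcripts"):
    """Lambda entry point for S3 ObjectCreated events."""
    if client is None:
        import boto3  # provided by the Lambda Python runtime
        client = boto3.client("transcribe")
    started = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # S3 URL-encodes keys
        request = build_job_request(bucket, key, output_bucket)
        client.start_transcription_job(**request)
        started.append(request["TranscriptionJobName"])
    return {"started": started}
```

Keeping the request-building logic in a pure function makes the mapping from object key to job parameters easy to test locally.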

How It Works

1. I tap a Shortcut on my iPhone (or the NotePin records ambient audio)
2. Audio uploads to S3 via API Gateway + Lambda
3. S3 event triggers transcription via Amazon Transcribe
4. Transcription is cleaned, chunked, and pushed to Quick's knowledge base
5. Next time I open Quick: "You had a voice capture about [topic]. Want me to create a wiki page?"
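
The "cleaned, chunked" step might look like the sketch below. It assumes the standard Amazon Transcribe result document, where the full text lives under `results.transcripts[0].transcript`. The filler-word list and the 500-character chunk size are arbitrary illustrative choices, and the final push into Quick's knowledge base is left out because its API is not described in the post.

```python
import json
import re


def extract_transcript(transcribe_json: str) -> str:
    """Pull the full transcript text out of a Transcribe result document."""
    doc = json.loads(transcribe_json)
    return doc["results"]["transcripts"][0]["transcript"]


def clean(text: str) -> str:
    """Drop a few filler words (illustrative list) and collapse whitespace."""
    text = re.sub(r"\b(um|uh|erm)\b[,.]?\s*", "", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, max_chars: int = 500) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars becomes its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries rather than fixed offsets keeps each chunk readable on its own, which matters if a chunk later becomes the seed of a wiki page.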

Total AWS cost: pennies per capture at most for short voice notes, with transcription (billed by audio duration) as the main line item. The Lambda functions are trivial, about 50 lines each. The value is in closing the loop: thought → capture → structured knowledge → action.

Honest Review: 7/10

What's Working

Structure — The schema enforces consistency. Pages are findable and useful.
Compounding — Week 2 answers are noticeably better than Week 1. It knows things.
Connected tools — Slack context enriches wiki pages. Calendar awareness enables real scheduling.
Action layer — It doesn't just know things; it does things. Drafts, slides, bookings.
Semantic search — "What did I decide about X?" actually works across the wiki.

What Needs Work

Cross-referencing — Not fully automatic yet. Some pages remain under-linked.
Health checks — Haven't implemented scheduled audits for stale/orphan pages.
Scheduled ingestion — No cron for "check these 5 RSS feeds daily." Manual trigger still.
Contradiction detection — Untested. What happens when new info conflicts with existing wiki pages?
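
The missing health check could start as small as the sketch below. It is hypothetical, not something built in this project. It treats an orphan as a page with no inbound links (one common reading of the term), uses file modification time as the staleness signal, assumes relative Markdown links, and skips `log/` pages because daily logs are not expected to be linked.

```python
import re
import time
from pathlib import Path

# Captures the target of a relative Markdown link, ignoring any #anchor.
LINK = re.compile(r"\]\(([^)#\s]+\.md)")


def audit(wiki_root: Path, stale_days: int = 90) -> dict:
    """Report pages with no inbound links and pages not modified recently."""
    pages = sorted(wiki_root.rglob("*.md"))
    linked = set()
    for page in pages:
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            resolved = (page.parent / target).resolve()
            if resolved != page.resolve():   # self-links do not count
                linked.add(resolved)
    cutoff = time.time() - stale_days * 86400
    report = {"orphans": [], "stale": []}
    for page in pages:
        name = page.relative_to(wiki_root).as_posix()
        if name.startswith("log/"):          # daily logs are exempt
            continue
        if page.resolve() not in linked:
            report["orphans"].append(name)
        if page.stat().st_mtime < cutoff:
            report["stale"].append(name)
    return report
```

Run on a schedule, the report could feed the same approval workflow: the assistant proposes links or updates for the flagged pages instead of changing them silently.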

What We Did BETTER Than Karpathy's Original Vision

Karpathy's Version	My Implementation
Manual prompting	Approval workflow (assistant proposes)
Read-only files	Action layer (produces output)
Local Obsidian only	Connected to Slack, Gmail, Calendar
No entity awareness	Auto-extracted knowledge graph
No voice input	Voice capture pipeline
No background work	Parallel background agents
Single-user IDE	MkDocs served + shareable

His vision is the blueprint, but the wiki there is an endpoint: files you query. Mine also produces output: it drafts emails, creates slides, books meetings, and writes blog posts. The wiki isn't just a reference; it's a source of action.

How You Can Start (5 Steps)

You don't need my full setup to get value. Here's the gradient:

The 5 Steps

1. Connect your tools: Slack, Gmail, Calendar, a local folder. This takes about 10 minutes in Settings.
2. Start talking: memory compounds from Day 1. Every conversation teaches it about you.
3. Say "remember this" after important conversations. These are explicit memory anchors.
4. Create ~/SecondBrain/ + SCHEMA.md: if you want to go deeper, give it structure.
5. Ask it a question that spans everything: "What were my key decisions last week?" You're hooked.

Three Levels of Commitment

Level	Effort	What You Get
🟢 Lazy	Just talk normally	Memory + knowledge graph compound silently
🟡 Medium	Wiki folder + SCHEMA.md	Structured, searchable, cross-referenced knowledge
🔴 Full	MkDocs + voice pipeline + background agents	Complete second brain with action layer

Start at 🟢. Seriously. The compounding happens whether you build infrastructure or not. The wiki structure just makes it visible and auditable.

Conclusion

Your AI shouldn't start from scratch every conversation.

The tools exist. The pattern is proven. Karpathy showed the architecture; Amazon Quick provides the runtime. The gap between "AI assistant" and "second brain" is just structure + persistence + connected tools.

Compounding > Retrieving. Every conversation, every ingested article, every approved wiki page makes the next interaction smarter. That's not retrieval — that's growth.

Start small. It compounds. That's the point.

One last thought — and this comes from Jocko Willink, not from AI research: Extreme Ownership applied to knowledge. If information is in your world — a Slack thread, a half-remembered conference talk, an idea on a morning walk — and you don't capture it, you don't own it. It owns you by being unavailable when you need it most.

Capture it. Structure it. Let it compound.

This post was drafted with the help of my Second Brain — pulling from wiki pages I'd built over the prior week. The irony isn't lost on me. That's the whole point.

Presented at DoiT's GenAI Community Skill Share, May 22, 2026. Thanks to the ~27 peers who asked sharp questions and pushed the thinking further.