Every conversation with an AI assistant starts from scratch. You explain your project, your preferences, your file structure, your goals. Again. And again. Every token spent re-explaining context is a token unavailable for the actual task.
There's a second problem. Even when the AI knows who you are and what you're working on, your research and accumulated knowledge lives scattered across bookmarks, PDFs, Slack threads, and half-finished Google Docs. Every time you need the AI to reason about something you've already read, you're re-feeding material it should already know.
Obsidian solves both of these problems. It's a folder of plain text files that Claude Code can read and write directly. No API wrappers. No database. No vendor lock-in. Just markdown files with good structure.
This guide covers two complementary patterns for using Obsidian with Claude. The first is a persistent memory system that gives Claude continuity across sessions. The second is an LLM Wiki, a pattern published by Andrej Karpathy in April 2026, where Claude builds and maintains a research knowledge base from raw sources. Together, they turn Obsidian into an AI operating system.
TL;DR
- Memory system: Structure your vault in three layers (working, episodic, semantic). Use index files instead of content dumps. One practitioner reported cutting token usage in half.
- LLM Wiki (Karpathy pattern): Feed raw sources into a raw/ directory. Claude compiles them into interlinked wiki pages it owns and maintains. At ~100 articles and ~400K words, complex multi-hop reasoning works without vector DBs or RAG.
- Two systems, two purposes: Memory tells Claude who you are and what you're working on. The wiki gives Claude deep domain knowledge to reason from.
- Token efficiency: Index files, flat folder structures, on-demand loading, and strategic compaction keep costs manageable even with large vaults.
Why Obsidian
Obsidian stores everything as plain markdown files in a regular folder. No proprietary format. No cloud dependency. No API needed. Claude Code can already read and write to any directory you give it access to, which means an Obsidian vault is immediately available as both a memory system and a knowledge base.
Claude doesn't need a database. It needs well-organized text files with enough structure to find what's relevant without reading everything.
Compare the alternatives. Vector databases require embedding pipelines and retrieval tuning. Custom APIs add latency and maintenance. A folder of markdown files with good naming conventions and a one-page index just works. And as Karpathy pointed out: your data stays on your local machine. It's not locked in an AI provider's system. If you switch models next year, your vault goes with you.
Part One: Persistent Memory
The first pattern gives Claude continuity across sessions so you stop re-explaining yourself every time you start a new conversation.
The Three-Layer Memory Structure
Separate memory into three layers, each serving a different purpose:
Layer 1: Working Memory (What's Active Right Now). A single file. Call it _active-context.md or Mission Control.md. It answers one question: what is Claude working on right now, and what does it need to pick up where it left off? Keep it to five to ten lines of current priorities, open decisions, and blockers. This loads at the start of every session. If it's longer than a screen, it's costing tokens on context that doesn't change.
Layer 2: Episodic Memory (What Happened). Session logs and decision records. A running history of what was tried, what worked, and what didn't. These live in a sessions/ folder and Claude loads them only when it needs past context for a current decision. The key habit: at the end of each session, Claude writes a brief summary and updates the active context file. This is what makes the system learn across sessions instead of just storing information nobody reads.
Layer 3: Semantic Memory (What's True). Durable knowledge that doesn't change session to session. Preferences. Project architecture decisions. Style guidelines. Tool configurations. Once Claude knows your deployment process or your writing voice, it doesn't need to be told again. Tell it once, save it to semantic memory, and it persists across every future session.
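As an illustration of the working-memory layer, a file like _active-context.md might look something like this (the contents are invented for the example, not from the source):

```markdown
# Active Context
- Priority: finish draft of onboarding article (due Friday)
- Open decision: static site vs. CMS for the playbook
- Blocker: waiting on brand guidelines from design
- Next step: outline section 3, then request review
```

Anything that would survive more than a few sessions belongs in semantic memory instead, so this file stays short enough to load every time.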
The Index Pattern
This is the single biggest efficiency lever in the entire system.
The mistake most people make: giving Claude access to a vault and letting it read everything at startup. With 200+ notes, that burns hundreds of thousands of tokens before any real work begins.
The fix is an index file. Call it MEMORY.md. One page with one-line descriptions pointing to actual memory files:
- [User Role](user/role.md) — Senior L&D leader, SLC-based, job searching
- [Voice Rules](feedback/voice.md) — No em dashes, contractions by default
- [Deploy Process](tools/deploy.md) — FTP to Hostinger, IndexNow after
- [Current Sprint](project/april-sprint.md) — 4 articles, playbook launch

Claude reads the index first. The one-line descriptions provide enough context to decide which files are relevant to the current task. Full files load only when needed. One practitioner reported cutting token usage roughly in half with this approach.
Keep the index under 200 lines. One line per file.
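The index format above is simple enough to parse mechanically, which is useful if you want to audit or generate it. A minimal sketch (not from the source; it assumes the `- [Name](path) — description` line format shown above):

```python
import re

# Matches index lines like: - [Deploy Process](tools/deploy.md) — FTP to Hostinger
INDEX_LINE = re.compile(
    r"^- \[(?P<name>[^\]]+)\]\((?P<path>[^)]+)\)\s+—\s+(?P<desc>.+)$"
)

def parse_index(text: str) -> list[dict]:
    """Parse a MEMORY.md index into (name, path, desc) entries."""
    entries = []
    for line in text.splitlines():
        m = INDEX_LINE.match(line.strip())
        if m:
            entries.append(m.groupdict())
    return entries

index = """\
- [Voice Rules](feedback/voice.md) — No em dashes, contractions by default
- [Deploy Process](tools/deploy.md) — FTP to Hostinger, IndexNow after
"""
for entry in parse_index(index):
    print(entry["path"], "->", entry["desc"])
```

A script like this can also verify the 200-line budget or flag index entries whose target files no longer exist.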
Frontmatter That Helps Claude Decide
Every memory file should have YAML frontmatter:
```yaml
---
name: Deploy Process
description: FTP deployment to Hostinger with IndexNow ping
type: reference
---
```

The description field is what Claude uses to decide whether to load the full file. Think of it like an email subject line. "Project notes" is useless. "April 2026 content sprint: 4 articles, playbook launch" tells Claude exactly when this file matters.
Types that work well: user (who you are, preferences, expertise), feedback (corrections and confirmed approaches), project (ongoing work, goals, timelines), reference (pointers to external systems and resources).
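Reading just the frontmatter, rather than the whole file, is what keeps this cheap. A minimal sketch of that idea (not from the source; it handles only flat key: value frontmatter, so reach for a real YAML library for anything nested):

```python
def read_frontmatter(text: str) -> dict:
    """Extract simple key: value pairs from a YAML frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: stop before the note body
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

note = """---
name: Deploy Process
description: FTP deployment to Hostinger with IndexNow ping
type: reference
---
Body text here.
"""
print(read_frontmatter(note)["description"])
```

Scanning every file's frontmatter this way is still far cheaper than loading every file's body, which is the whole point of the description field.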
CLAUDE.md as the Boot Sequence
If you're using Claude Code, the CLAUDE.md file loads at the start of every session. Use it as the boot sequence, not the brain:
- CLAUDE.md = instructions (how to behave, what to read, what rules to follow)
- Obsidian vault = knowledge (what's true, what happened, what you're working on)
Point CLAUDE.md to the vault's index file. Tell Claude to read it at session start. The actual knowledge lives in the vault, not in CLAUDE.md. This keeps CLAUDE.md focused on behavior and prevents it from growing into a massive context dump that loads every session whether it's needed or not.
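A boot sequence along these lines might read as follows (the exact paths and wording are illustrative, not from the source):

```markdown
## Session start
1. Read memory/MEMORY.md (the index). Do not load other memory files yet.
2. Read memory/_active-context.md for current priorities.
3. Load a full memory file only when its index description matches the task.

## Session end
1. Update memory/_active-context.md.
2. Append a short session summary to memory/sessions/.
```

Note that every line here is an instruction about behavior; none of it is knowledge, which lives in the vault.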
Part Two: The LLM Wiki
This pattern comes from Andrej Karpathy, who described it in April 2026. The core idea: stop treating AI as a search tool you query from scratch every time. Instead, feed it raw research materials and have it incrementally compile a structured, interlinked wiki. The LLM owns and maintains the wiki. You curate sources, ask questions, and direct the synthesis.
Karpathy's observation: "A large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge." That shift matters. The same LLM that writes your code can also organize, synthesize, and maintain your research, and it does it without fatigue.
The Architecture
The system has three layers, different from the memory layers above:
Raw sources (raw/) — Articles, papers, reports, images, data files. The LLM reads these but never modifies them. They're the immutable source of truth. Use the Obsidian Web Clipper extension to convert web articles to markdown. For PDFs, images, and other formats, drop them directly into the folder.
The wiki (wiki/) — LLM-generated markdown files: summaries, concept pages, entity pages, and synthesis articles. This layer is fully owned by the AI and evolves continuously. A single new source might trigger updates across 10-15 existing wiki pages as cross-references, summaries, and concept entries get revised.
The schema — A configuration document (in CLAUDE.md or a dedicated wiki-rules.md) defining wiki structure, naming conventions, and page formats. This ensures consistency as the wiki grows.
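A schema document might look something like this (the specific conventions are an example, not from the source; adapt them to your topic):

```markdown
# Wiki rules
- Page types: source summary, concept, entity, synthesis.
- File names: kebab-case, e.g. example-concept.md.
- Every page starts with frontmatter: name, description, type, sources.
- Link related pages with [[wikilinks]]; update index.md on every change.
- Never modify anything under raw/.
```

The value of writing this down is consistency: without it, page formats drift as the wiki grows and cross-references become unreliable.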
The Four Operations
Ingest. Drop a new source into raw/. Tell Claude to process it. Claude reads the material, writes a summary page, updates the index, creates or updates concept pages, adds cross-references to related entries, and logs the activity. One article can generate updates across a dozen wiki pages.
Query. Once the wiki reaches meaningful size, Claude can answer complex, multi-hop questions by researching across wiki pages. Karpathy found that at ~100 articles and ~400K words, he expected to need vector databases and RAG pipelines. He didn't. The LLM's ability to read index files and follow links handled it.
Lint. Run periodic health checks where Claude scans the wiki for inconsistencies, stale claims, orphaned pages, missing cross-references, and gaps that suggest new research directions. This is the maintenance work that kills most personal knowledge systems. People lose energy for it. LLMs don't.
File answers back. This is what makes the system compound. When Claude produces a useful synthesis in response to a question, file that answer back into the wiki as a new page. Every query adds to the knowledge base. The wiki gets smarter the more you use it.
Supporting Files
Two files keep the wiki navigable at scale:
index.md — A catalog of all wiki pages with one-line summaries, organized by category. Updated on every ingest. This is how Claude (and you) find things without reading everything.
log.md — An append-only chronological record of ingests, queries, and lint passes. Parseable with standard tools (`grep '^## \[' log.md | tail -5`; the bracket is escaped because an unescaped `[` opens a character class in grep). Provides a timeline of how the wiki evolved and when specific knowledge was added.
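Entries in log.md might look like this (dates and file names are invented for the example), with one `## [date]` heading per event so the grep above can pull recent activity:

```markdown
## [2026-04-02] Ingest: example-report.pdf
Created summary page, updated 3 concept pages, added to index.

## [2026-04-05] Query: how do topics A and B relate?
Answer filed back as wiki/topic-a-vs-topic-b.md.

## [2026-04-09] Lint
Flagged 2 stale claims and 1 orphaned page.
```

Because the file is append-only, the timestamps double as provenance: you can see exactly which ingest introduced a claim.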
Why This Works
Karpathy referenced Vannevar Bush's Memex concept from 1945: a personal, curated knowledge store with associative trails between documents. The idea has been around for 80 years. The unsolved problem was always: who maintains the system? Who keeps cross-references current? Who flags contradictions? Who fills gaps?
LLMs solve that. The tedious part of knowledge management isn't reading or thinking. It's the maintenance. That's exactly the kind of work LLMs handle well, at any scale, without burning out.
How the Two Systems Work Together
Memory and the LLM Wiki serve different purposes but share the same vault:
- Memory tells Claude who you are, what you're working on, and how you like to work. It's continuity and preferences.
- The Wiki gives Claude deep domain knowledge to reason from. It's accumulated understanding of a topic.
The combined vault structure:
```
vault/
  memory/
    MEMORY.md            (index — loads at session start)
    _active-context.md   (working memory — 5-10 lines)
    sessions/            (episodic — on demand)
    user/                (semantic — on demand)
    feedback/            (semantic — on demand)
    project/             (semantic — on demand)
  research-topic-a/
    raw/                 (immutable sources)
    wiki/                (LLM-maintained pages)
    index.md             (wiki catalog)
    log.md               (activity log)
  research-topic-b/
    raw/
    wiki/
    index.md
    log.md
  CLAUDE.md              (boot sequence — behavior rules)
```

The memory system loads at session start (index only). Wiki directories load on demand when you're working on a topic that needs them. CLAUDE.md tells Claude where everything lives and how to interact with each layer.
Token Optimization
These strategies apply to both systems:
Index files, not content dumps. Both systems use the same pattern: a lean index that Claude reads first, full files only when relevant. This is the single biggest efficiency lever.
Flat folder structure. Every level of nesting adds path tokens to every file reference. Research/topic.md costs less than Projects/Active/Q1/Research/Sources/topic.md. Stay at two to three levels of depth.
Strategic compaction timing. Claude Code compresses context automatically as the window fills. Be intentional about it: compact at logical breakpoints (after planning, after debugging, before switching focus). Tokens spent on old context are unavailable for the current task.
Skip MCP when direct access works. Several Obsidian MCP servers exist (Nexus, obsidian-claude-code-mcp, MCP Tools for Obsidian). These are useful when Claude Desktop or the API needs vault access. But Claude Code already has filesystem access. Adding an MCP server for a directory it can already read just adds tool schema overhead. Each registered tool consumes tokens on its definition in every message.
Don't duplicate what's already derivable. Memory shouldn't store code patterns, architecture visible in the codebase, or git history. The wiki shouldn't duplicate raw sources verbatim. Each layer stores only what can't be found elsewhere.
Tools and Plugins
The ecosystem as of early 2026:
- Obsidian Web Clipper — converts web articles to markdown for the raw/ directory. Essential for the wiki pattern.
- Cortex (Obsidian plugin) — runs a Claude Code agent inside the vault's side panel. Read, write, and organize notes without leaving Obsidian.
- Nexus (MCP server, successor to Claudesidian) — full vault operations via MCP. Best for Claude Desktop, not Claude Code.
- MCP Tools for Obsidian (plugin by jacksteamdev) — semantic search and Templater integration via MCP.
- obsidian-mind (vault template) — pre-structured folders, CLAUDE.md conventions, and frontmatter patterns. Good reference for setup.
- claude-infinite-context (vault template) — global memory plus per-project isolation with minimal setup.
- Marp (markdown slides) — Karpathy uses this to render wiki content as presentations, viewable directly in Obsidian.
- Dataview (Obsidian plugin) — dynamic queries over frontmatter. Useful for building custom views as the wiki grows.
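As a sketch of what Dataview adds (the folder path and `type` value here are assumptions, not from the source), a note containing this query would render a live table of concept pages built from their frontmatter:

```dataview
TABLE description, type
FROM "research-topic-a/wiki"
WHERE type = "concept"
SORT file.name ASC
```

This is a read-only view over the same frontmatter fields the memory and wiki patterns already maintain, so it costs nothing extra to enable.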
Start with direct file access and the index pattern. Add tools when you hit a specific limitation, not before.
Anti-Patterns
Loading the full vault at session start. The single biggest token waste. Index first, load on demand.
Treating Obsidian as a database. It's a folder of text files. It works for retrieval and note-taking. It breaks with concurrent writes from parallel agents, complex queries, or transactional integrity.
Manually editing the wiki. This is a key principle from Karpathy's design: the wiki is the LLM's domain. Curate sources and ask questions. The moment you manually edit wiki pages, you create inconsistencies the LLM doesn't know about. If something needs correcting, tell the LLM to fix it.
Deep folder nesting. More path depth means more tokens per file reference. Two to three levels.
No compaction strategy. Context fills up, Claude loses earlier decisions, and you end up repeating yourself. One community report documented 80,000+ tokens wasted on a single repeated error because earlier context was pushed out of the window.
Parallel agents editing the same files. Without coordination, merge conflicts waste ~15,000 tokens per incident on recovery alone.
Skipping the log. Without log.md, you lose the ability to trace how the wiki evolved or when specific knowledge was added. It costs almost nothing to maintain and pays off when you need to verify a claim.
Getting Started
Pick the system that matches your immediate need, or set up both:
For Memory (Stop Re-Explaining Yourself)
- Create a memory/ folder in your vault. Add it to Claude Code's accessible directories.
- Create MEMORY.md as the index. Start with five to ten entries covering your most frequently re-explained context.
- Create _active-context.md with current priorities. Keep it under ten lines.
- Add a line to CLAUDE.md telling Claude to read the index and active context at session start.
- Build the habit: end every session with a handoff where Claude updates active context and writes a session summary.
For the LLM Wiki (Build a Research Knowledge Base)
- Create a topic directory with raw/ and wiki/ subdirectories.
- Drop five to ten source articles into raw/ (use Web Clipper for web content).
- Write a brief schema in CLAUDE.md or wiki-rules.md defining wiki conventions: page format, naming, frontmatter fields.
- Tell Claude to process the raw sources: create summaries, identify key concepts, build an index, and cross-reference pages.
- Start querying. File useful answers back into the wiki. Run periodic lint passes to catch inconsistencies.
Neither system requires plugins, MCP servers, or vector databases. Just markdown files with good structure.
The system compounds. Every session adds to the memory that makes the next session more efficient. Every source adds to the wiki that makes the next question easier to answer. That's the whole point.
—Eian
Sources & Further Reading
- Karpathy, A. (2026). LLM Wiki. GitHub Gist
- Karpathy, A. (2026). LLM knowledge bases. X/Twitter
- Chase AI. (2026). Claude Code + Obsidian: Persistent Memory That Works. chaseai.io
- MindStudio. (2026). Self-Evolving Claude Code Memory System with Obsidian Hooks. mindstudio.ai
- MindStudio. (2026). What Is Andrej Karpathy's LLM Wiki? mindstudio.ai
- obsidian-mind vault template. GitHub
- claude-infinite-context. GitHub
- Cortex plugin for Obsidian. Obsidian Forum
- MCP Tools for Obsidian. GitHub