
Andrej Karpathy’s latest concept ‘LLM Wiki’ and the future of enterprise knowledge

Executive summary

Andrej Karpathy’s public GitHub Gist, published on April 4, 2026, describes LLM Wiki not as a finished product, but as an “idea file” for agentic knowledge work: instead of re-retrieving raw fragments on every question, an LLM incrementally compiles curated source material into a persistent, interlinked Markdown wiki, guided by a schema file such as CLAUDE.md or AGENTS.md. In Karpathy’s pattern, the core stack is raw sources as immutable truth, a wiki as the maintained knowledge layer, and a schema as the operating contract for ingest, query, and maintenance. The practical promise is compounding knowledge: each ingest and each good answer can strengthen the corpus rather than vanish into chat history. (1)

The investor page from Mindware treats that pattern as a transition point rather than an endpoint. Its Self-Organizing Wiki vision keeps original enterprise documents as the source of truth, treats the LLM Wiki as an AI-generated structured knowledge layer, and adds ConceptMiner plus a ThinkNavi interface to discover concept clusters, bridge concepts, structural analogies, and knowledge gaps. Read conservatively, this is a roadmap and market thesis, not proof of a fully shipped enterprise platform. Analytically, the strongest conclusion is that LLM Wiki is best seen as a knowledge-compilation layer that complements retrieval, while Self-Organizing Wiki is an attempt to add structural discovery and associative exploration on top of that layer. (1)

What LLM Wiki is

Karpathy’s core claim is simple: most document-centric LLM workflows still behave like retrieval-augmented generation, where the model finds relevant chunks at query time and reconstructs the answer from scratch each time. LLM Wiki shifts the heavy knowledge work earlier. A new source is not just indexed; it is read, summarized, linked into existing entity and concept pages, reconciled against prior claims, and recorded in a persistent wiki that the human primarily reads while the model primarily maintains. Karpathy’s three-layer architecture is explicit: raw sources are immutable and remain the source of truth; the wiki is an LLM-generated Markdown layer composed of summaries, entity pages, concept pages, comparisons, overviews, and syntheses; the schema tells the agent how to structure the wiki and how to execute ingest, query, and maintenance workflows. (1)

That architecture also defines a division of labor. Karpathy’s formulation is that the human curates sources, explores, and asks the right questions, while the LLM performs the persistent bookkeeping: summarizing, cross-referencing, updating, and filing. He explicitly lists personal self-tracking, topic research, book companions, business/team knowledge, and due diligence as candidate use cases, and he frames Markdown plus local browsing tools as the working environment in which the wiki becomes a durable artifact rather than a transient chat output. (1)

How it differs from RAG

As official documentation describes it, conventional RAG processes a knowledge base into vector or other searchable retrieval structures and, when a user asks a question, retrieves relevant passages and supplies them to the model to ground the answer. LLM Wiki changes the timing of synthesis. The main distinction is therefore ingest-time compilation versus query-time assembly. In RAG, freshness is relatively strong because new documents can be retrieved as soon as they enter the index. In LLM Wiki, coherence is stronger because cross-document synthesis can already exist before the question is asked, but freshness depends on how quickly the wiki is re-ingested and maintained. (2)
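The timing difference can be made concrete with a toy sketch. Everything here is an illustrative stand-in: the word-overlap scorer stands in for embedding retrieval, and the "compiled page" stands in for LLM-written synthesis.

```python
import re

# Toy contrast between query-time assembly (RAG) and ingest-time
# compilation (LLM Wiki). Corpus, scorer, and page names are made up.

docs = {
    "memo-1": "Project Atlas launched in March.",
    "memo-2": "Atlas budget was doubled in April.",
}

def score(question, text):
    """Crude word-overlap score standing in for vector similarity."""
    tokens = lambda s: set(re.findall(r"\w+", s.lower()))
    return len(tokens(question) & tokens(text))

# RAG path: find evidence at query time, every time.
def rag_answer(question):
    best = max(docs, key=lambda d: score(question, docs[d]))
    return f"Grounded answer from [{best}]: {docs[best]}"

# LLM Wiki path: compile once at ingest; queries read the compiled page.
wiki = {}

def ingest(doc_id):
    page = wiki.setdefault("atlas", {"claims": [], "sources": []})
    page["claims"].append(docs[doc_id])   # real pattern: LLM summarizes
    page["sources"].append(doc_id)

def wiki_answer(question):
    page = wiki["atlas"]
    return f"From compiled page {page['sources']}: " + " ".join(page["claims"])

for doc_id in docs:
    ingest(doc_id)
```

The RAG path re-derives the answer from retrieved fragments each time; the wiki path already holds a cross-document synthesis before any question arrives, which is the coherence-versus-freshness trade described above.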

This timing difference clarifies the comparison with nearby systems. GraphRAG still centers on an indexing pipeline plus a query engine: it extracts entities, relationships, claims, community summaries, vectors, and then answers questions over completed indexes through local, global, DRIFT, or basic search. That is more structured than baseline RAG, but still fundamentally query-oriented. NotebookLM remains source-grounded at answer time: its help pages emphasize inline citations, source selection, and grounded answers based on uploaded sources or static source copies. Projects in ChatGPT are different again: they are long-running workspaces that keep chats, files, instructions, memory, and tools together, but the official description does not position them as autonomous wiki compilers. So LLM Wiki is not simply “better RAG”; it is a different design target: persistent knowledge artifacts first, retrieval second. The last clause is a synthesis from the cited materials and should be read as an interpretive comparison. (3)

Architecture and operations

Karpathy’s public spec names three primary operations: ingest, query, and lint. Ingest reads a new raw source, discusses key takeaways with the human if desired, writes a summary page, updates relevant entity and concept pages, updates index.md, and appends an entry to log.md; Karpathy notes that a single source can touch 10–15 pages. Query starts from the index, reads relevant pages, synthesizes an answer with citations, and can save that answer back into the wiki as a new page. Lint performs health checks for contradictions, stale claims, orphan pages, missing concept pages, missing cross-references, and data gaps worth filling with new searches. index.md is content-oriented navigation; log.md is chronological, append-only operational history. Karpathy also notes that this works “surprisingly well” at moderate scale and that the wiki itself can live as a Git-backed Markdown repository, which naturally gives version history and rollback. (1)
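Two of lint's health checks, orphan pages and dangling cross-references, are simple enough to sketch directly. This is an illustrative sketch, assuming the wiki is held as a mapping of page name to Markdown body and that cross-references use `[[wikilink]]` syntax; neither assumption is fixed by the public spec.

```python
import re

def lint(pages):
    """Minimal lint pass: report orphan pages and dangling links.

    pages: dict mapping page name -> Markdown body.
    """
    link_pat = re.compile(r"\[\[([^\]]+)\]\]")
    linked = set()
    dangling = []
    for name, body in pages.items():
        for target in link_pat.findall(body):
            linked.add(target)
            if target not in pages:
                # Cross-reference to a page that does not exist yet.
                dangling.append((name, target))
    # Orphans: pages no other page links to (index is exempt).
    orphans = [n for n in pages if n not in linked and n != "index"]
    return {"orphans": orphans, "dangling": dangling}

wiki = {
    "index": "Start here: [[atlas]] and [[budget-2026]].",
    "atlas": "Atlas overview, see [[budget-2026]] and [[missing-page]].",
    "budget-2026": "Budget details.",
    "stray-note": "Nothing links here.",
}
report = lint(wiki)
```

A real lint pass would go further (contradictions, stale claims, gaps worth new searches), but those require model judgment rather than structural checks like these.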

What Karpathy’s Gist does not specify is equally important. It does not define a canonical parser, normalization stack, chunking policy, citation schema, merge algorithm, or audit model. So when people now talk about ingest internals such as parsing, normalization, chunking, and citation capture, those are mostly implementation details added by follow-on tools rather than fixed parts of Karpathy’s public spec (inference). Community implementations illustrate the gap. The llm-wiki-compiler project adds multimodal ingest, chunked retrieval with reranking, paragraph- and claim-level provenance, typed page kinds, candidate review queues, contradiction metadata, linting, and rollback-oriented roadmap items. Another implementation adds log.md, structured JSON logs, and an append-only audit.db, with source hashes, cost records, job history, cache invalidation on file-hash change, and audit queries. A third adds approval bundles, contradiction detection, provenance-tracked graph edges, context packs, and hybrid search. A fourth makes the vault structure itself explicit with raw/, wiki/, outputs/, and SCHEMA.md. (5)

The agent workflow implied by the public pattern is concrete enough to describe at file level. A typical Obsidian + agent loop is: use Obsidian Web Clipper to capture a web page into Markdown; save article content and metadata into raw/; let the agent generate or revise wiki/sources/<source>.md, wiki/concepts/<concept>.md, and wiki/entities/<entity>.md; update wiki/index.md and wiki/log.md; then inspect the graph and backlinks in Obsidian, where internal links are navigable and can auto-update when files are renamed. Karpathy explicitly describes browsing the results in real time with Obsidian open beside the agent. Obsidian’s Web Clipper supports templates, variables, and Markdown extraction of page content, while Graph view visualizes node-link relationships inside the vault. (1)
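One ingest step in that file layout can be sketched directly. The directory names follow the raw/ and wiki/ pattern described above; the page text, link syntax, and log format are placeholders for what an LLM would actually write.

```python
import datetime
import tempfile
from pathlib import Path

def ingest_source(vault: Path, source_name: str, summary: str) -> None:
    """Write a summary page, update index.md, and append to log.md."""
    raw = vault / "raw" / f"{source_name}.md"
    page = vault / "wiki" / "sources" / f"{source_name}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    # 1. Summary page that links back to the immutable raw source.
    page.write_text(f"# {source_name}\n\n{summary}\n\nSource: {raw.name}\n")
    # 2. Navigation: append a link to index.md.
    with (vault / "wiki" / "index.md").open("a") as f:
        f.write(f"- [[sources/{source_name}]]\n")
    # 3. Append-only operational history in log.md.
    with (vault / "wiki" / "log.md").open("a") as f:
        f.write(f"{datetime.date.today().isoformat()} ingest {source_name}\n")

# Demonstration in a throwaway vault directory.
vault = Path(tempfile.mkdtemp())
ingest_source(vault, "meeting-2026-05-07", "Decisions from the meeting.")
```

Because everything is plain Markdown under one directory tree, the resulting vault is immediately browsable in Obsidian and versionable with Git, which is what makes the wiki a durable artifact rather than chat output.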

With Claude Code, the wiki pattern maps naturally onto CLAUDE.md. The docs say Claude Code reads and edits files, runs commands, uses persistent CLAUDE.md instructions and auto memory, and asks for permission before modifying files. A practical LLM Wiki arrangement is therefore to store page naming rules, provenance rules, and update procedures in CLAUDE.md, then have the agent edit the relevant Markdown files and optionally run Git commands or lint tools. With Codex, the equivalent control layer is AGENTS.md: Codex reads those files before work, layers guidance from global to repo-local scope, can inspect repositories, edit files, run commands, and exposes action transcripts for review and Git-based rollback. An example such as “ingest raw/meeting-2026-05-07.md, update the decision page, revise index.md, append log.md, and show the diff” is therefore highly plausible, but the exact file conventions are still schema-dependent rather than standardized (inference). (6)
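Since the schema file is itself just Markdown instructions, the operating contract can be stated directly in it. The fragment below is a hypothetical example of such a CLAUDE.md (or AGENTS.md), not an official convention; every rule in it is one possible choice.

```markdown
# CLAUDE.md — wiki operating contract (illustrative example)

## Layout
- raw/ is immutable; never edit or delete files there.
- wiki/sources/, wiki/concepts/, wiki/entities/ hold generated pages.

## Ingest procedure
1. Read the new file under raw/.
2. Write wiki/sources/<source>.md with a summary and a link to the raw file.
3. Update affected concept and entity pages; add cross-links.
4. Append one line to wiki/log.md; update wiki/index.md.
5. Show the diff and wait for approval before committing.

## Provenance rule
Every claim on a generated page must cite its raw source file.
```

Because both Claude Code and Codex read these files before acting, changing the contract changes agent behavior without touching any tooling, which is what makes the schema the natural place to encode governance rules.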

Why now and its limits

Why now

The concept is resonating now because agentic tools have become much better at cross-file reading, editing, command execution, and long-running project guidance. Karpathy’s own public follow-up says the pattern becomes especially interesting once the wiki is large enough—he gives an example of a research wiki with roughly 100 articles and 400,000 words. At the same time, open-source implementations and Hacker News threads appeared within days, which suggests the community sees the pattern as operationally buildable rather than merely rhetorical. (7)

Limits and enterprise challenges

The major criticism is not that the idea is worthless, but that it can be oversold. In the Gist comments, one detailed critique argues that once the wiki exceeds modest size, retrieval, ranking, indexing, reranking, chunking, and access control all come back; another warns that when the same process both reads and writes the knowledge base, “silent corruption” becomes a real risk. Other comments debate whether “wiki” is even the right term for a static or agent-maintained Markdown corpus, though defenders argue the more serious question is not naming but whether the system has citations, provenance, permissions, auditability, and editorial controls. These are community critiques, not Karpathy’s own claims, but they identify the load-bearing risks. (1)

For enterprise use, the mitigation pattern is fairly clear even if no single source yet defines a universal standard. A serious implementation should keep original sources immutable; attach paragraph- or claim-level provenance where possible; stage updates through approval queues rather than writing straight into authoritative pages; maintain append-only operation logs and source hashes; use Git or equivalent rollback; test update quality and contradiction handling; and separate access policy on raw sources from visibility of derived wiki pages. This mitigation bundle is best-practice synthesis rather than a single-source prescription (inference), but it follows directly from Karpathy’s raw/wiki/schema split, his use of log.md and Git, the community implementations’ review and audit features, and the investor-page emphasis on traceability, access rights, conflicting documents, and auditability. (1)
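The approval-queue mitigation in particular is easy to make concrete. This is a best-practice sketch under the assumptions above, not a prescribed design: agent-proposed edits land in a pending queue, and only explicit approval applies them to the authoritative page set.

```python
class ReviewedWiki:
    """Staged updates: proposals never touch authoritative pages directly."""

    def __init__(self):
        self.pages = {}    # approved, authoritative content
        self.pending = []  # agent proposals awaiting human review

    def propose(self, page, text, rationale):
        """Agent writes here; authoritative pages stay untouched."""
        self.pending.append(
            {"page": page, "text": text, "rationale": rationale}
        )

    def approve(self, i):
        """Human review applies one proposal to the authoritative set."""
        change = self.pending.pop(i)
        self.pages[change["page"]] = change["text"]
        return change

    def reject(self, i):
        """Discard a proposal without modifying any page."""
        return self.pending.pop(i)

w = ReviewedWiki()
w.propose("atlas", "Atlas budget doubled in April.", "new memo ingested")
```

Combined with the append-only log and Git history, this separation is what blocks the “silent corruption” failure mode: the same process can still read and propose, but it can no longer silently write.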

Self-Organizing Wiki as the next evolution

The investor page from Mindware explicitly frames a progression: RAG retrieves documents; LLM Wiki compiles knowledge into a persistent wiki; Self-Organizing Wiki organizes that wiki into conceptual maps and associative trails. Its layered model is explicit: original enterprise sources remain the source of truth, the LLM Wiki layer is AI-generated structured knowledge, the ConceptMiner layer is a self-organizing conceptual map, and the ThinkNavi interface provides exploration, dialogue, synthesis, and decision support. That is an important architectural move because it refuses to treat the AI-generated wiki as the final truth layer in enterprise settings. (2)

ConceptMiner is the most distinctive part of the proposal. According to the investor page, it takes chunks, wiki pages, and generated conceptual descriptions, embeds them, and organizes them with a GNG+MST-based conceptual structure network. The declared outputs are not just nearest-neighbor search results but concept clusters, semantic neighborhoods, bridge concepts, knowledge gaps, duplicated or fragmented concepts, structural changes over time, hidden relationships, and areas where new hypotheses may emerge. It also proposes multiple representational models—Trigger/Situation/Motive, Logical Structure, Implication/Lesson—so that traversal can jump from topical similarity to structural analogy across domains. The page calls the end state “Enterprise Associative Memory.” This is conceptually ambitious, but it remains a public roadmap/positioning document, not an audited technical benchmark. Notably, the same page treats source traceability and audit features as part of the 6–12 month roadmap, so those capabilities should not be assumed to be fully mature today. (2)
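No implementation details of ConceptMiner are public, but the MST half of the GNG+MST idea can be illustrated at toy scale: build a minimum spanning tree over concept embeddings, then read high-degree nodes as candidate bridge concepts. The concept names and vectors below are made up; a real system would use learned embeddings and Growing Neural Gas for the topology.

```python
import math

# Toy "bridge concept" detection: MST over cosine distances, then
# pick the highest-degree node. Purely illustrative data.
concepts = {
    "pricing":   [1.0, 0.0, 0.0],
    "churn":     [0.0, 1.0, 0.0],
    "support":   [0.0, 0.0, 1.0],
    "retention": [0.6, 0.6, 0.6],  # overlaps all three themes
}

def cosine_dist(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def mst_edges(vecs):
    """Prim's algorithm over the complete cosine-distance graph."""
    names = list(vecs)
    in_tree, edges = {names[0]}, []
    while len(in_tree) < len(names):
        _, u, v = min(
            (cosine_dist(vecs[u], vecs[v]), u, v)
            for u in in_tree for v in names if v not in in_tree
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges

edges = mst_edges(concepts)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
bridge = max(degree, key=degree.get)
```

Here "retention" ends up as the hub of the tree because it overlaps every theme, which is the structural signal a bridge-concept detector looks for; the declared ConceptMiner outputs (gaps, duplicates, structural change over time) would require considerably more machinery on top.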

Implications for enterprise knowledge management

The most useful enterprise reading is pragmatic. For personal work and slow-moving research, LLM Wiki is already compelling because it converts repeated synthesis into durable pages, reduces repeated rediscovery, and fits naturally with Markdown, links, local Git, and agentic editing. For team use, it becomes attractive when humans can review updates and when work benefits from persistent thematic pages rather than ephemeral chat results. For large enterprise deployments, however, the minimum bar rises sharply: provenance, access control, formal retention and deletion rules, snapshotting, rollback, approval queues, and clear separation between official knowledge and AI-derived interpretation become indispensable. That is exactly why the Mindware page treats the wiki as a middle layer rather than the final source of truth. The cleanest conceptual summary is Karpathy’s compilation layer first, then structural discovery above it, then governed enterprise use around both. (1)

Comparison table

The table below synthesizes Karpathy’s Gist, official documentation for RAG, GraphRAG, NotebookLM, projects in ChatGPT, and the Mindware investor page. The “best-fit scale” and parts of the governance row are analytical synthesis rather than direct one-line claims from any single source. (3)

| Dimension | RAG | LLM Wiki | Self-Organizing Wiki |
| --- | --- | --- | --- |
| Primary timing | Query-time retrieval | Ingest-time compilation, query-time reading of compiled pages | Ingest-time compilation plus post-compilation concept modeling |
| Main artifact | Retrieved passages and grounded answer | Persistent Markdown wiki, index, log, derivative answer pages | LLM Wiki plus concept maps, associative trails, exploration interface |
| Source of truth | External knowledge base / original documents | Original sources remain authoritative; wiki is a maintained derivative layer | Original enterprise sources explicitly remain the source of truth |
| Traceability | Often strong at passage level if citations are exposed | Varies by implementation; strongest when claim/range provenance is captured | Intended to preserve the distinction between sources and AI-generated interpretation |
| Freshness | Usually strong if the index is updated | Depends on re-ingest cadence and maintenance discipline | Depends on source refresh plus concept-layer refresh |
| Scalability | Retrieval infrastructure is mature | Needs search, ranking, and governance once the corpus grows | Adds another abstraction layer, so governance complexity rises further |
| Governance profile | Familiar for enterprise search and grounded QA | Requires added review, audit, rollback, and permission controls | Explicitly designed around layered truth vs. interpretation, but the current public source is still roadmap-level |
| Best fit | FAQ, grounded QA, rapidly changing corpora | Personal knowledge, research, expert-team synthesis, code/document compilers | Enterprise KM, strategic research, meeting/customer intelligence, associative discovery |

References

Primary and official sources

  • [Karpathy’s GitHub Gist “LLM Wiki”](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f?utm_source=chatgpt.com)
  • [Karpathy’s X post “LLM Knowledge Bases”](https://x.com/karpathy/status/2039805659525644595?utm_source=chatgpt.com)
  • [Mindware Research Institute page for investors](https://www.mindware-jp.com/en/for-investors/)
  • [IBM’s RAG overview](https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-retrieval-augmented-generation)
  • [Google Cloud’s RAG overview](https://cloud.google.com/use-cases/retrieval-augmented-generation)
  • [GraphRAG official documentation](https://microsoft.github.io/graphrag/?utm_source=chatgpt.com)
  • [Microsoft Research’s GraphRAG blog](https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/?utm_source=chatgpt.com)
  • [NotebookLM official help “Use chat in NotebookLM”](https://support.google.com/notebooklm/answer/16179559?hl=en)
  • [ChatGPT Projects official help](https://help.openai.com/en/articles/10169521-projects-in-chatgpt)
  • [Obsidian official help “Import Markdown files”](https://obsidian.md/help/import/markdown)
  • [Obsidian official help “Graph view”](https://obsidian.md/help/plugins/graph?utm_source=chatgpt.com)
  • [Obsidian official help “Web Clipper”](https://obsidian.md/help/web-clipper?utm_source=chatgpt.com)
  • [Claude Code official documentation](https://code.claude.com/docs/en/overview?utm_source=chatgpt.com)
  • [Codex official documentation](https://developers.openai.com/codex?utm_source=chatgpt.com)
  • [CacheZero repository](https://github.com/swarajbachu/cachezero?utm_source=chatgpt.com)
  • [twillm repository](https://github.com/Jermolene/twillm?utm_source=chatgpt.com)
  • [llmwiki repository](https://github.com/lucasastorian/llmwiki?utm_source=chatgpt.com)

Unresolved questions and assumptions

  • Direct source gap: Karpathy’s Gist defines the pattern, but it does not standardize parsing, normalization, chunking, source-range citation capture, or merge/conflict logic; those details in this report are therefore marked as inference when derived from community implementations rather than from the Gist itself. (1)
  • Direct source gap: No primary source in this review provides a controlled benchmark showing that LLM Wiki broadly outperforms RAG or GraphRAG across accuracy, cost, freshness, and governance. Any claim that LLM Wiki “replaces” RAG would therefore go beyond the evidence currently available. (1)
  • Roadmap caveat: The Mindware investor page is the primary public source for Self-Organizing Wiki, but it is clearly an investor/roadmap page. Statements about market direction and planned features can be cited directly; statements about shipped enterprise maturity should be treated cautiously. (2)
  • Terminology dispute: The debate over whether a Markdown-based, agent-maintained corpus should be called a “wiki” is real in public discussion, but it is not yet settled by any authoritative technical standard. That naming debate matters mostly for expectation-setting around collaboration, editorial control, and auditability. (1)

Summary

The deepest significance of LLM Wiki is not that it makes retrieval disappear. It is that it treats knowledge work as compilation and maintenance, not just search. Karpathy’s Gist gives a minimal but powerful public pattern for that shift. The investor page from Mindware pushes the idea in a more enterprise-oriented direction by separating original sources from AI interpretation and then adding conceptual structure discovery on top. If that roadmap holds, the next frontier in enterprise knowledge systems will not be “better answers” alone, but better knowledge layers: what is original, what is compiled, what is inferred, what is approved, and how all four stay traceable over time.

  • tada@aicritique.org

    He has watched the industrial boom from the early 1980s to the present day. In 1982 he planned high-tech seminars at the Japan Technology and Economy Centre as well as seminars and research projects at JMA Consulting; in 1986 he organized AI chip seminars on fuzzy inference and other topics, helping trigger the fuzzy boom. After a period of freelance writing on CG and multimedia, he founded the Mindware Research Institute, selling the Japanese version of Viscovery SOMine since 2000, and Hugin and XLSTAT since 2003, in Japan. He started the AI portal site www.aicritique.org in 2024, after losing the rights to XLSTAT in a hostile takeover in 2023.
