Knowledge as compiled code: building toward an LLM Wiki
The public conversation named an ideal in 2026. We had been building our own version for months. Here is what shipped, what we removed, and what is still hard.
Specify team · 12 min read
1. The idea
An LLM wiki is the bet that much of the bookkeeping of knowledge work can move earlier: when content arrives, not only when someone asks a question. Public write-ups in 2026 crystallized the phrase and the shape people want: less one-off chunk retrieval, more durable structure.
The tedious part of knowledge work is not reading or thinking — it is bookkeeping. LLMs are unusually good at bookkeeping when you give them the right interfaces.
Specify aligns with the pain (stateless assistants, siloed docs) even when the shipped mechanics are narrower than the full ideal.
2. Why naive RAG falls short
Naive RAG is often “embed chunks, top-k at query time.” It works for many questions, but it does not automatically create a team-scale asset: the corpus can grow while the experience still feels like starting from zero each session.
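To pin down the baseline, here is a minimal sketch of the "embed chunks, top-k at query time" loop. Toy two-dimensional vectors stand in for real embeddings, and `cosine` and `top_k` are names chosen for this sketch, not Specify's code:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunk_vecs, k=2):
    # chunk_vecs: {chunk_id: vector}. Return the k closest chunk ids.
    scored = sorted(chunk_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [chunk_id for chunk_id, _ in scored[:k]]
```

Every question re-runs this loop from scratch, which is exactly why a growing corpus does not by itself feel like accumulated context.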
We still use retrieval. RAG is not “wrong”; the product story is about where indexing happens and what the agent can lean on across sessions.
3. What “compiled at ingest time” means for us
For Specify, “compile” means the index is ready before the question: sources and workspace documents are chunked, embedded, and loaded into a knowledge base so retrieval has something stable to query. We do not claim an ontology merge pipeline in the product today.
```
onWorkspaceContentChanged(doc):
    chunks  = chunk(doc)
    vectors = embed(chunks)
    index.upsert(workspaceId, vectors)  # Bedrock KB / batch path
    # no product graph.merge in shipped paths
```

As sources accumulate, the workspace context RAG can see grows, without promising automatic typed cross-edges in the product.
4. The architecture we ended up with
```
  Sources            Index / Embed         Retrieve           Agent
┌──────────┐        ┌─────────────┐       ┌──────────┐      ┌─────────┐
│ GitHub   │ ingest │ Workspace   │ query │ Bedrock  │ tool │ Claude  │
│ Notion   │───────►│ docs/chunks │──────►│ KB RAG   │─────►│ + tabs  │
│ Web      │        │ (vectors)   │       │ chunks   │      │ context │
└──────────┘        └─────────────┘       └──────────┘      └─────────┘
```
Bedrock for chat and retrieval, Amplify Gen2, Lambdas for embedding and ingestion. No separate graph database feature path in the product — the hero graph mock is decorative, not live data.
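The query arrow above can be sketched against the public Bedrock Knowledge Bases `Retrieve` API. The request shape follows boto3's `bedrock-agent-runtime` client; the client is passed in so the function stays testable (in production it would be `boto3.client("bedrock-agent-runtime")`), and `kb_id` is a placeholder, not a real identifier:

```python
def retrieve_chunks(client, kb_id: str, question: str, k: int = 5):
    # Ask the knowledge base for the k chunks closest to the question.
    resp = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": k}
        },
    )
    # Each result carries the chunk text (plus score and source location,
    # which a citation UI would also want).
    return [r["content"]["text"] for r in resp["retrievalResults"]]
```

Because the index is built at ingest time, this call is all the agent does at question time.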
5. Three problems we have not solved
Citation vs synthesis
Chunks can be retrieved and the model can still over-generalize. We push citations, but the UX contract for “evidence only” answers is unfinished.
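One direction we are experimenting with, sketched here with hypothetical names: require every sentence of an answer to cite a retrieved chunk id, and flag the rest as synthesis rather than evidence. The `[S1]`-style markers and the naive sentence split are assumptions of this sketch, not a shipped contract:

```python
import re

CITE = re.compile(r"\[(S\d+)\]")

def unsupported_sentences(answer: str, valid_ids: set) -> list:
    # Split naively on sentence-ending punctuation; flag any sentence
    # that cites nothing, or cites an id not among the retrieved chunks.
    flagged = []
    for sent in re.split(r"(?<=[.!?])\s+", answer.strip()):
        ids = CITE.findall(sent)
        if not ids or any(i not in valid_ids for i in ids):
            flagged.append(sent)
    return flagged
```

A check like this can gate an "evidence only" mode, but it does not stop the model from over-generalizing inside a cited sentence, which is why the UX contract remains unfinished.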
Stale index
Docs change faster than embeddings. Re-indexing cadence and “fresh enough” semantics are still product decisions more than engineering trivia.
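A "fresh enough" check is easy to state; the hard part is choosing its parameters. A sketch with hypothetical names, where `grace` is the clock-skew allowance a product would have to pick:

```python
from datetime import timedelta

def stale_docs(modified_at: dict, embedded_at: dict,
               grace: timedelta = timedelta(minutes=5)) -> list:
    # modified_at: {doc_id: last source edit}; embedded_at: {doc_id: time
    # its embeddings were built}. A doc is stale if it changed after its
    # embeddings (beyond the grace window) or was never embedded at all.
    return sorted(
        d for d, m in modified_at.items()
        if d not in embedded_at or m > embedded_at[d] + grace
    )
```

The engineering is the easy half; deciding how stale is acceptable per source, and who pays for re-embedding, is the product decision.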
Team permission × retrieval scope
Workspace boundaries help; how fine-grained, row-level ACLs should intersect with RAG hits is still ambiguous for many teams.
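The simplest shape is a post-filter: after retrieval, intersect each hit's allowed groups with the caller's groups. A sketch with hypothetical names, which also shows the known cost, that post-filtering can return fewer than k hits:

```python
def filter_by_acl(hits, user_groups: set) -> list:
    # hits: [(chunk_id, score, allowed_groups)]. Keep only chunks at
    # least one of the caller's groups may read.
    return [(cid, score) for cid, score, allowed in hits
            if allowed & user_groups]
```

The alternative, filtering by metadata at query time so the index never returns forbidden chunks, keeps k meaningful but pushes ACL state into the index, which is exactly the staleness problem again.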
6. What surprised us
The third connected source is where teams stop re-explaining background — not the first. Compound context is felt operationally before it shows up in metrics dashboards.
7. Try it
If you want this shape in production, you can try Specify today.