RAG over private wiki with citations

RAG · Hard · 20 min · claude-sonnet-4-5

Notion → pgvector → Claude with strict citation schema.

01 · Trigger · 89ms
cron(0 7 * * *) fired · every day · 07:00

02 · Agent · 1269ms
claude-sonnet-4-5 · in 1169 tok · out 588 tok

03 · Tools · 299ms
notion-mcp/postgres + pgvector:invoke → 200 OK · 229ms

04 · Verify · 79ms
schema check · json-schema draft-2020 passed

05 · Output · 169ms
12 chunks embedded · pgvector upsert

06 · Notify · 49ms
audit log written · runbook link attached

SUCCESS

Latest output · what your users see

RAG index · kb-v3

chunks

faithfulness

0.97

recall

0.87

Q: How do I rotate a secret without restarting the loop?
A: Call `locker rotate <name>` — running loops pick up the new value on their next tick via the signed refresh channel.

The problem

Off-the-shelf wiki search returns whole pages. Claude without retrieval hallucinates policies that don't exist.

The outcome

Ask any question in Claude Desktop. Get an answer where every sentence is followed by `[chunk:42]` linking back to the source paragraph in Notion.

Ingredients & skills

Secrets

ANTHROPIC_API_KEY
NOTION_TOKEN
DATABASE_URL
OPENAI_API_KEY

Providers

Anthropic
Notion
Postgres + pgvector
OpenAI embeddings

MCP servers

notion-mcp
postgres-mcp

#rag #mcp #pgvector #notion

How it works

Sync a Notion workspace into pgvector, expose `retrieve_chunks` as an MCP tool, and force Claude to cite every claim with a chunk id.

Step 1

1 — Schema

One table. Cosine distance index. 1536 dims for `text-embedding-3-small`.

migrations/001_chunks.sql

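-- Enable pgvector.
create extension if not exists vector;
create table chunks (
  id bigserial primary key,          -- cited to the model as [chunk:<id>]
  notion_page text not null,         -- id of the Notion page the chunk came from
  body text not null,                -- chunk text
  embedding vector(1536) not null    -- text-embedding-3-small output
);
-- Approximate nearest-neighbour index on cosine distance.
-- ivfflat derives its lists from existing rows, so rebuild the index after the first sync for better recall.
create index on chunks using ivfflat (embedding vector_cosine_ops) with (lists = 100);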
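
For intuition about what a cosine distance index orders by, here is a minimal sketch of the metric in TypeScript. The name `cosineDistance` is illustrative and not part of pgvector, which computes the distance inside Postgres; this only shows the arithmetic.

```typescript
// Hypothetical helper, for illustration only.
// Cosine distance = 1 - cosine similarity:
// 0 means same direction, 1 means orthogonal, 2 means opposite direction.
// A zero-length vector yields NaN because the denominator is 0.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Smaller values mean closer matches, which is why retrieval sorts ascending by distance and keeps the first `k` rows.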

Step 2

2 — Sync + embed

Chunk on heading breaks (~600 tokens). Embed in batches of 100.

typescript

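for (const page of await listNotionPages()) {
  const chunks = splitByHeading(page.body, 600); // ~600-token chunks, split on heading breaks
  for (let i = 0; i < chunks.length; i += 100) { // embed in batches of 100
    const batch = chunks.slice(i, i + 100);
    const embeddings = await openai.embeddings.create({ model: "text-embedding-3-small", input: batch });
    // pgvector parses the '[0.1,0.2,...]' text form, so serialize each embedding before binding it.
    const vectors = embeddings.data.map((d) => JSON.stringify(d.embedding));
    await db.query("insert into chunks (notion_page, body, embedding) select $1::text, unnest($2::text[]), unnest($3::vector[])", [page.id, batch, vectors]);
  }
}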
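
The sync loop calls `splitByHeading` without defining it. One possible shape is sketched below, under two loud assumptions: the page body is Markdown-style text with `#` headings, and a token is approximated as about 4 characters (a rough heuristic, not a real tokenizer). The actual helper behind this loop may differ.

```typescript
// Hypothetical sketch of splitByHeading. Assumes "#"-style headings and
// approximates tokens as ~4 characters each (heuristic, not a tokenizer).
function splitByHeading(body: string, maxTokens: number): string[] {
  const maxChars = maxTokens * 4;
  // Split just before each heading line so a heading stays with its section.
  const sections = body.split(/\n(?=#{1,6}\s)/);
  const chunks: string[] = [];
  for (const section of sections) {
    let current = "";
    // Sections over the budget are split further on blank lines.
    // A single paragraph longer than the budget is kept whole.
    for (const para of section.split(/\n{2,}/)) {
      if (current && current.length + para.length + 2 > maxChars) {
        chunks.push(current.trim());
        current = "";
      }
      current += (current ? "\n\n" : "") + para;
    }
    if (current.trim()) chunks.push(current.trim());
  }
  return chunks;
}
```

Keeping the heading inside each chunk gives the embedding some context about where the paragraph sits in the page.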

Step 3

3 — Expose retrieve_chunks via MCP

Claude Desktop calls this tool whenever it needs grounding.

mcp-server.ts

server.tool("retrieve_chunks", { query: z.string(), k: z.number().int().positive().default(6) }, async ({ query, k }) => {
  const [e] = (await openai.embeddings.create({ model: "text-embedding-3-small", input: [query] })).data;
  // <=> is pgvector's cosine distance operator; the embedding is bound in pgvector's '[...]' text form.
  const { rows } = await db.query("select id, body from chunks order by embedding <=> $1::vector limit $2", [JSON.stringify(e.embedding), k]);
  // Prefix each chunk with its id so the model can cite it as [chunk:N].
  return { content: [{ type: "text", text: rows.map((r) => `[chunk:${r.id}] ${r.body}`).join("\n\n") }] };
});

Step 4

4 — System prompt for citations

Hard rule: every sentence ends with a `[chunk:N]` tag or you didn't say it.

text

You answer ONLY from the chunks retrieved.
Every sentence MUST end with [chunk:N] where N is the source chunk id.
If no chunk supports a claim, say "I don't know."
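
A prompt alone cannot guarantee the rule holds, so a post-check on the answer can help. The sketch below is an assumption, not the loop's actual Verify step: `findCitationProblems` is a hypothetical name, and the sentence detection is deliberately naive (abbreviations such as "e.g. " will trigger false positives).

```typescript
// Hypothetical post-check for the citation rule; not part of the loop itself.
// Flags text that is not followed by a [chunk:N] tag, and tags whose id
// was not among the chunks actually retrieved.
function findCitationProblems(answer: string, retrievedIds: Set<number>): string[] {
  const problems: string[] = [];
  // Drop punctuation or whitespace left over from the previous sentence.
  const clean = (s: string): string => s.replace(/^[.!?\s]+/, "").trim();
  const tag = /\[chunk:(\d+)\]/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = tag.exec(answer)) !== null) {
    const segment = clean(answer.slice(last, m.index));
    last = m.index + m[0].length;
    // More than one sentence before a single tag means an earlier one is uncited.
    if (/[.!?]\s+\S/.test(segment)) problems.push(`uncited sentence in: "${segment}"`);
    if (!retrievedIds.has(Number(m[1]))) problems.push(`unknown chunk id: ${m[1]}`);
  }
  // Text after the last tag is uncited, unless it is the allowed refusal.
  const tail = clean(answer.slice(last));
  if (tail && tail !== "I don't know.") problems.push(`uncited sentence in: "${tail}"`);
  return problems;
}
```

An empty result means every detected sentence carries a tag that points at a retrieved chunk; anything else can be logged or used to reject the answer.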

One-line deploy

The button above runs the same command with your saved config. This is the raw CLI form.

bash

locker deploy wiki-rag --bind notion-mcp --bind postgres-mcp
