diff --git a/docs/RAG.md b/docs/RAG.md new file mode 100644 index 0000000..7da9af9 --- /dev/null +++ b/docs/RAG.md @@ -0,0 +1,173 @@ +# Retrieval-Augmented Generation in arcadia-admin + +This app exposes **two** lexical RAG surfaces to the assistant. They +share a contract (`search` + `read`) but live at different layers and +serve different content. The agent picks between them based on tool +descriptions; the operator chooses which to deploy based on corpus +shape. + +--- + +## At a glance + +| | Browser RAG | Server RAG | +|---|---|---| +| Lib / service | `@crema/lexical-rag-ui` | `arcadia-search` (Rust) | +| Engine | MiniSearch (BM25, JS) | Tantivy (BM25, Rust) | +| Where it runs | In the user's browser | Sibling of arcadia-app | +| Index storage | Static JSON, fetched once | mmap'd disk, ~30–80MB resident | +| Practical corpus size | ~5–10MB / ~50–100k chunks | GB-scale, no hard cap | +| Update cadence | Static — rebuilt at app build time | Live — cron, webhook, or admin trigger | +| Auth | None (bundled with the app) | JWT (via arcadia's Guardian) | +| Tool the agent calls | `search_docs(query, limit)` | `search_kb(query, corpus, limit?, tags?)` + `read_chunk(chunk_id, corpus)` | +| Source content lives in | `arcadia-admin/public/docs-index.json` | `arcadia-search`'s data dir, ingested from disk or arcadia API | +| What's it best for | Static reference docs that ship with the app | Tenant-uploaded files, audit log search, anything that grows | + +--- + +## When the agent picks each + +The system prompt in `app/routes/assistant.tsx::buildAdminPreface` tells +the model: + +> Two retrieval surfaces exist for documentation/knowledge: +> `search_docs` (browser-side, BM25 over the bundled arcadia docs — +> fast, always available, small corpus) and `search_kb` (server-side, +> BM25 over arcadia-search — same docs as `corpus=docs` for parity, +> plus larger and additional corpora as the operator adds them). For +> questions about the bundled arcadia docs either is fine; prefer +> `search_kb` when you want richer hits or when the user is asking +> about content that wouldn't be in the bundled docs (uploaded files, +> tenant-specific knowledge). When `search_kb` returns a `chunk_id` +> you want to expand, call `read_chunk(chunk_id, corpus)`. + +In practice DeepSeek + V3 picks `search_kb` for anything that mentions +"the kb" or sounds dynamic, and `search_docs` for quick lookups against +the bundled docs. Neither pick is wrong for content that exists in +both. + +--- + +## Browser RAG (`@crema/lexical-rag-ui`) + +**What it is.** A small React + MiniSearch wrapper. The lib provides +`RAGProvider`, `useRAG`, and a headless `createRAGClient(indexUrl)`. +The index is a single JSON file built offline by +`scripts/build-docs-index.mjs` and shipped in the app's `public/`. + +**How it's wired here.** + +- Build script: `arcadia-admin/scripts/build-docs-index.mjs` reads + markdown from `../reference/arcadia-app/`, chunks at H1–H3, + produces `public/docs-index.json`. Runs on `npm run build:docs` + (and as the `prebuild` step before `npm run build`). +- Tool wrapper: `app/lib/admin-tools.ts` constructs a singleton + `createRAGClient("/docs-index.json")` and exposes it as the + `search_docs` tool. The tool returns hits with the legacy + `category` field collapsed back from `tags[0]` so the agent's + prior expectations stay stable. +- Storage: just the static JSON. No state, no auth, no indexer + process. + +**Limits.** Practical ceiling is ~5–10MB index. Past that, first-load +parse and browser memory get painful (200MB+ heap on a 50MB index; +mobile breaks). Updates require a build step + redeploy. + +**Why it exists.** Static reference content that ships with the app — +arcadia's own docs, in this case. Always available even if the search +service is down. Zero infrastructure. + +For the lib itself see `lib-lexical-rag-ui/README.md`. + +--- + +## Server RAG (`arcadia-search`) + +**What it is.** A standalone Rust HTTP service (Tantivy + axum). Single +static binary, ~30–80MB resident. Per-tenant per-corpus indexes on +disk. JWT auth, HMAC webhook intake, atomic rebuild swap, systemd +timer cron. + +**How it's wired here.** + +- Tools: `app/lib/admin-tools.ts` exposes `search_kb` and + `read_chunk`. The fetch URL is `KB_BASE_URL` (default + `http://127.0.0.1:7800`, override via `window.__ARCADIA_SEARCH_URL` + or `VITE_ARCADIA_SEARCH_URL`). The bearer token is the user's + arcadia JWT from `sessionStorage["arcadia_access_token"]`, with a + `"dev"` fallback when no login. +- Reindex button: `app/routes/ai.tsx::reindexKB` calls + `POST /index/:corpus/build` and toasts the result. Lives in the + AI page's empty state next to the block-preview button. +- System prompt: see the snippet in `assistant.tsx::buildAdminPreface` + above. + +**Storage.** `///current/` per index; +`previous-/` for the last few rebuilds (rollback). Sources can +be on-disk markdown, or pulled from arcadia's `/api/v1/digital_objects` +API (see `arcadia-search/MULTI_TENANT.md` and `ARCADIA_INTEGRATION.md`). + +**Update cadence.** Three triggers, layered so each compensates for +the others' failure modes: +- **Cron** (systemd timer, hourly default) — always-on safety net. +- **Admin button** — one-click rebuild from the AI page. +- **Webhook** — arcadia POSTs `/events/changed` on file create/delete; + search debounces (2-min default) and rebuilds. + +**Why it exists.** Anything that doesn't fit the browser ceiling: +tenant-uploaded files, audit-log-ish content, multi-tenant knowledge +bases, anything that grows over time. + +For the service see `arcadia-search/README.md`. +For multi-tenant config see `arcadia-search/MULTI_TENANT.md`. +For the upstream arcadia integration story (file content fetch, +text extraction, webhook signature, service tokens) see +`arcadia-search/ARCADIA_INTEGRATION.md`. + +--- + +## How they coexist + +The default deploy runs **both**: + +- `search_docs` indexes the same arcadia-app docs the parity corpus + on `arcadia-search` indexes. Same content, two engines. +- This is intentional — it means the assistant always has *something* + to search, even if `arcadia-search` is down or unreachable. The + failure mode is "no `search_kb`, but `search_docs` still works." +- It also gives a permanent A/B regression test: query both, compare + hits, catch relevance regressions in either engine. + +--- + +## Picking ONE for a new corpus + +Use this checklist when adding new content: + +| Question | Answer → use | +|---|---| +| Is the corpus < 5MB and basically static? | Browser | +| Does it need to update without a redeploy? | Server | +| Is it per-tenant content (uploaded files, tenant-specific KB)? | Server | +| Are you OK shipping it in the JS bundle? | Browser | +| Does it need agentic `read_chunk` follow-up? | Server (browser doesn't expose `read` over the tool surface) | +| Does it need to work offline / with no backend? | Browser | +| Is it growing > 10MB? | Server | + +Most "knowledge base" content lives in the server side. The browser +side is reserved for the always-bundled reference material that ships +with the app. + +--- + +## What lives where (cheat sheet) + +| Want to | Look at | +|---|---| +| Add a doc to the bundled browser RAG | `arcadia-admin/scripts/build-docs-index.mjs` (extend `SOURCES`) | +| Add a tool to the agent | `arcadia-admin/app/lib/admin-tools.ts` | +| Change the LLM's tool-picking guidance | `arcadia-admin/app/routes/assistant.tsx::buildAdminPreface` | +| Add a corpus to arcadia-search | `arcadia-search/deploy//.config.json` + new systemd timer | +| Add a tenant to arcadia-search | `arcadia-search/MULTI_TENANT.md` | +| Wire an arcadia file → search ingest | `arcadia-search/ARCADIA_INTEGRATION.md` (needs upstream changes first) | +| Reindex the server-side corpus right now | "reindex kb (docs)" button on `/ai` empty state, or `POST /index/docs/build` |