ai: wire arcadia-search backend (search_kb + read_chunk + reindex button)

Adds the agent-facing surface for the new Tantivy lexical search service
(arcadia-search). It sits alongside the existing search_docs tool
(browser-side MiniSearch); the agent picks between the two based on
their tool descriptions.

- admin-tools.ts: new search_kb(query, corpus, limit?, tags?) and
  read_chunk(chunk_id, corpus) tools. KB_BASE_URL honors the
  window.__ARCADIA_SEARCH_URL runtime override and the
  VITE_ARCADIA_SEARCH_URL build env, and defaults to localhost:7800.
  The token is resolved per call from the arcadia_access_token key in
  sessionStorage (matching lib-arcadia-client's storage convention),
  with a "dev" fallback for unauthenticated local development.
- assistant.tsx: system-prompt section telling the agent when to pick
  search_docs (browser, bundled) vs search_kb (server, dynamic +
  expandable via read_chunk).
- ai.tsx: reindexKB() helper + "reindex kb (docs)" button on the empty
  state, next to the existing block-preview button. Toasts on
  start/success/failure. Wired with data-action="kb-reindex-docs" so
  the agent can also trigger via the command bus.
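
The URL and token lookup described above can be sketched as two pure functions. This is a minimal illustration, not the code in admin-tools.ts: the names `KbConfigSources`, `resolveKbBaseUrl`, and `resolveKbToken` are hypothetical, the precedence (runtime override, then build env, then default) is an assumption read from the word "override", and the default value is the one that appears in the ai.tsx hunk.

```typescript
// Hypothetical sketch: the real KB_BASE_URL logic lives in admin-tools.ts and may differ.
// Sources are passed in explicitly so the lookup order can be exercised without a browser.
interface KbConfigSources {
  runtimeOverride?: string // e.g. window.__ARCADIA_SEARCH_URL
  buildEnv?: string // e.g. import.meta.env.VITE_ARCADIA_SEARCH_URL
  storedToken?: string | null // e.g. sessionStorage.getItem("arcadia_access_token")
}

// Default taken from the ai.tsx hunk of this commit.
const DEFAULT_KB_URL = "http://127.0.0.1:7800"

// Assumed precedence: runtime override, then build-time env, then the default.
function resolveKbBaseUrl(src: KbConfigSources): string {
  return src.runtimeOverride || src.buildEnv || DEFAULT_KB_URL
}

// Falls back to "dev" when no token is stored (unauthenticated development).
function resolveKbToken(src: KbConfigSources): string {
  return src.storedToken || "dev"
}
```

Resolving both values on every call, rather than once at module load, would let a token stored after login take effect without a page reload.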

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jules
2026-05-03 21:41:13 +10:00
parent 49a9b019fc
commit f5189305c7
3 changed files with 190 additions and 1 deletions

@@ -111,6 +111,48 @@ function ToolResultBlock({ name, result }: { name: string; result: unknown }) {
return <div className="px-1">{rich}</div>
}
// Trigger a server-side rebuild of an arcadia-search corpus. Uses the same
// runtime URL override and token lookup as the search_kb tool (see
// admin-tools.ts), but does not consult the VITE_ARCADIA_SEARCH_URL build env.
// Reports start, success, and failure through the existing toast provider.
async function reindexKB(
  corpus: string,
  toast: ReturnType<typeof useToast>,
): Promise<void> {
  const baseUrl =
    (typeof window !== "undefined" &&
      (window as unknown as { __ARCADIA_SEARCH_URL?: string }).__ARCADIA_SEARCH_URL) ||
    "http://127.0.0.1:7800"
  const token =
    (typeof window !== "undefined" &&
      window.sessionStorage.getItem("arcadia_access_token")) ||
    "dev"
  const url = `${baseUrl}/index/${encodeURIComponent(corpus)}/build`
  toast.show?.({
    title: "Reindexing…",
    description: `Rebuilding corpus '${corpus}'.`,
  })
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: { Authorization: `Bearer ${token}` },
    })
    if (!res.ok) {
      throw new Error(`HTTP ${res.status}: ${await res.text()}`)
    }
    const out = (await res.json()) as { chunk_count: number; built_at: string }
    toast.show?.({
      title: "Reindex complete",
      description: `${out.chunk_count} chunks indexed for '${corpus}'.`,
    })
  } catch (err) {
    toast.show?.({
      title: "Reindex failed",
      description: err instanceof Error ? err.message : String(err),
      tone: "error",
    })
  }
}
// Synthetic assistant message that exercises every typed rich-output block.
// Wired to the "preview rich-output blocks" button in the empty state — used
// to eyeball renderer + theme without driving a live model. Safe to delete
@@ -1179,7 +1221,7 @@ function ChatSurface({
Issue an instruction. Read tools run automatically. Writes pause for
confirmation. Tab&nbsp; for command palette.
</p>
-<div className="console-empty-line pointer-events-auto">
+<div className="console-empty-line pointer-events-auto flex flex-wrap gap-2">
<button
type="button"
onClick={() =>
@@ -1191,6 +1233,14 @@ function ChatSurface({
>
preview rich-output blocks
</button>
<button
type="button"
onClick={() => void reindexKB("docs", toast)}
className="console-mono inline-flex items-center gap-1.5 rounded-md border border-[var(--console-rule-soft)] bg-transparent px-2.5 py-1 text-[10.5px] uppercase tracking-[0.18em] text-[var(--console-muted)] transition-colors hover:border-[var(--console-amber)] hover:text-[var(--console-amber)]"
data-action="kb-reindex-docs"
>
reindex kb (docs)
</button>
</div>
</div>
</div>
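
One way to make the reindex call easier to test is to separate request construction from the network call and inject the fetch implementation. The sketch below is not part of this commit: `buildReindexRequest`, `requestReindex`, `FetchLike`, and `ResponseLike` are hypothetical names, and the trailing-slash handling is an added assumption. Only the `/index/{corpus}/build` path, the Bearer header, and the `chunk_count` and `built_at` response fields come from the diff.

```typescript
// Hypothetical names throughout; only the endpoint path, the Bearer header, and the
// { chunk_count, built_at } response fields are taken from the diff.
interface ReindexRequest {
  url: string
  init: { method: "POST"; headers: Record<string, string> }
}

// Minimal subset of the Fetch API Response that the helper relies on.
interface ResponseLike {
  ok: boolean
  status: number
  text(): Promise<string>
  json(): Promise<unknown>
}

type FetchLike = (url: string, init: ReindexRequest["init"]) => Promise<ResponseLike>

// Build the POST request for a corpus rebuild without touching the network.
function buildReindexRequest(baseUrl: string, corpus: string, token: string): ReindexRequest {
  // Assumption: tolerate trailing slashes on the base URL (reindexKB does not normalize them).
  const root = baseUrl.replace(/\/+$/, "")
  return {
    url: `${root}/index/${encodeURIComponent(corpus)}/build`,
    init: { method: "POST", headers: { Authorization: `Bearer ${token}` } },
  }
}

// Send the request through an injected fetch and return the reported chunk count.
async function requestReindex(
  fetchImpl: FetchLike,
  baseUrl: string,
  corpus: string,
  token: string,
): Promise<number> {
  const { url, init } = buildReindexRequest(baseUrl, corpus, token)
  const res = await fetchImpl(url, init)
  if (!res.ok) {
    throw new Error(`HTTP ${res.status}: ${await res.text()}`)
  }
  const out = (await res.json()) as { chunk_count: number; built_at: string }
  return out.chunk_count
}
```

In the browser the global `fetch` should satisfy `FetchLike`, so a caller could pass it directly and keep the toast handling in the component.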

@@ -113,6 +113,7 @@ function buildAdminPreface(activeAgent: Agent | undefined, uiControl: boolean):
const ctx = formatAdminContextForPrompt()
const parts = [
"You are the operator's assistant inside Arcadia Admin. Be precise and direct. You have native function tools attached to this conversation — call them whenever the user asks about live platform state (counts, statuses, listings, lookups). Never invent tenant slugs, user counts, or statuses; if you need data, call a tool.",
"Two retrieval surfaces exist for documentation/knowledge: `search_docs` (browser-side, BM25 over the bundled arcadia docs — fast, always available, small corpus) and `search_kb` (server-side, BM25 over arcadia-search — same docs as `corpus=docs` for parity, plus larger and additional corpora as the operator adds them). For questions about the bundled arcadia docs either is fine; prefer `search_kb` when you want richer hits or when the user is asking about content that wouldn't be in the bundled docs (uploaded files, tenant-specific knowledge). When `search_kb` returns a chunk_id you want to expand, call `read_chunk(chunk_id, corpus)`.",
RICH_OUTPUT_PREFACE,
ARCADIA_KNOWLEDGE,
persona,