assistant: teach the agent about Search admin

Bring the LLM agent's prompts and tools current with the new /search
section and arcadia-search admin sidecar:

- New tools in admin-tools.ts:
  - list_search_corpora: enumerate tenants + corpora with build status,
    so the agent can pick a real corpus instead of guessing.
  - rebuild_search_corpus(tenant, corpus): isWrite=true, surfaces a
    confirm card. Use after uploads or when results look stale.
- search_kb description updated: names docs / operator-tools / files
  explicitly, and points at list_search_corpora when unsure.
- ARCADIA_KNOWLEDGE: adds search-corpus terminology, /search route,
  and a one-liner pointer to the search tools (the two new ones
  plus search_kb / read_chunk).
- assistant.tsx UI_CONTROL_PREFACE: nav-search added, full Search
  page action catalog (search-refresh / -restart / -new-tenant /
  -new-corpus, corpora-search, per-row corpus-{t}-{c}-{rebuild,edit,
  delete,actions}, tenant-{id}-delete, dialog form fields). Recipe
  for the manual rebuild path, plus a note steering the agent to
  the rebuild_search_corpus tool by default.
- search.tsx publishes a "search" surface to admin-context with
  tenants + corpora summary, so the agent gets live state without
  needing a tool call when /search is mounted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jules
2026-05-04 19:17:12 +10:00
parent eb7bc62d14
commit d1469059d8
4 changed files with 105 additions and 4 deletions
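
The `list_search_corpora` behaviour described in the commit message (one corpora lookup per tenant, with a failing tenant degrading to an empty list rather than failing the whole call) can be sketched roughly as below. This is a minimal illustration, not the code in this commit: the `AdminClient` interface, `listAllCorpora`, and the in-memory `stubClient` are hypothetical names, and the response shapes are only inferred from the fields the tool reads.

```typescript
// Hypothetical shapes, inferred from the fields the tool reads in the diff.
// They are assumptions, not the real ~/lib/search-admin types.
interface CorpusInfo {
  corpus: string
  indexed: boolean
  num_docs: number
}

interface AdminClient {
  listTenants(): Promise<{ tenants: { id: string }[] }>
  listCorpora(tenant: string): Promise<{ corpora: CorpusInfo[] }>
}

interface TenantCorpora {
  tenant: string
  corpora: CorpusInfo[]
}

// Fan out one listCorpora call per tenant. Each call has its own try/catch,
// so a failing tenant yields an empty list instead of rejecting Promise.all.
async function listAllCorpora(client: AdminClient): Promise<TenantCorpora[]> {
  const { tenants } = await client.listTenants()
  return Promise.all(
    tenants.map(async (t) => {
      try {
        const { corpora } = await client.listCorpora(t.id)
        return { tenant: t.id, corpora }
      } catch {
        return { tenant: t.id, corpora: [] }
      }
    }),
  )
}

// In-memory stub standing in for the admin sidecar (illustrative data only).
const stubClient: AdminClient = {
  listTenants: async () => ({
    tenants: [{ id: "platform-admin" }, { id: "broken" }],
  }),
  listCorpora: async (tenant) => {
    if (tenant === "broken") throw new Error("simulated sidecar failure")
    return { corpora: [{ corpus: "docs", indexed: true, num_docs: 42 }] }
  },
}
```

One trade-off of this shape is that an empty `corpora` array is ambiguous: the caller cannot tell a tenant with no corpora from a tenant whose lookup failed. A per-tenant `error` field would be one way to separate the two cases if that distinction matters to the agent.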

@@ -30,6 +30,7 @@ import { listRoles } from "~/lib/arcadia/roles"
import { revokeUserApiKey } from "~/lib/arcadia/api-keys"
import { createRAGClient } from "@crema/lexical-rag-ui"
import { BLOCK_INDEX, getBlockSchema } from "~/lib/block-schemas"
import { searchAdmin, SearchAdminError } from "~/lib/search-admin"
// Lazy singleton — first tool call fetches /docs-index.json, subsequent
// calls reuse the parsed MiniSearch instance.
@@ -732,7 +733,7 @@ const TOOLS: ToolDef[] = [
{
name: "search_kb",
description:
"Lexical (BM25) search over the arcadia-search Tantivy backend. Use for the LARGER, server-hosted knowledge corpora — the same arcadia docs the browser RAG serves are indexed here as `corpus=docs` for parity, and additional corpora (uploaded files, runbooks, etc.) will land here as they're added. Returns chunks with snippets + chunk_ids that can be passed to `read_chunk` to expand. Prefer this over `search_docs` (browser) when you need richer hits or when the user is asking about content that wouldn't be in the bundled docs (e.g. uploaded files).",
"Lexical (BM25) search over the arcadia-search Tantivy backend. Returns chunks with snippets + chunk_ids that can be passed to `read_chunk` to expand. Prefer this over `search_docs` (browser) when you need richer hits or when the content wouldn't be in the bundled docs.\n\nKnown corpora on the platform-admin tenant:\n- `docs` — arcadia-app architecture/ops docs (same as the browser RAG, server-hosted for parity).\n- `operator-tools` — arcadia-search + arcadia-admin documentation (admin sidecar, deploy script, search admin UI, MULTI_TENANT, RAG, AI_FIRST, LIBS, LLM_PROXY_CONTRACT).\n- `files` — markdown/text files uploaded by tenant users via arcadia-app.\n\nIf you're not sure what's available, call `list_search_corpora` first. Operators can add new corpora via the `/search` route.",
parameters: {
type: "object",
properties: {
@@ -740,7 +741,7 @@ const TOOLS: ToolDef[] = [
corpus: {
type: "string",
description:
"Which indexed corpus to search. `docs` is the parity corpus (arcadia documentation). New corpora are added by the operator.",
"Which indexed corpus to search. See list_search_corpora for the live set; common values: `docs`, `operator-tools`, `files`.",
},
limit: {
type: "integer",
@@ -796,6 +797,84 @@ const TOOLS: ToolDef[] = [
return result
},
},
{
name: "list_search_corpora",
description:
"Enumerate the corpora currently configured on the arcadia-search admin sidecar. Returns each tenant's corpora with build status (indexed?, num_docs). Call this when you don't know what corpora exist before invoking `search_kb`, or when the user asks what knowledge is available. Requires the search admin token to be configured.",
parameters: {
type: "object",
properties: {},
additionalProperties: false,
},
isWrite: false,
run: async () => {
try {
const tenantsRes = await searchAdmin.listTenants()
const tenants = await Promise.all(
tenantsRes.tenants.map(async (t) => {
try {
const c = await searchAdmin.listCorpora(t.id)
return {
tenant: t.id,
corpora: c.corpora.map((cc) => ({
corpus: cc.corpus,
indexed: cc.indexed,
num_docs: cc.num_docs,
})),
}
} catch {
return { tenant: t.id, corpora: [] }
}
}),
)
return { tenants }
} catch (err) {
if (err instanceof SearchAdminError) {
return {
error: `search-admin ${err.status}: ${err.message}`,
hint: "VITE_ARCADIA_SEARCH_ADMIN_TOKEN may be unset, or the sidecar (default :7801) may be down.",
}
}
throw err
}
},
},
{
name: "rebuild_search_corpus",
description:
"Trigger a synchronous rebuild of one corpus on arcadia-search. Use when the operator says the index is stale, after they've uploaded new files, or when search_kb returned suspiciously few/old hits. Returns chunk_count and built_at on success. The operator confirms before the rebuild runs (rebuilds can take seconds to minutes depending on corpus size).",
parameters: {
type: "object",
properties: {
tenant: {
type: "string",
description: "Search tenant id (e.g. `platform-admin`). See list_search_corpora for available tenants.",
},
corpus: {
type: "string",
description: "Corpus name within that tenant (e.g. `docs`, `operator-tools`, `files`).",
},
},
required: ["tenant", "corpus"],
additionalProperties: false,
},
isWrite: true,
run: async (args) => {
const tenant = typeof args.tenant === "string" ? args.tenant.trim() : ""
const corpus = typeof args.corpus === "string" ? args.corpus.trim() : ""
if (!tenant || !corpus) {
throw new Error("rebuild_search_corpus requires { tenant, corpus }")
}
try {
return await searchAdmin.rebuild(tenant, corpus)
} catch (err) {
if (err instanceof SearchAdminError) {
return { error: `search-admin ${err.status}: ${err.message}` }
}
throw err
}
},
},
{
name: "get_block_schema",
description: `Fetch the full JSON schema + example for a rich-output block kind so you can emit it correctly in your reply. Call this the first time in a thread that you intend to render a particular kind. Available kinds: ${Object.entries(

@@ -16,6 +16,7 @@ Core entities and how they relate:
- **Audit log entry** — append-only record of who did what. \`actor_type\` is one of: \`user\`, \`platform_admin\`, \`api_key\`, \`system\`. Per-tenant and platform-wide entries coexist.
- **Feature flag** — boolean / variant gate. Platform-wide default + per-tenant override.
- **Storage / billing config / SSO IdP / inbound webhook / API quota / data retention policy / approval workflow / announcement** — per-tenant or platform-level configurations the operator can manage.
- **Search corpus** — a Tantivy index over a set of source documents, served by the arcadia-search service. Each corpus belongs to a search tenant (a separate id space from platform tenants — typically \`platform-admin\` for the operator's own knowledge). The operator manages corpora at \`/search\`: create/edit configuration JSON, rebuild on demand, restart the service. Built-ins on \`platform-admin\`: \`docs\` (arcadia architecture), \`operator-tools\` (arcadia-search + arcadia-admin docs), \`files\` (uploaded markdown/text files).
Tenant lifecycle (status field):
@@ -31,6 +32,7 @@ Things to keep in mind when assisting:
- The operator can impersonate tenant users for debugging (POST /api/v1/admin/impersonate/:user_id) — surface this when they ask "why can't user X log in".
- Quotas / rate cards / billing config errors usually surface as 402/403 from /api/v1 endpoints — diagnose by checking the tenant's billing-config and api-metering quotas.
- The reference Phoenix app lives at \`reference/arcadia-app/\` in the workspace; its OpenAPI spec is at /api/openapi (sync via \`node ../lib-arcadia-client/scripts/sync-spec.mjs\`).
- Search admin (arcadia-search) is a separate service. Manage tenants/corpora at \`/search\`. Use \`list_search_corpora\` if you don't know what's indexed; \`rebuild_search_corpus\` after uploads or when results look stale; \`search_kb\` / \`read_chunk\` to query.
When the user asks something that maps to a tool, call it. When they ask about a concept, explain it from this primer in plain language. Write tools (suspend_tenant, activate_tenant) prompt the operator with an inline confirm card before they actually run — you do not need to ask in prose first; just call the tool and the user will see the confirmation UI. If the user denies a write, do not retry it; ask what they'd like to do differently.
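
The confirm-card rule in the primer (write tools run only after the operator approves, and a denied write is not retried) can be sketched as a small dispatcher. This is an assumption-laden illustration, not the assistant's actual runtime: `dispatchTool`, the `Confirm` callback, and the `{ denied: true }` result shape are hypothetical, and only the `name`, `isWrite`, and `run` fields mirror what the diff shows on a tool definition.

```typescript
// Hypothetical tool shape: only name / isWrite / run mirror fields visible
// in the diff. Everything else here is illustrative.
type ToolArgs = Record<string, unknown>

interface SketchTool {
  name: string
  isWrite: boolean
  run: (args: ToolArgs) => Promise<unknown>
}

// Stand-in for the inline confirm card: resolves true when the operator
// approves and false when they deny.
type Confirm = (tool: SketchTool, args: ToolArgs) => Promise<boolean>

// Read tools run immediately. Write tools run only after confirmation; on a
// denial the tool is never invoked and a marker result is returned so the
// agent can ask what to do instead of retrying.
async function dispatchTool(
  tool: SketchTool,
  args: ToolArgs,
  confirm: Confirm,
): Promise<unknown> {
  if (tool.isWrite && !(await confirm(tool, args))) {
    return { denied: true, tool: tool.name }
  }
  return tool.run(args)
}
```

With this shape the model never needs to ask for permission in prose: it calls the tool, and the gate decides whether `run` executes.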