Wire AI assistant to arcadia: domain primer, tool calling, admin context

Make /ai and /assistant operate as the platform admin's assistant against arcadia-app's API:

- Add `arcadia-knowledge.ts`: domain primer (multi-tenant Phoenix backend, tenant lifecycle, platform_admins identity, etc.) baked into every system prompt.
- Add `admin-tools.ts`: curated tool registry exposing `list_tenants` and `get_tenant`, callable via OpenAI-native function calling. Tools hit arcadia through `useArcadiaClient()` and inherit the operator's JWT + tenant header. `runLLMToolCalls()` returns `tool` role messages ready to push back into history.
- Add `admin-context.ts`: runtime registry pages publish to so the assistant can answer factual questions about live UI state without scraping the DOM. Tenants page registers its summary on mount.
- Replace generic Vibespace personas (Atlas/Forge/Inkwell/Pilot/Cursor) with arcadia-flavoured ones: Operator, Auditor, Triage, Analyst, UI Operator. Auto-migrate stored agents from the legacy set.
- /assistant: build admin preface (role + primer + persona + ctx) and pass it as the `useChat` system at construction. Pass `tools` on every `send()`. Auto-loop reads `toolCalls` off the streaming assistant message and uses `continueChat()` to push tool results.
- /ai: same wiring (this is the canonical admin chat surface; the user prefers its look).
- MessageBody renders tool-result cards (role: "tool") and a "Called X" pill on assistant messages with toolCalls. Strips Qwen-style `<tool_call>` XML from prose when the tags were converted to structured calls.
- Extend ThreadMessage with the `tool` role + tool-call metadata so conversations round-trip through localStorage.
- Tenants page: row actions get `data-action="tenant-<slug>-{suspend,activate,deactivate}"` (via lib-table-ui's new dataAction prop); registers tenant summary into admin-context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 20:08:47 +10:00
parent e7cb8c942b
commit fe93f2766c
9 changed files with 577 additions and 82 deletions
--- a/app/lib/admin-context.ts
+++ b/app/lib/admin-context.ts
@@ -0,0 +1,76 @@
+// Shared state surface that any admin page can publish to so the assistant
+// can read live data without scraping the DOM.
+//
+// Pages call `useRegisterAdminContext("tenants", { tenants: [...] })` while
+// mounted; the assistant calls `getAdminContextSnapshot()` each turn to
+// inject a structured snapshot into the system prompt.
+
+import { useEffect } from "react"
+
+type Surface = Record<string, unknown>
+
+export type AdminContextSnapshot = {
+  route: string
+  surfaces: Record<string, Surface>
+}
+
+const surfaces = new Map<string, Surface>()
+
+export function publishAdminSurface(name: string, data: Surface): void {
+  surfaces.set(name, data)
+  if (typeof window !== "undefined") {
+    ;(window as unknown as { __adminContext?: unknown }).__adminContext = getAdminContextSnapshot()
+  }
+}
+
+export function clearAdminSurface(name: string): void {
+  surfaces.delete(name)
+  if (typeof window !== "undefined") {
+    ;(window as unknown as { __adminContext?: unknown }).__adminContext = getAdminContextSnapshot()
+  }
+}
+
+export function getAdminContextSnapshot(): AdminContextSnapshot {
+  const route = typeof window !== "undefined" ? window.location.pathname : ""
+  return {
+    route,
+    surfaces: Object.fromEntries(surfaces.entries()),
+  }
+}
+
+/**
+ * Render a snapshot as a markdown block for the LLM system prompt.
+ * Keeps it compact: route, then one section per surface with JSON.
+ */
+export function formatAdminContextForPrompt(snapshot = getAdminContextSnapshot()): string {
+  const sections: string[] = [`Admin context (read-only — for answering factual questions):`]
+  sections.push(`Route: ${snapshot.route || "?"}`)
+  const names = Object.keys(snapshot.surfaces)
+  if (names.length === 0) {
+    sections.push(`Surfaces: (none registered)`)
+  } else {
+    for (const name of names) {
+      const json = safeJson(snapshot.surfaces[name])
+      sections.push(`Surface "${name}":\n${json}`)
+    }
+  }
+  return sections.join("\n\n")
+}
+
+function safeJson(value: unknown): string {
+  try {
+    const text = JSON.stringify(value, null, 2)
+    if (text.length > 4000) return text.slice(0, 4000) + "\n…(truncated)"
+    return text
+  } catch {
+    return "(unserializable)"
+  }
+}
+
+/** Hook: publish a surface while the component is mounted. */
+export function useRegisterAdminContext(name: string, data: Surface): void {
+  useEffect(() => {
+    publishAdminSurface(name, data)
+    return () => clearAdminSurface(name)
+  }, [name, data])
+}
--- a/app/lib/admin-tools.ts
+++ b/app/lib/admin-tools.ts
@@ -0,0 +1,161 @@
+// Curated tool surface the assistant can call. Tools are exposed through
+// OpenAI-native function calling; we parse each call's JSON arguments, execute
+// via arcadia-client, and feed results back as `tool` role messages.
+//
+// Each tool is a named function with documented args. The LLM never sees
+// raw HTTP — only the menu below.
+
+import type { ArcadiaClient } from "@crema/arcadia-client"
+import type { Tool, ToolCall as LLMToolCall } from "@crema/llm-ui"
+
+import { getTenant, listTenants, type Tenant } from "~/lib/arcadia/tenants"
+
+export type ToolCall = {
+  name: string
+  args: Record<string, unknown>
+}
+
+export type ToolResult = {
+  name: string
+  args: Record<string, unknown>
+  ok: boolean
+  data?: unknown
+  error?: string
+}
+
+type ToolDef = {
+  name: string
+  description: string
+  parameters: Record<string, unknown> // JSON Schema for OpenAI tool calling
+  isWrite: boolean
+  run: (args: Record<string, unknown>, ctx: ToolCtx) => Promise<unknown>
+}
+
+type ToolCtx = { arcadia: ArcadiaClient }
+
+const TOOLS: ToolDef[] = [
+  {
+    name: "list_tenants",
+    description:
+      "List every tenant on this arcadia deployment. Returns id, slug, name, status, plan, inserted_at. Call this for any question about tenant counts, statuses, or which tenants exist.",
+    parameters: {
+      type: "object",
+      properties: {},
+      additionalProperties: false,
+    },
+    isWrite: false,
+    run: async (_args, { arcadia }) => {
+      const tenants = await listTenants(arcadia)
+      return tenants.map(summarize)
+    },
+  },
+  {
+    name: "get_tenant",
+    description:
+      "Fetch a single tenant by slug (preferred) or id. Returns the tenant summary or null if not found.",
+    parameters: {
+      type: "object",
+      properties: {
+        slug: { type: "string", description: "The tenant's slug (e.g. 'acme', 'platform-admin')." },
+        id: { type: "string", description: "The tenant's UUID. Use only when the slug is unknown." },
+      },
+      additionalProperties: false,
+    },
+    isWrite: false,
+    run: async (args, { arcadia }) => {
+      const slug = typeof args.slug === "string" ? args.slug : null
+      const id = typeof args.id === "string" ? args.id : null
+      if (!slug && !id) throw new Error("get_tenant requires { slug } or { id }")
+      if (id) {
+        try {
+          return summarize(await getTenant(arcadia, id))
+        } catch {
+          return null
+        }
+      }
+      const tenants = await listTenants(arcadia)
+      const found = tenants.find((t) => t.slug === slug)
+      return found ? summarize(found) : null
+    },
+  },
+]
+
+/** OpenAI-format tool list to pass into ChatRequest.tools. */
+export function getOpenAITools(): Tool[] {
+  return TOOLS.map((t) => ({
+    name: t.name,
+    description: t.description,
+    parameters: t.parameters,
+  }))
+}
+
+function summarize(t: Tenant) {
+  return {
+    id: t.id,
+    slug: t.slug,
+    name: t.name,
+    status: t.status,
+    plan: t.plan?.name ?? null,
+    inserted_at: t.inserted_at,
+  }
+}
+
+const TOOL_BY_NAME = new Map(TOOLS.map((t) => [t.name, t]))
+
+function safeJson(value: unknown): string {
+  try {
+    const text = JSON.stringify(value, null, 2)
+    if (text.length > 6000) return text.slice(0, 6000) + "\n…(truncated)"
+    return text
+  } catch {
+    return "(unserializable)"
+  }
+}
+
+/** Run a list of provider-native tool calls and return `tool` role messages
+ *  ready to push back into useChat history. */
+export async function runLLMToolCalls(
+  calls: LLMToolCall[],
+  ctx: ToolCtx,
+  opts: { allowWrites?: boolean } = {},
+): Promise<{
+  results: ToolResult[]
+  toolMessages: { role: "tool"; content: string; toolCallId: string; name: string }[]
+}> {
+  const results: ToolResult[] = []
+  const toolMessages: { role: "tool"; content: string; toolCallId: string; name: string }[] = []
+  for (const call of calls) {
+    const def = TOOL_BY_NAME.get(call.name)
+    let parsed: Record<string, unknown> = {}
+    try {
+      parsed = call.arguments ? (JSON.parse(call.arguments) as Record<string, unknown>) : {}
+    } catch {
+      const err = `Could not parse arguments JSON: ${call.arguments}`
+      results.push({ name: call.name, args: {}, ok: false, error: err })
+      toolMessages.push({ role: "tool", content: JSON.stringify({ error: err }), toolCallId: call.id, name: call.name })
+      continue
+    }
+    if (!def) {
+      const err = `Unknown tool: ${call.name}`
+      results.push({ name: call.name, args: parsed, ok: false, error: err })
+      toolMessages.push({ role: "tool", content: JSON.stringify({ error: err }), toolCallId: call.id, name: call.name })
+      continue
+    }
+    if (def.isWrite && !opts.allowWrites) {
+      const err = "Write tools require user confirmation."
+      results.push({ name: call.name, args: parsed, ok: false, error: err })
+      toolMessages.push({ role: "tool", content: JSON.stringify({ error: err }), toolCallId: call.id, name: call.name })
+      continue
+    }
+    try {
+      const data = await def.run(parsed, ctx)
+      results.push({ name: call.name, args: parsed, ok: true, data })
+      toolMessages.push({ role: "tool", content: safeJson(data), toolCallId: call.id, name: call.name })
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err)
+      results.push({ name: call.name, args: parsed, ok: false, error: msg })
+      toolMessages.push({ role: "tool", content: JSON.stringify({ error: msg }), toolCallId: call.id, name: call.name })
+    }
+  }
+  return { results, toolMessages }
+}
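The commit message describes an auto-loop that reads `toolCalls` off the assistant message and pushes tool results back with `continueChat()`. That caller is not part of this file, so the block below is only a minimal sketch of such a loop under stated assumptions: `ChatFn`, `RunTools`, `chatWithTools`, and `MAX_ROUNDS` are hypothetical stand-ins rather than the `@crema/llm-ui` API, and both the model and the tool runner are stubbed so the example runs on its own.

```typescript
// Hypothetical message shapes, mirroring the fields used in this commit.
type LLMToolCall = { id: string; name: string; arguments: string }
type UserMsg = { role: "user"; content: string }
type AssistantMsg = { role: "assistant"; content: string; toolCalls?: LLMToolCall[] }
type ToolMsg = { role: "tool"; content: string; toolCallId: string; name: string }
type Msg = UserMsg | AssistantMsg | ToolMsg

// Stand-in for one model turn; a real app would call its chat client here.
type ChatFn = (history: Msg[]) => Promise<AssistantMsg>
// Stand-in for `(await runLLMToolCalls(calls, ctx)).toolMessages`.
type RunTools = (calls: LLMToolCall[]) => Promise<ToolMsg[]>

// Assumed cap so a model that keeps requesting tools cannot loop forever.
const MAX_ROUNDS = 4

async function chatWithTools(history: Msg[], chat: ChatFn, runTools: RunTools): Promise<Msg[]> {
  const out: Msg[] = [...history]
  for (let round = 0; round < MAX_ROUNDS; round++) {
    const reply = await chat(out)
    out.push(reply)
    const calls = reply.toolCalls ?? []
    if (calls.length === 0) break // plain answer, nothing left to execute
    const toolMessages = await runTools(calls)
    out.push(...toolMessages) // results go back as `tool` role messages
  }
  return out
}

// Stubbed model: asks for list_tenants once, then answers from the result.
const fakeChat: ChatFn = async (history) => {
  const sawToolResult = history.some((m) => m.role === "tool")
  const reply: AssistantMsg = sawToolResult
    ? { role: "assistant", content: "There are 2 tenants." }
    : { role: "assistant", content: "", toolCalls: [{ id: "call_1", name: "list_tenants", arguments: "{}" }] }
  return reply
}

// Stubbed tool runner returning canned JSON for every call.
const fakeTools: RunTools = async (calls) =>
  calls.map((c): ToolMsg => ({
    role: "tool",
    content: '[{"slug":"acme"},{"slug":"default"}]',
    toolCallId: c.id,
    name: c.name,
  }))

const demo = chatWithTools([{ role: "user", content: "How many tenants?" }], fakeChat, fakeTools)
```

The round cap is a design choice of this sketch; the commit does not say whether the real loop bounds its iterations.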
--- a/app/lib/agents.ts
+++ b/app/lib/agents.ts
@@ -13,39 +13,39 @@ export type Agent = {

 export const DEFAULT_AGENTS: Agent[] = [
  {
-    id: "generalist",
+    id: "operator",
    name: "Atlas",
-    role: "Generalist",
+    role: "Platform Operator",
    prompt:
-      "You handle anything: chat, planning, summaries, casual questions. Match the user's tone. Keep replies as long as the task deserves — terse for quick questions, detailed when explaining.",
+      "You're the platform admin's day-to-day operator inside Arcadia Admin. Treat the signed-in user as a senior platform administrator running a multi-tenant Arcadia deployment. Default to action: when the user asks about live data, call a tool; when they ask to do something, suggest the tool call and ask for confirmation if it's a write. Prefer tenant slugs over UUIDs in conversation. Keep replies tight — operators read fast.",
  },
  {
-    id: "coder",
-    name: "Forge",
-    role: "Software engineer",
+    id: "auditor",
+    name: "Ledger",
+    role: "Auditor",
    prompt:
-      "You are a senior software engineer. Write idiomatic, well-typed code. Prefer concrete examples over abstract advice. When asked to fix a bug, identify root cause before patching. Use markdown code blocks with language tags. Mention edge cases briefly when relevant.",
+      "You're an audit-focused assistant inside Arcadia Admin. Specialise in audit logs, access reviews, and 'who did what when' questions. Always cite the actor_type (user / platform_admin / api_key / system) and timestamp when summarising audit entries. Be cautious about claims you can't back with a tool result — call a tool first.",
  },
  {
-    id: "writer",
-    name: "Inkwell",
-    role: "Writer",
+    id: "triage",
+    name: "Beacon",
+    role: "Incident Triage",
    prompt:
-      "You are a prose writer. Produce vivid, well-paced text — short stories, copy, emails, essays. Vary sentence length. Show, don't tell. When the user asks for a draft, deliver the draft, not a description of it.",
+      "You're an incident-triage assistant inside Arcadia Admin. When the user reports a problem (a tenant member can't sign in, a billing call is 402'ing, a webhook is failing), walk the diagnostic tree: identify the tenant, check tenant status, check the user's roles, check the billing-config / api-metering / feature-flag overrides as relevant. Suggest impersonation only when it's the right escalation. Keep a clear hypothesis → check → result rhythm.",
  },
  {
-    id: "researcher",
-    name: "Pilot",
-    role: "Researcher",
+    id: "analyst",
+    name: "Tally",
+    role: "Platform Analyst",
    prompt:
-      "You are a careful researcher. Structure answers as: claim → evidence → caveat. Distinguish what is well-established from what is uncertain. Refuse to fabricate citations — if you don't know, say so.",
+      "You're an analyst inside Arcadia Admin. Answer numerical and aggregate questions across the platform: tenant counts by status, plan distribution, audit-log volume, growth. Always pull live data via tools — never guess from stale snapshots. Present findings in plain prose first, then a small table when the breakdown helps.",
  },
  {
    id: "ui-driver",
    name: "Cursor",
    role: "UI Operator",
    prompt:
-      "You specialize in driving this app's UI on the user's behalf. Prefer doing over explaining. When the user asks for an action, emit an action block immediately. When they ask a question about the app, answer concisely and offer to do it.",
+      "You specialise in driving Arcadia Admin's UI on the operator's behalf. Prefer doing over explaining. When the user asks for an action that maps to a UI element, emit an action block immediately (using `data-action` ids the host has documented). For data questions, prefer tool calls over UI navigation.",
  },
 ]

@@ -64,6 +64,14 @@ function isAgent(v: unknown): v is Agent {
  )
 }

+// Old Vibespace agent ids — used to auto-migrate operators stuck on the
+// generic defaults from before Arcadia Admin had its own personas.
+const LEGACY_AGENT_IDS = new Set(["generalist", "coder", "writer", "researcher"])
+
+function isLegacyDefaultSet(agents: Agent[]): boolean {
+  return agents.some((a) => LEGACY_AGENT_IDS.has(a.id))
+}
+
 function readFromStorage(): Agent[] {
  if (typeof window === "undefined") return DEFAULT_AGENTS
  try {
@@ -72,7 +80,14 @@ function readFromStorage(): Agent[] {
    const parsed = JSON.parse(raw)
    if (!Array.isArray(parsed)) return DEFAULT_AGENTS
    const cleaned = parsed.filter(isAgent)
-    return cleaned.length > 0 ? cleaned : DEFAULT_AGENTS
+    if (cleaned.length === 0) return DEFAULT_AGENTS
+    if (isLegacyDefaultSet(cleaned)) {
+      // Auto-migrate: stored set still contains pre-arcadia personas.
+      localStorage.setItem(STORAGE_KEY, JSON.stringify(DEFAULT_AGENTS))
+      localStorage.removeItem(ACTIVE_KEY)
+      return DEFAULT_AGENTS
+    }
+    return cleaned
  } catch {
    return DEFAULT_AGENTS
  }
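Because `isLegacyDefaultSet` uses `some`, a single legacy id is enough to reset the whole stored list. The sketch below replays that rule against an in-memory stand-in for `localStorage`; the trimmed `Agent` type, the `DEFAULTS` list, the `store` map, and both key strings are illustrative assumptions, not the app's real values.

```typescript
// Trimmed stand-ins: the real Agent type and storage keys live in agents.ts.
type Agent = { id: string; name: string }
const STORAGE_KEY = "demo:agents" // placeholder, not the app's real key
const ACTIVE_KEY = "demo:active-agent" // placeholder, not the app's real key
const DEFAULTS: Agent[] = [{ id: "operator", name: "Atlas" }]
const LEGACY_AGENT_IDS = new Set(["generalist", "coder", "writer", "researcher"])

// Minimal in-memory substitute for window.localStorage.
const store = new Map<string, string>()

function readAgents(): Agent[] {
  const raw = store.get(STORAGE_KEY)
  if (!raw) return DEFAULTS
  const stored = JSON.parse(raw) as Agent[]
  // Same rule as the diff: one legacy id triggers a full reset.
  if (stored.some((a) => LEGACY_AGENT_IDS.has(a.id))) {
    store.set(STORAGE_KEY, JSON.stringify(DEFAULTS))
    store.delete(ACTIVE_KEY)
    return DEFAULTS
  }
  return stored
}

// A stored set that mixes one legacy persona with one custom persona.
store.set(
  STORAGE_KEY,
  JSON.stringify([
    { id: "coder", name: "Forge" },
    { id: "my-bot", name: "Custom" },
  ]),
)
store.set(ACTIVE_KEY, "my-bot")

// Reading it back yields the defaults and clears the active selection.
const migrated = readAgents()
```

Under this rule a custom persona stored next to a legacy one is dropped as well; the commit does not state whether that is intended.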
--- a/app/lib/arcadia-knowledge.ts
+++ b/app/lib/arcadia-knowledge.ts
@@ -0,0 +1,35 @@
+// Domain primer baked into the assistant's system prompt so it understands
+// what arcadia-app is, what platform admins do, and how the data model fits
+// together. Keep this tight — it costs context tokens on every turn.
+
+export const ARCADIA_KNOWLEDGE = `Arcadia (the backend you administer):
+
+Arcadia is a multi-tenant SaaS backend (Elixir/Phoenix umbrella, OpenAPI at /api/v1, server-rendered platform UI at /platform/*). This admin app (Arcadia Admin) is one of several clients — it talks to Arcadia over JSON, scoped by an X-Tenant-ID header and a Bearer JWT.
+
+Core entities and how they relate:
+
+- **Tenant** — an isolated workspace (a customer org). Identified by a slug (e.g. "acme", "platform-admin", "default") and a UUID id. Owns its own users, roles, billing config, branding, settings. Most data is tenant-scoped.
+- **Platform admin** — a separate identity that lives in the platform_admins table, NOT in any tenant. The signed-in operator using this app is one. Can read/write across all tenants. The first one is bootstrapped via /setup; \`is_root: true\` flags the original.
+- **User** — a member of a single tenant. Has email + password (or SSO), system roles (\`admin\` / \`user\` / \`viewer\`) plus optional custom roles. Login goes through POST /api/v1/auth/login with the tenant slug in X-Tenant-ID.
+- **Role** — permission bundle scoped to a tenant. \`admin\` / \`user\` / \`viewer\` are seeded as system roles per tenant. Permissions are wildcard-ish strings (e.g. \`tenants:read\`, \`*\`).
+- **Plan** — subscription tier attached to a tenant: name + limits (seats, storage, API quota). Drives billing.
+- **Audit log entry** — append-only record of who did what. \`actor_type\` is one of: \`user\`, \`platform_admin\`, \`api_key\`, \`system\`. Per-tenant and platform-wide entries coexist.
+- **Feature flag** — boolean / variant gate. Platform-wide default + per-tenant override.
+- **Storage / billing config / SSO IdP / inbound webhook / API quota / data retention policy / approval workflow / announcement** — per-tenant or platform-level configurations the operator can manage.
+
+Tenant lifecycle (status field):
+
+- **active** — normal operation. Members can sign in. Default state.
+- **suspended** — members blocked from signing in. Reversible: activate to restore. Use for temporary holds (overdue invoice, abuse investigation).
+- **deactivated** — stronger stop. Treat as effectively closed; usually flagged as terminal even if technically reversible. Use only when offboarding.
+
+Things to keep in mind when assisting:
+
+- Prefer tenant **slugs** in user-facing language ("the acme tenant"); slugs are stable, ids are UUIDs that aren't useful to humans.
+- "Platform admin" ≠ "admin role inside a tenant". The first acts cross-tenant; the second is scoped to one tenant.
+- Writes are auditable. Suggest the user double-check tenant slug and impact before suspend/deactivate. Deactivate is harsher than suspend — only use when clearly intended.
+- The operator can impersonate tenant users for debugging (POST /api/v1/admin/impersonate/:user_id) — surface this when they ask "why can't user X log in".
+- Quotas / rate cards / billing config errors usually surface as 402/403 from /api/v1 endpoints — diagnose by checking the tenant's billing-config and api-metering quotas.
+- The reference Phoenix app lives at \`reference/arcadia-app/\` in the workspace; its OpenAPI spec is at /api/openapi (sync via \`node ../lib-arcadia-client/scripts/sync-spec.mjs\`).
+
+When the user asks something that maps to a tool, call it. When they ask about a concept, explain it from this primer in plain language. When they ask to do something destructive, summarise the impact in one sentence and ask for confirmation before suggesting a tool call.`
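The commit message says /assistant builds an admin preface from role + primer + persona + ctx, but that builder is not in this file. As a rough sketch of how those four parts could be joined into one system prompt, assuming a hypothetical `buildAdminPreface` helper and a trimmed `Persona` type (neither appears in the diff):

```typescript
// Hypothetical helper: the real preface builder is not shown in this diff.
type Persona = { name: string; role: string; prompt: string }

function buildAdminPreface(parts: {
  roleLine: string // who the assistant is acting for
  primer: string // e.g. ARCADIA_KNOWLEDGE
  persona: Persona // the active agent
  context: string // e.g. the output of formatAdminContextForPrompt()
}): string {
  const personaBlock = `Persona: ${parts.persona.name} (${parts.persona.role})\n${parts.persona.prompt}`
  // Drop empty parts so a missing primer or context leaves no blank section.
  return [parts.roleLine, parts.primer, personaBlock, parts.context]
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .join("\n\n")
}

// Short placeholder strings stand in for the real primer and context.
const preface = buildAdminPreface({
  roleLine: "You are the platform admin's assistant.",
  primer: "Arcadia is a multi-tenant SaaS backend.",
  persona: { name: "Atlas", role: "Platform Operator", prompt: "Keep replies tight." },
  context: "Route: /tenants",
})
```

Since the primer is sent on every turn, keeping it first and the live context last puts the most volatile text at the end of the prompt; that ordering is this sketch's choice, following the order listed in the commit message.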
--- a/app/lib/threads.ts
+++ b/app/lib/threads.ts
@@ -4,10 +4,16 @@
 import { useEffect, useSyncExternalStore } from "react"

 export type ThreadMessage = {
-  role: "user" | "assistant"
+  role: "user" | "assistant" | "tool"
  content: string
  /** Persona that authored this assistant message (omitted for user msgs). */
  agentId?: string
+  /** Native tool calls attached to an assistant message. */
+  toolCalls?: { id: string; name: string; arguments: string }[]
+  /** Tool role only — id of the matching assistant tool_call. */
+  toolCallId?: string
+  /** Tool role only — function name. */
+  name?: string
 }

 export type Thread = {