Hybrid traditional + AI-first webapp scaffold. Sibling to crema-app-template, adds the AI assistant surface, command bus, scripts dialog, and virtual cursor.
What's pre-wired:
- 6 routes: Overview, Resources, Activity, Assistant, Library, Settings
- Collapsible rail + appbar + avatar dropdown shell (template code, not a lib)
- Mobile sheet at <md
- /assistant: streaming chat via @crema/llm-ui, mock fallback, model selector, token meter, retry probe, stop-while-streaming, persistent UI Control toggle
- /settings: editable LM Studio endpoint + context window + response cap, with test-connection button
- Markdown rendering for assistant replies; ```action``` blocks rendered as a small "Ran N actions" pill
- ⌘⇧P script runner dialog + Play icon in the appbar
- Two demo scripts in public/scripts/
- mightypix theme as default, scoped via <AppShell theme="mightypix">
Libs wired in tsconfig + app.css:
- @crema/action-bus (the bus, parser, runner, cursor, provider, ws, llm-bridge)
- @crema/llm-ui, @crema/chat-ui, @crema/aifirst-ui, @crema/notification-ui
- lib-theme-mightypix
Docs:
- README.md: pitch + quick start + structure
- docs/AI_FIRST.md: full system tour (data-action contract, bus, DSL, scripts, cursor, LLM integration)
- app/components/layout/THEME_CONTRACT.md: every CSS variable a theme must declare
- CLAUDE.md: orientation for an LLM working in the repo
Genericized from comfy-cloud (the original prototype):
- Brand defaults to "App" / Sparkles icon (override via app/lib/identity.ts)
- User defaults to a stub (swap useUser() for real auth)
- localStorage namespace is "crema.*" (was "comfy.*")
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AI-first system tour
The "anything can drive the UI" architecture — the contract every interactive element opts into, the command bus that dispatches actions, the DSL that makes scripts and LLM output ergonomic, and the LLM integration that wires the whole thing into a chat surface.
It's one system, designed end-to-end. Reading top-to-bottom is the fastest way to understand it.
Overview
┌─ producers ─────────────────────┐    ┌─ executor ──────────────────┐
│ ⌨ console (window.commandBus)   │    │ command-bus.ts              │
│ 📜 scripts (.script DSL)        │ ─▶ │ • dispatch(cmd)             │
│ 🤖 LLM (```action blocks)       │    │ • handlers map              │
│ 🔌 WebSocket (optional)         │    │ • vars + history            │
└─────────────────────────────────┘    │ • listActions / readState   │
                                       └─────────┬───────────────────┘
                                                 │
                             ┌───────────────────┴───────────────────┐
                             │                                       │
                             ▼                                       ▼
                      ┌──────────────┐                      ┌──────────────────┐
                      │ DOM actions  │                      │ replay layer     │
                      │ (click/fill/ │                      │ • virtual cursor │
                      │ navigate/…)  │                      │ • ripple on click│
                      └──────────────┘                      └──────────────────┘
Every interactive element opts in by adding data-action="<id>". That's the
entire contract. From there, the bus can find it, scripts can target it, and
the LLM can drive it.
The [data-action] convention
Every interactive UI element gets a stable id:
<button data-action="sidebar-toggle">…</button>
<input data-action="appbar-search" />
<NavLink data-action="nav-resources" to="/resources">Resources</NavLink>
Naming: lowercase, kebab-case, prefixed by the surface it lives in:
| Prefix | Surface |
|---|---|
| nav-* | Sidebar nav links (nav-overview, nav-resources, …) |
| nav-mobile-* | Mobile sheet versions of nav |
| appbar-* | Top bar controls (appbar-search, appbar-notifications) |
| avatar-* | Avatar dropdown items |
| <route>-* | Route-specific (assistant-clear, settings-save) |
| home-tile-*, run-script-*, etc. | Domain-grouped |
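The naming rule above is mechanical enough to check in code. The following is a minimal sketch, assuming ids are limited to lowercase letters, digits, and hyphens; isValidActionId and surfaceOf are hypothetical helpers for illustration, not part of the template.

```typescript
// Hypothetical helpers: they only illustrate the naming convention.
const ACTION_ID_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Fixed surface prefixes from the table, longest first so that
// "nav-mobile-" wins over "nav-".
const SURFACE_PREFIXES = ["nav-mobile-", "nav-", "appbar-", "avatar-"];

// True when the id is lowercase kebab-case.
function isValidActionId(id: string): boolean {
  return ACTION_ID_PATTERN.test(id);
}

// Returns the surface an id belongs to, e.g. "nav-mobile" or "settings".
function surfaceOf(id: string): string {
  for (const prefix of SURFACE_PREFIXES) {
    if (id.indexOf(prefix) === 0) return prefix.slice(0, -1);
  }
  // Route- or domain-specific ids: fall back to the first segment.
  const dash = id.indexOf("-");
  return dash === -1 ? id : id.slice(0, dash);
}
```

A check like this could run in a unit test or lint step, so that a stray camelCase id is caught before a script or the LLM tries to target it.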
Why this works: the bus introspects the DOM at dispatch time —
commandBus.listActions() returns every visible [data-action] in the page
right now. New components are automatically scriptable as long as they tag
their interactive slots. No central registry to maintain.
One gotcha: elements rendered in portals (closed dropdowns, sheets,
dialogs) appear in the DOM but aren't visible. The listActions() filter
uses Element.checkVisibility() + offsetParent + bbox checks to exclude
those — so the LLM doesn't see actions it can't actually click.
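The visibility filter can be sketched roughly as follows. This is an illustration under assumptions, not the code in command-bus.ts: how the three checks are combined is a guess, and ActionElementLike is a hypothetical structural subset of HTMLElement that lets the snippet run outside a browser.

```typescript
// Structural subset of HTMLElement (hypothetical type for this sketch).
interface ActionElementLike {
  getAttribute(attr: string): string | null;
  checkVisibility?: () => boolean;
  offsetParent: unknown;
  getBoundingClientRect(): { width: number; height: number };
}

function isActionVisible(el: ActionElementLike): boolean {
  // Prefer the native check when the browser provides it.
  if (typeof el.checkVisibility === "function" && !el.checkVisibility()) return false;
  // offsetParent is null inside display:none subtrees. It is also null for
  // position:fixed elements, so a real filter may need an exception here.
  if (el.offsetParent === null) return false;
  // Zero-size boxes cannot be clicked.
  const box = el.getBoundingClientRect();
  return box.width > 0 && box.height > 0;
}

// Collect the ids of all elements that pass the visibility check.
function visibleActionIds(elements: ActionElementLike[]): string[] {
  const ids: string[] = [];
  for (const el of elements) {
    const id = el.getAttribute("data-action");
    if (id !== null && isActionVisible(el)) ids.push(id);
  }
  return ids;
}
```

In a browser, the input would be something like Array.from(document.querySelectorAll<HTMLElement>("[data-action]")).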
The command bus
app/lib/command-bus.ts. Single dispatch point. Built-in handlers:
| Command | Args | Purpose |
|---|---|---|
| navigate | path | React Router navigation |
| click | target | Find [data-action=target], scroll into view, click |
| fill | target, value | Set input value, fire input + change events |
| submit | target | Submit the form containing target |
| select | target, value | Set <select> value |
| wait | ms | Sleep |
| wait_for | target, timeout? | Poll for element existence (default 5s timeout) |
| scroll | target? | scrollIntoView, or page bottom if no target |
| read | target? | Return innerText (truncated to 4000 chars) |
| expect | target, op, value? | Assert: to_contain, to_be_visible, to_have_value |
| set | name, value | Set a variable for later interpolation |
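To make the three expect operators concrete, here is a minimal sketch of how they could be evaluated. It assumes the handler first captures a snapshot of the target; TargetSnapshot and checkExpectation are hypothetical names, and the failure messages are placeholders.

```typescript
type ExpectOp = "to_contain" | "to_be_visible" | "to_have_value";

// Hypothetical snapshot of the target element at assertion time.
interface TargetSnapshot {
  text: string;     // e.g. the target's innerText
  visible: boolean; // result of a visibility check
  value: string;    // current value for inputs, "" otherwise
}

// Returns null on success, or a human-readable failure message.
function checkExpectation(snap: TargetSnapshot, op: ExpectOp, expected?: string): string | null {
  switch (op) {
    case "to_be_visible":
      return snap.visible ? null : "expected target to be visible";
    case "to_contain":
      if (expected === undefined) return "to_contain needs a value";
      return snap.text.indexOf(expected) !== -1 ? null : 'expected text to contain "' + expected + '"';
    case "to_have_value":
      if (expected === undefined) return "to_have_value needs a value";
      return snap.value === expected ? null : 'expected value "' + expected + '", got "' + snap.value + '"';
  }
}
```

Only to_be_visible works without a value, which matches the optional value? argument in the table.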
Variables
Every command can declare as: "name" — its return value gets stored under
that name. Other commands reference it via $name interpolation.
// await each dispatch so the read has stored "id" before the click runs
await commandBus.dispatch({ type: "read", target: "row-1", as: "id" })
await commandBus.dispatch({ type: "click", target: "$id" }) // $id resolves at dispatch
In DSL form:
$id = read row-1
click $id
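The store-then-interpolate cycle can be sketched in a few lines. This is an assumption-laden illustration: BusVars, storeResult, and interpolate are hypothetical names, and leaving unknown $names untouched (rather than raising) is a choice made for the sketch, not confirmed behaviour of the bus.

```typescript
type BusVars = Record<string, string>;

// Store a command's return value when it declared as: "name".
function storeResult(vars: BusVars, asName: string | undefined, result: string): void {
  if (asName !== undefined) vars[asName] = result;
}

// Replace each $name with its stored value; unknown names are left as-is.
function interpolate(arg: string, vars: BusVars): string {
  return arg.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, (match: string, varName: string) => {
    const stored = vars[varName];
    return stored === undefined ? match : stored;
  });
}
```

With vars holding id = "row-7-open", interpolate("$id", vars) yields "row-7-open", which is what the click in the example above would then look up as its target.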
Custom handlers
Register your own anywhere:
import { commandBus } from "~/lib/command-bus"
// Add a custom "ring" command that briefly animates its target.
const unregister = commandBus.register("ring", async (cmd) => {
  const el = document.querySelector(`[data-action="${cmd.target}"]`)
  el?.classList.add("animate-ring")
  setTimeout(() => el?.classList.remove("animate-ring"), 1200)
})
// dispatch as: { type: "ring", target: "save-button" }
// call unregister() to remove the handler again
register() returns an unregister function for cleanup.
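The register/unregister pattern is easy to reproduce in isolation. The sketch below is a hypothetical MiniBus, not the real command bus: it is synchronous for brevity (the real handlers are async), and the log field only stands in for the bus history.

```typescript
interface BusCommand { type: string; target?: string; [key: string]: unknown }
type BusHandler = (cmd: BusCommand) => unknown;

class MiniBus {
  private handlers: Record<string, BusHandler> = {};
  // Every dispatch is recorded, whether it succeeded or not.
  readonly log: Array<{ cmd: BusCommand; ok: boolean; error?: string }> = [];

  // Registers a handler and returns a function that removes it again.
  register(type: string, handler: BusHandler): () => void {
    this.handlers[type] = handler;
    return () => {
      // Only remove the entry if it is still the handler we registered.
      if (this.handlers[type] === handler) delete this.handlers[type];
    };
  }

  // Single dispatch point: look up the handler by command type and run it.
  dispatch(cmd: BusCommand): unknown {
    const handler = this.handlers[cmd.type];
    if (handler === undefined) {
      const error = `unknown command: ${cmd.type}`;
      this.log.push({ cmd, ok: false, error });
      throw new Error(error);
    }
    const result = handler(cmd);
    this.log.push({ cmd, ok: true });
    return result;
  }
}
```

In a React component, the returned function fits naturally as the cleanup of a useEffect, so handlers disappear when the widget that owns them unmounts.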
Console API
The provider exposes window.commandBus, window.runScript, and
window.runScriptText. Open devtools and try:
commandBus.listActions().filter(a => a.id.startsWith("nav-"))
commandBus.readState() // visible page text
commandBus.history // every dispatch + result/error
runScript("demo-tour") // load + run /scripts/demo-tour.script
runScriptText('click sidebar-toggle\nwait 500\nclick nav-assistant')
The DSL
Plain text, one command per line. Parses to canonical JSON. Both layers share the same vocabulary; the DSL is just sugar.
# Comfy Cloud — short tour through the rail
# speed: 0.9
click sidebar-toggle
wait 500
click nav-resources
wait_for nav-resources
wait 500
click nav-assistant
wait 700
# Variables
$id = read row-1
click $id
# Assertions
fill appbar-search "acme"
expect resources-table to_contain "Acme"
Syntax:
- One command per line.
- Whitespace-tolerant. Quote values with spaces ("hello world"). Backslash-escape inside quotes.
- # starts a comment. The # speed: <n> directive at the top sets cursor animation speed (default 1).
- $name = <command> assigns the command's return value to a variable. $name in args is interpolated at dispatch.
- run other-script includes another script (resolved as /scripts/other-script.script).
API:
import { parseScript, parseLine, stringifyScript } from "~/lib/command-parser"
import { runScript, runScriptText } from "~/lib/command-script"
const { options, commands } = parseScript(text) // → JSON commands
const dsl = stringifyScript(commands, options) // → text (round-trips)
await runScriptText(dsl)
await runScript("demo-tour")
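For readers curious what parsing a single DSL line involves, here is a minimal sketch of the syntax rules listed earlier: quoting, backslash escapes, comments, and $name = assignment. tokenizeDsl and parseDslLine are hypothetical names, the real parseLine may differ, and the sketch assumes spaces around = and does not handle trailing comments.

```typescript
interface ParsedDslLine {
  assignTo?: string; // set when the line starts with "$name ="
  type: string;      // command name, e.g. "click"
  args: string[];    // remaining tokens with quotes removed
}

// Split on whitespace, honouring double quotes and backslash escapes inside them.
function tokenizeDsl(line: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let hasToken = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === "\\" && i + 1 < line.length) {
        current += line.charAt(++i); // take the escaped character literally
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
      hasToken = true; // so "" still yields an empty token
    } else if (ch === " " || ch === "\t") {
      if (hasToken) tokens.push(current);
      current = "";
      hasToken = false;
    } else {
      current += ch;
      hasToken = true;
    }
  }
  if (hasToken) tokens.push(current);
  return tokens;
}

// Returns null for blank lines and comments (including the speed directive).
function parseDslLine(line: string): ParsedDslLine | null {
  const trimmed = line.trim();
  if (trimmed === "" || trimmed.charAt(0) === "#") return null;
  const tokens = tokenizeDsl(trimmed);
  let assignTo: string | undefined;
  const first = tokens[0];
  if (tokens.length >= 3 && first !== undefined && first.charAt(0) === "$" && tokens[1] === "=") {
    assignTo = first.slice(1);
    tokens.splice(0, 2);
  }
  const type = tokens.shift();
  if (type === undefined) return null;
  return assignTo === undefined ? { type, args: tokens } : { assignTo, type, args: tokens };
}
```

Under these assumptions, fill appbar-search "acme corp" parses to type fill with two args, and $id = read row-1 parses to type read with assignTo set to id.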
Scripts
Live in public/scripts/*.script. Two ways to invoke at runtime:
- Dialog: appbar Play icon, or ⌘⇧P keyboard shortcut. Lists known scripts + a paste-DSL textarea. See app/components/scripts-dialog.tsx.
- Console: runScript("demo-tour") or runScriptText("…").
Sub-scripts compose:
# in a parent script
run setup-fixtures
run actual-test
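One way to picture sub-script composition is as include expansion. The sketch below is an assumption: the real runner may execute nested scripts in place rather than flatten them, the cycle guard is an addition for safety, and ScriptSources is a hypothetical in-memory stand-in for fetching /scripts/<name>.script.

```typescript
// Script name -> DSL text (stand-in for loading /scripts/<name>.script).
type ScriptSources = Record<string, string>;

// Flatten a script into its command lines, expanding "run <name>" includes.
function expandIncludes(scriptName: string, sources: ScriptSources, stack: string[] = []): string[] {
  if (stack.indexOf(scriptName) !== -1) {
    throw new Error("include cycle: " + stack.concat(scriptName).join(" -> "));
  }
  const text = sources[scriptName];
  if (text === undefined) throw new Error("unknown script: " + scriptName);

  const out: string[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue; // skip blanks and comments
    if (line.indexOf("run ") === 0) {
      const child = line.slice(4).trim();
      for (const included of expandIncludes(child, sources, stack.concat(scriptName))) {
        out.push(included);
      }
    } else {
      out.push(line);
    }
  }
  return out;
}
```

Keeping setup and assertions in separate scripts this way lets several test flows share the same fixtures.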
Virtual cursor
app/lib/virtual-cursor.ts. A floating SVG cursor + ripple element appended
to document.body. The bus's beforeCommand hook moves the cursor to the
target element before each command and plays a ripple on click. Speed is
controlled by the script's # speed: header (default 1; lower = slower
animation).
The cursor is marked aria-hidden / role="presentation" so screen readers
ignore it. To run silently (e.g. in tests), pass { silent: true } to
dispatch() or run().
LLM integration
app/lib/llm-tools.ts. Two responsibilities:
1. Build the system prompt
buildSystemPrompt({ path, includeActions }) returns a string with:
- A short preface ("You are the assistant in Comfy Cloud…")
- A compact DSL reference (~120 tokens — kept tight for small context windows)
- The current route
- A live snapshot of every visible [data-action] on screen, formatted as - <id>: <label> (skipping invisible/portal items)
The Assistant route rebuilds this on every send when UI Control is on, so the model always sees the current page's actions.
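The prompt assembly described above could look roughly like this. It is a sketch, not the code in llm-tools.ts: buildPromptSketch, ListedAction, and PromptInput are hypothetical, the preface and DSL reference strings are abbreviated placeholders, and the action list is passed in rather than read from the DOM.

```typescript
interface ListedAction { id: string; label: string }

interface PromptInput {
  path: string;
  includeActions: boolean;
  actions: ListedAction[]; // assumed to be filtered to visible elements already
  preface?: string;
}

// Placeholder wording; the real preface and DSL reference are longer.
const DEFAULT_PREFACE = "You are the assistant in this app.";
const DSL_REFERENCE = 'Commands: navigate <path> | click <id> | fill <id> "<value>" | wait <ms>';

function buildPromptSketch(input: PromptInput): string {
  const lines: string[] = [
    input.preface === undefined ? DEFAULT_PREFACE : input.preface,
    "",
    DSL_REFERENCE,
    "",
    "Current route: " + input.path,
  ];
  if (input.includeActions) {
    lines.push("", "Available actions:");
    // One line per action, in the "- <id>: <label>" format.
    for (const action of input.actions) lines.push("- " + action.id + ": " + action.label);
  }
  return lines.join("\n");
}
```

Because the action list is the only part that changes between sends, rebuilding the prompt per message stays cheap even on small context windows.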
2. Extract action blocks from streamed replies
The system prompt teaches the model to emit a fenced ```action block
when the user asks it to do something. After each assistant turn ends, the
Assistant route runs runActionBlocks(message.content) which:
- Regex-extracts every ```action ... ``` block.
- Feeds each through runScriptText → DSL parser → command bus.
- Returns { ran, errors } for the route to surface as a status pill ("Ran 2 actions").
The action-block fence renders as a small pill in the chat bubble (not raw
text) — see app/components/assistant/message-body.tsx.
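The extract-and-run steps can be sketched as below. Several things here are assumptions: the exact regex, the synchronous runDsl callback (the real runScriptText is async), and the choice to have runDsl return the number of commands it executed so that ran counts commands rather than blocks. The fence marker is built from escapes so the snippet does not contain a literal fence itself.

```typescript
// Three backticks, written as escapes.
const FENCE = "\u0060\u0060\u0060";
const ACTION_BLOCK = new RegExp(FENCE + "action[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE, "g");

// Return the DSL body of every action block in an assistant message.
function extractActionBlocksSketch(content: string): string[] {
  const blocks: string[] = [];
  ACTION_BLOCK.lastIndex = 0; // the regex is global, so reset between calls
  let match = ACTION_BLOCK.exec(content);
  while (match !== null) {
    const body = match[1];
    if (body !== undefined) blocks.push(body.trim());
    match = ACTION_BLOCK.exec(content);
  }
  return blocks;
}

interface ActionRunResult { ran: number; errors: string[] }

// Run each block through the supplied DSL runner and tally the outcome.
function runActionBlocksSketch(content: string, runDsl: (dsl: string) => number): ActionRunResult {
  const result: ActionRunResult = { ran: 0, errors: [] };
  for (const dsl of extractActionBlocksSketch(content)) {
    try {
      result.ran += runDsl(dsl);
    } catch (err) {
      result.errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  return result;
}
```

Running blocks only after the turn ends, as the route does, avoids acting on a half-streamed block whose closing fence has not arrived yet.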
Format the model emits
I'll take you to resources.
```action
navigate /resources
wait_for nav-resources
```
Done — anything else?
The "rules" the system prompt teaches the model:
- Only emit a block when asked to do something. Questions and chitchat reply normally.
- Use only ids from the "Available actions" list.
- A short sentence + the block. Optional follow-up after.
- Never invent target ids.
Smaller models (≤7B) sometimes drift: the system prompt is explicit about each rule, but you'll still see occasional invented ids or extra prose. Larger models, including Claude, follow it near-perfectly.
Token budget
useChat is called with these reqExtras:
- system, built fresh per send via buildSystemPrompt(...)
- messages, pre-trimmed via trimMessages(...) to fit contextTokens − sysTokens − responseBudget
- maxTokens: responseBudget, which caps the reply length
contextTokens, responseBudget, and baseURL come from /settings.
Defaults: 9000 context tokens / 512 response tokens. The header bar shows a
live <used> / <total> badge that turns amber as usage approaches the limit.
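The trimming arithmetic is worth seeing once. The sketch below is hypothetical: estimateTokens uses a rough 4-characters-per-token heuristic rather than a real tokenizer, and dropping the oldest turns first is an assumption about what trimMessages does.

```typescript
interface ChatTurn { role: "user" | "assistant"; content: string }

// Rough heuristic (about 4 characters per token); a real tokenizer will differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent turns that fit in contextTokens - sysTokens - responseBudget.
function trimMessagesSketch(
  turns: ChatTurn[],
  contextTokens: number,
  sysTokens: number,
  responseBudget: number
): ChatTurn[] {
  let remaining = contextTokens - sysTokens - responseBudget;
  const kept: ChatTurn[] = [];
  // Walk from newest to oldest and stop at the first turn that no longer fits.
  for (const turn of turns.slice().reverse()) {
    const cost = estimateTokens(turn.content);
    if (cost > remaining) break;
    remaining -= cost;
    kept.unshift(turn); // restore chronological order
  }
  return kept;
}
```

With the 9000 / 512 defaults and a system prompt of, say, 400 tokens, roughly 8088 tokens remain for history; the prompt grows with the number of visible actions, so that figure shrinks on busy pages.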
Producer 4: WebSocket (optional, unwired)
app/lib/command-ws.ts. A reconnecting WebSocket listener that accepts
three message shapes:
{ "id": "abc", "command": { "type": "click", "target": "save-button" } }
{ "id": "abc", "script": [ {"type":"navigate","path":"/x"}, … ] }
{ "id": "abc", "dsl": "navigate /x\nclick save-button" }
Replies with { id, ok: true } or { id, ok: false, error: "…" }.
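The three message shapes and the reply format map naturally onto discriminated unions. The following is a sketch under assumptions, not the code in command-ws.ts: handleSocketRequest and the SocketRunners interface are hypothetical, and the runners are synchronous here to keep the example short.

```typescript
type WsCommand = { type: string; [key: string]: unknown };

// The three accepted request shapes.
type SocketRequest =
  | { id: string; command: WsCommand }
  | { id: string; script: WsCommand[] }
  | { id: string; dsl: string };

type SocketReply = { id: string; ok: true } | { id: string; ok: false; error: string };

// Hypothetical hooks into the bus and the script runner.
interface SocketRunners {
  runCommand(cmd: WsCommand): void;
  runDsl(dsl: string): void;
}

function handleSocketRequest(req: SocketRequest, runners: SocketRunners): SocketReply {
  try {
    if ("command" in req) {
      runners.runCommand(req.command);
    } else if ("script" in req) {
      for (const cmd of req.script) runners.runCommand(cmd);
    } else {
      runners.runDsl(req.dsl);
    }
    return { id: req.id, ok: true };
  } catch (err) {
    // Any failure is reported back under the request's id.
    return { id: req.id, ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Echoing the request id in every reply lets the external driver match responses to requests even when several are in flight.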
Not wired up by default. To enable for a session:
import { connectCommandSocket } from "~/lib/command-ws"
const sock = connectCommandSocket("ws://localhost:9229/ui")
// ... later ...
sock.close()
The intended use is screen-share / observer / CI scenarios where an external process drives the UI. Origin checks and an opt-in toggle in /settings are sketched but not built — don't auto-connect in production.
Safety (sketch, not built)
When the UI gains destructive surfaces, mark them with
data-action-danger:
<Button data-action="delete-account" data-action-danger>Delete</Button>
The bus's beforeCommand hook can refuse danger-marked targets unless the
caller passes a confirmation token. This is not implemented yet — flag
and design when you have a real destructive action to gate.
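As a starting point for that design, the gate itself could be as small as the sketch below. Everything here is hypothetical, since the feature does not exist yet: allowDangerous, GateOptions, and the confirmToken option are invented for illustration, and how a token is issued and compared would need real design.

```typescript
// Structural subset of Element, so the sketch runs outside a browser.
interface GateTarget { hasAttribute(attr: string): boolean }

// Hypothetical per-dispatch options carrying the confirmation token.
interface GateOptions { confirmToken?: string }

// Returns true when the command may proceed.
function allowDangerous(target: GateTarget, opts: GateOptions, expectedToken: string): boolean {
  // Unmarked targets are never gated.
  if (!target.hasAttribute("data-action-danger")) return true;
  // Marked targets need a matching token.
  return opts.confirmToken !== undefined && opts.confirmToken === expectedToken;
}
```

A beforeCommand hook could call a check like this and reject the command when it returns false, which would keep LLM-emitted action blocks away from destructive controls by default.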
Files at a glance
| File | Role |
|---|---|
| app/lib/command-bus.ts | JSON layer, dispatch, handlers, vars, history, listActions, readState |
| app/lib/command-parser.ts | DSL ↔ JSON |
| app/lib/command-script.ts | Script runner: load /scripts/*.script, run with cursor speed |
| app/lib/virtual-cursor.ts | Visible cursor + ripple |
| app/lib/command-provider.tsx | React glue: registers navigate, mounts cursor, exposes window.* |
| app/lib/command-ws.ts | Optional WebSocket producer |
| app/lib/llm-tools.ts | buildSystemPrompt, extractActionBlocks, runActionBlocks, token utils |
| app/lib/llm-settings.ts | Persisted base URL / context / response cap |
| app/components/scripts-dialog.tsx | ⌘⇧P script runner UI |
| app/components/assistant/message-body.tsx | Markdown bubble + action-block pill |
| public/scripts/demo-*.script | Examples |
Quick recipes
Make a new component scriptable — add data-action="<id>". Done.
Test a flow — write a .script file with expect assertions:
navigate /settings
fill settings-base-url "http://localhost:1234/v1"
click settings-test
wait 1500
expect settings-test to_contain "models available"
Run it via the dialog or runScript("…").
Drive a custom widget — register a handler:
commandBus.register("highlight", async (cmd) => {
  /* ...your logic... */
})
Then dispatch { type: "highlight", target: "…" } from anywhere.
Send the LLM a different system prompt — buildSystemPrompt accepts a
preface override:
buildSystemPrompt({
  preface: "You are the support agent for Comfy Cloud's billing team.",
  path: window.location.pathname,
})