jules 286e2daf95 feat: initial commit — crema-app-aifirst-template
Hybrid traditional + AI-first webapp scaffold. Sibling to crema-app-template,
adds the AI assistant surface, command bus, scripts dialog, and virtual
cursor.

What's pre-wired:
- 6 routes: Overview, Resources, Activity, Assistant, Library, Settings
- Collapsible rail + appbar + avatar dropdown shell (template code, not a lib)
- Mobile sheet at <md
- /assistant: streaming chat via @crema/llm-ui, mock fallback, model selector,
  token meter, retry probe, stop-while-streaming, persistent UI Control toggle
- /settings: editable LM Studio endpoint + context window + response cap, with
  test-connection button
- Markdown rendering for assistant replies; ```action``` blocks rendered as a
  small "Ran N actions" pill
- ⌘⇧P script runner dialog + Play icon in the appbar
- Two demo scripts in public/scripts/
- mightypix theme as default, scoped via <AppShell theme="mightypix">

Libs wired in tsconfig + app.css:
- @crema/action-bus (the bus, parser, runner, cursor, provider, ws, llm-bridge)
- @crema/llm-ui, @crema/chat-ui, @crema/aifirst-ui, @crema/notification-ui
- lib-theme-mightypix

Docs:
- README.md — pitch + quick start + structure
- docs/AI_FIRST.md — full system tour (data-action contract, bus, DSL, scripts,
  cursor, LLM integration)
- app/components/layout/THEME_CONTRACT.md — every CSS variable a theme must declare
- CLAUDE.md — orientation for an LLM working in the repo

Genericized from comfy-cloud (the original prototype):
- Brand defaults to "App" / Sparkles icon (override via app/lib/identity.ts)
- User defaults to a stub (swap useUser() for real auth)
- localStorage namespace is "crema.*" (was "comfy.*")

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 18:31:46 +10:00


AI-first system tour

The "anything can drive the UI" architecture — the contract every interactive element opts into, the command bus that dispatches actions, the DSL that makes scripts and LLM output ergonomic, and the LLM integration that wires the whole thing into a chat surface.

It's one system, designed end-to-end. Reading top-to-bottom is the fastest way to understand it.

Overview

┌─ producers ─────────────────────┐    ┌─ executor ──────────────────┐
│  ⌨ console  (window.commandBus) │    │  command-bus.ts             │
│  📜 scripts (.script DSL)       │ ─▶ │  • dispatch(cmd)             │
│  🤖 LLM     (```action blocks)  │    │  • handlers map              │
│  🔌 WebSocket (optional)        │    │  • vars + history            │
└─────────────────────────────────┘    │  • listActions / readState   │
                                        └─────────┬───────────────────┘
                                                  │
                              ┌───────────────────┴───────────────────┐
                              │                                       │
                              ▼                                       ▼
                     ┌──────────────┐                       ┌──────────────────┐
                     │ DOM actions  │                       │ replay layer     │
                     │ (click/fill/ │                       │ • virtual cursor │
                     │  navigate/…) │                       │ • ripple on click│
                     └──────────────┘                       └──────────────────┘

Every interactive element opts in by adding data-action="<id>". That's the entire contract. From there, the bus can find it, scripts can target it, and the LLM can drive it.

The [data-action] convention

Every interactive UI element gets a stable id:

<button data-action="sidebar-toggle"></button>
<input data-action="appbar-search" />
<NavLink data-action="nav-resources" to="/resources">Resources</NavLink>

Naming: lowercase, kebab-case, prefixed by the surface it lives in:

| Prefix | Surface |
| --- | --- |
| nav-* | Sidebar nav links (nav-overview, nav-resources, …) |
| nav-mobile-* | Mobile sheet versions of nav |
| appbar-* | Top bar controls (appbar-search, appbar-notifications) |
| avatar-* | Avatar dropdown items |
| <route>-* | Route-specific (assistant-clear, settings-save) |
| home-tile-*, run-script-*, etc. | Domain-grouped |
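
The naming rule can be checked mechanically, for example in a unit test over every id a page exposes. A minimal sketch, assuming a hypothetical helper isValidActionId that is not part of the template:

```typescript
// Hypothetical helper, not part of the template: checks the naming rule
// described above (lowercase, kebab-case, no empty segments).
const ACTION_ID_PATTERN = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/;

function isValidActionId(id: string): boolean {
  return ACTION_ID_PATTERN.test(id);
}
```

The pattern allows digits after the first character so that ids such as row-1 pass. It does not check the surface prefix, since the prefix list is open-ended.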

Why this works: the bus introspects the DOM at dispatch time — commandBus.listActions() returns every visible [data-action] in the page right now. New components are automatically scriptable as long as they tag their interactive slots. No central registry to maintain.

One gotcha: elements rendered in portals (closed dropdowns, sheets, dialogs) appear in the DOM but aren't visible. The listActions() filter uses Element.checkVisibility() + offsetParent + bbox checks to exclude those — so the LLM doesn't see actions it can't actually click.
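
The three checks named above can be sketched as a single predicate. This is an illustration only: the VisibilityProbe interface and isActuallyVisible are hypothetical names, and the real filter in command-bus.ts may combine the checks differently.

```typescript
// Structural stand-in for the parts of a DOM Element the filter reads,
// so the sketch can run without a browser.
interface VisibilityProbe {
  checkVisibility?: () => boolean;
  offsetParent: unknown;
  getBoundingClientRect: () => { width: number; height: number };
}

// Hypothetical sketch: an element counts as visible only if all checks pass.
function isActuallyVisible(el: VisibilityProbe): boolean {
  // checkVisibility() is skipped when the browser does not provide it.
  if (el.checkVisibility && !el.checkVisibility()) return false;
  // A null offsetParent usually means the element or an ancestor is display: none.
  if (el.offsetParent === null) return false;
  // A zero-size box cannot be clicked.
  const box = el.getBoundingClientRect();
  return box.width > 0 && box.height > 0;
}
```

One caveat for a production filter: offsetParent is also null for position: fixed elements in some browsers, so that check alone would hide fixed toolbars.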

The command bus

app/lib/command-bus.ts. Single dispatch point. Built-in handlers:

| Command | Args | Purpose |
| --- | --- | --- |
| navigate | path | React Router navigation |
| click | target | Find [data-action=target], scroll into view, click |
| fill | target, value | Set input value, fire input + change events |
| submit | target | Submit the form containing target |
| select | target, value | Set <select> value |
| wait | ms | Sleep |
| wait_for | target, timeout? | Poll for element existence (default 5s timeout) |
| scroll | target? | scrollIntoView, or page bottom if no target |
| read | target? | Return innerText (truncated to 4000 chars) |
| expect | target, op, value? | Assert: to_contain, to_be_visible, to_have_value |
| set | name, value | Set a variable for later interpolation |
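
To make the three expect operators concrete, here is a minimal sketch of how they might be evaluated. ExpectSubject and checkExpectation are hypothetical names; the sketch assumes to_contain is a case-sensitive substring match on the element's text, which this document does not confirm.

```typescript
type ExpectOp = "to_contain" | "to_be_visible" | "to_have_value";

// Hypothetical snapshot of the target element, taken before the check.
interface ExpectSubject {
  text: string;     // e.g. innerText
  value: string;    // e.g. an input's value
  visible: boolean;
}

// Returns true when the assertion holds. A real handler would throw on failure.
function checkExpectation(subject: ExpectSubject, op: ExpectOp, expected?: string): boolean {
  if (op === "to_be_visible") return subject.visible;
  // The other two operators need a value to compare against.
  if (expected === undefined) return false;
  return op === "to_contain"
    ? subject.text.indexOf(expected) !== -1
    : subject.value === expected;
}
```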

Variables

Every command can declare as: "name" — its return value gets stored under that name. Other commands reference it via $name interpolation.

commandBus.dispatch({ type: "read", target: "row-1", as: "id" })
commandBus.dispatch({ type: "click", target: "$id" })   // resolves at dispatch

In DSL form:

$id = read row-1
click $id
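
The store-then-interpolate flow can be sketched in a few lines. The names Vars, remember, and interpolate are hypothetical, and the sketch leaves unknown variables untouched; whether the real bus throws in that case is not stated here.

```typescript
type Vars = Record<string, string>;

// Stores a command's return value when the command declared as: "name".
function remember(vars: Vars, as: string | undefined, result: unknown): void {
  if (as !== undefined) vars[as] = String(result);
}

// Replaces each $name with its stored value at dispatch time.
// Unknown names are left as-is in this sketch.
function interpolate(text: string, vars: Vars): string {
  return text.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, (whole: string, varName: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, varName)) return whole;
    const value = vars[varName];
    return value === undefined ? whole : value;
  });
}
```

Resolving at dispatch rather than at parse time is what lets a later command use a value that an earlier command in the same script produced.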

Custom handlers

Register your own anywhere:

import { commandBus } from "~/lib/command-bus"

commandBus.register("ring", async (cmd) => {
  const el = document.querySelector(`[data-action="${cmd.target}"]`)
  el?.classList.add("animate-ring")
  setTimeout(() => el?.classList.remove("animate-ring"), 1200)
})

// dispatch as:  { type: "ring", target: "save-button" }

register() returns an unregister function for cleanup.
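
The register/unregister pattern, together with the handlers map and history, can be illustrated with a stripped-down bus. MiniBus is a hypothetical stand-in, not the template's implementation, and its handlers are synchronous to keep the sketch short (the real handlers are async).

```typescript
type Cmd = { type: string; target?: string; [key: string]: unknown };
type Handler = (cmd: Cmd) => unknown;

// Hypothetical miniature of the bus: a handlers map plus a dispatch log.
class MiniBus {
  private handlers = new Map<string, Handler>();
  readonly log: Array<{ cmd: Cmd; result?: unknown; error?: string }> = [];

  // Adds a handler and returns a function that removes it again.
  register(type: string, handler: Handler): () => void {
    this.handlers.set(type, handler);
    return () => {
      // Only delete if this handler is still the registered one.
      if (this.handlers.get(type) === handler) this.handlers.delete(type);
    };
  }

  // Looks up the handler by cmd.type, runs it, and records the outcome.
  dispatch(cmd: Cmd): unknown {
    const handler = this.handlers.get(cmd.type);
    if (!handler) {
      const error = `unknown command: ${cmd.type}`;
      this.log.push({ cmd, error });
      throw new Error(error);
    }
    const result = handler(cmd);
    this.log.push({ cmd, result });
    return result;
  }
}
```

In a React component, the returned function fits naturally as the cleanup of an effect, so a handler lives only as long as the widget that registered it.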

Console API

The provider exposes window.commandBus, window.runScript, and window.runScriptText. Open devtools and try:

commandBus.listActions().filter(a => a.id.startsWith("nav-"))
commandBus.readState()                      // visible page text
commandBus.history                          // every dispatch + result/error
runScript("demo-tour")                      // load + run /scripts/demo-tour.script
runScriptText('click sidebar-toggle\nwait 500\nclick nav-assistant')

The DSL

Plain text, one command per line. Parses to canonical JSON. Both layers share the same vocabulary; the DSL is just sugar.

# Comfy Cloud — short tour through the rail
# speed: 0.9

click sidebar-toggle
wait 500

click nav-resources
wait_for nav-resources
wait 500

click nav-assistant
wait 700

# Variables
$id = read row-1
click $id

# Assertions
fill appbar-search "acme"
expect resources-table to_contain "Acme"

Syntax:

  • One command per line.
  • Whitespace-tolerant. Quote values with spaces ("hello world"). Backslash-escape inside quotes.
  • # starts a comment. The # speed: <n> directive at the top sets cursor animation speed (default 1).
  • $name = <command> assigns the command's return value to a variable.
  • $name in args is interpolated at dispatch.
  • run other-script includes another script (resolved as /scripts/other-script.script).
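
The syntax rules above can be made concrete with a small line parser. This is a sketch under assumptions, not the parseLine exported by command-parser.ts: the names tokenize and sketchParseLine are hypothetical, only full-line comments are handled, and the assignment form requires spaces around =.

```typescript
interface ParsedLine {
  assign?: string; // variable name for "$name = <command>"
  type: string;    // command word, e.g. "click"
  args: string[];  // remaining tokens, quotes removed
}

// Splits on whitespace, keeps "quoted values" together, and honours
// backslash escapes inside quotes.
function tokenize(line: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inQuotes = false;
  let started = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === "\\" && i + 1 < line.length) {
        current += line.charAt(++i); // take the escaped character literally
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
      started = true; // so that "" yields an empty token
    } else if (ch === " " || ch === "\t") {
      if (started) tokens.push(current);
      current = "";
      started = false;
    } else {
      current += ch;
      started = true;
    }
  }
  if (started) tokens.push(current);
  return tokens;
}

// Returns null for blank lines and # comments.
function sketchParseLine(line: string): ParsedLine | null {
  const trimmed = line.trim();
  if (trimmed === "" || trimmed.charAt(0) === "#") return null;
  const tokens = tokenize(trimmed);
  let assign: string | undefined;
  const first = tokens[0];
  // "$name = <command>" form
  if (tokens.length >= 3 && first !== undefined && first.charAt(0) === "$" && tokens[1] === "=") {
    assign = first.slice(1);
    tokens.splice(0, 2);
  }
  const type = tokens.shift();
  if (type === undefined) return null;
  return assign === undefined ? { type, args: tokens } : { assign, type, args: tokens };
}
```

Mapping the positional args onto named fields (target, value, ms, and so on) depends on the command and is left out here.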

API:

import { parseScript, parseLine, stringifyScript } from "~/lib/command-parser"
import { runScript, runScriptText } from "~/lib/command-script"

const { options, commands } = parseScript(text)   // → JSON commands
const dsl = stringifyScript(commands, options)    // → text (round-trips)

await runScriptText(dsl)
await runScript("demo-tour")

Scripts

Scripts live in public/scripts/*.script. There are two ways to invoke them at runtime:

  1. Dialog: appbar Play icon, or ⌘⇧P keyboard shortcut. Lists known scripts + a paste-DSL textarea. See app/components/scripts-dialog.tsx.
  2. Console: runScript("demo-tour") or runScriptText("…").

Sub-scripts compose:

# in a parent script
run setup-fixtures
run actual-test
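
One way to picture composition is as include expansion: each run line is replaced by the lines of the script it names. The sketch below assumes that model, plus a cycle guard that this document does not mention; expandIncludes and ScriptLoader are hypothetical names, and the loader is synchronous for brevity, whereas the real runner fetches /scripts/<name>.script.

```typescript
// Hypothetical loader: maps a script name to its text.
type ScriptLoader = (scriptName: string) => string;

// Flattens a script into its lines, inlining every "run <name>" include.
function expandIncludes(scriptName: string, load: ScriptLoader, stack: string[] = []): string[] {
  // Assumption: a script that includes itself (directly or not) is an error.
  if (stack.indexOf(scriptName) !== -1) {
    throw new Error("include cycle: " + stack.concat(scriptName).join(" -> "));
  }
  const out: string[] = [];
  load(scriptName).split("\n").forEach((rawLine) => {
    const line = rawLine.trim();
    if (line === "") return;
    const m = /^run\s+(\S+)$/.exec(line);
    const child = m ? m[1] : undefined;
    if (child !== undefined) {
      expandIncludes(child, load, stack.concat(scriptName)).forEach((included) => {
        out.push(included);
      });
    } else {
      out.push(line);
    }
  });
  return out;
}
```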

Virtual cursor

app/lib/virtual-cursor.ts. A floating SVG cursor + ripple element appended to document.body. The bus's beforeCommand hook moves the cursor to the target element before each command and plays a ripple on click. Speed is controlled by the script's # speed: header (default 1; lower = slower animation).

The cursor is marked aria-hidden with role="presentation" so screen readers ignore it. To run silently (e.g. tests), pass { silent: true } to dispatch() or run().
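
The speed header can be read with a short scan of the leading comment lines. A minimal sketch, assuming the directive only counts before the first command and that animation durations are divided by the speed; readSpeedDirective and scaledDuration are hypothetical names, and the base duration is whatever the cursor uses internally.

```typescript
// Returns the value of a leading "# speed: <n>" directive, or 1 if absent.
function readSpeedDirective(scriptText: string): number {
  const lines = scriptText.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = (lines[i] || "").trim();
    if (line === "") continue;
    // Assumption: directives are only honoured before the first command.
    if (line.charAt(0) !== "#") break;
    const m = /^#\s*speed:\s*([0-9]*\.?[0-9]+)\s*$/.exec(line);
    if (m && m[1] !== undefined) {
      const speed = parseFloat(m[1]);
      if (speed > 0) return speed;
    }
  }
  return 1;
}

// Lower speed means a longer animation, matching "lower = slower".
function scaledDuration(baseMs: number, speed: number): number {
  return baseMs / speed;
}
```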

LLM integration

app/lib/llm-tools.ts. Two responsibilities:

1. Build the system prompt

buildSystemPrompt({ path, includeActions }) returns a string with:

  • A short preface ("You are the assistant in Comfy Cloud…")
  • A compact DSL reference (~120 tokens — kept tight for small context windows)
  • The current route
  • A live snapshot of every visible [data-action] on screen, formatted as - <id>: <label> (skipping invisible/portal items)

The Assistant route rebuilds this on every send when UI Control is on, so the model always sees the current page's actions.
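
The shape of the assembled prompt can be sketched as follows. This is not the template's buildSystemPrompt: sketchSystemPrompt, ActionInfo, and the "Current route:" heading are assumptions for illustration, and the DSL reference is passed in as an opaque string.

```typescript
interface ActionInfo {
  id: string;
  label: string;
}

// One line per visible action, in the "- <id>: <label>" format described above.
function formatActions(actions: ActionInfo[]): string {
  return actions.map((a) => `- ${a.id}: ${a.label}`).join("\n");
}

// Hypothetical assembly order: preface, DSL reference, route, action list.
function sketchSystemPrompt(opts: {
  preface: string;
  dslReference: string;
  path: string;
  actions: ActionInfo[];
}): string {
  return [
    opts.preface,
    opts.dslReference,
    `Current route: ${opts.path}`,
    "Available actions:",
    formatActions(opts.actions),
  ].join("\n\n");
}
```

Because the action list is passed in rather than read from the DOM, the same function can be unit-tested with a fixed list.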

2. Extract action blocks from streamed replies

The system prompt teaches the model to emit a fenced ```action block when the user asks it to do something. After each assistant turn ends, the Assistant route runs runActionBlocks(message.content) which:

  1. Regex-extracts every ```action ... ``` block.
  2. Feeds each through runScriptText → DSL parser → command bus.
  3. Returns { ran, errors } for the route to surface as a status pill ("Ran 2 actions").

The action-block fence renders as a small pill in the chat bubble (not raw text) — see app/components/assistant/message-body.tsx.
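
The extraction step can be sketched with a single regular expression. The function below is an illustration, not the extractActionBlocks in llm-tools.ts; the name sketchExtractActionBlocks is hypothetical, and the fence string is built at runtime only so the example can sit inside a fenced block itself.

```typescript
// Three backticks, assembled at runtime to avoid a literal fence in this example.
const FENCE = "`".repeat(3);

// Returns the body of every action-fenced block in a finished assistant reply.
function sketchExtractActionBlocks(content: string): string[] {
  const pattern = new RegExp(FENCE + "action[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE, "g");
  const blocks: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(content)) !== null) {
    const body = m[1];
    if (body !== undefined) blocks.push(body.trim());
  }
  return blocks;
}
```

Each returned string is plain DSL text, so it can be handed to runScriptText unchanged. Running after the turn ends, rather than mid-stream, avoids matching a block whose closing fence has not arrived yet.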

Format the model emits

I'll take you to resources.

```action
navigate /resources
wait_for nav-resources
```

Done — anything else?

The "rules" the system prompt teaches the model:

  • Only emit a block when asked to do something. Questions and chitchat reply normally.
  • Use only ids from the "Available actions" list.
  • A short sentence + the block. Optional follow-up after.
  • Never invent target ids.

Smaller models (≤7B) sometimes drift — the system prompt is explicit about each rule, but you'll see occasional invented ids or extra prose. Bigger models or Claude follow it near-perfectly.

Token budget

useChat is called with these reqExtras:

  • system, rebuilt on every send via buildSystemPrompt(...)
  • messages, pre-trimmed via trimMessages(...) to fit contextTokens - sysTokens - responseBudget
  • maxTokens: responseBudget, which caps the reply length

contextTokens, responseBudget, and baseURL come from /settings. The defaults are 9000 context tokens and a 512-token response budget. The header bar shows a live <used> / <total> badge that turns amber as usage approaches the limit.
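
The budget arithmetic and trimming can be sketched as below. Several things here are assumptions: that the message budget is contextTokens minus sysTokens minus responseBudget, that tokens are estimated at roughly four characters each, and that the oldest messages are dropped first. The template's trimMessages may differ on all three; sketchTrimMessages and messageBudget are hypothetical names.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough estimate (about 4 characters per token), not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Tokens left for chat history once the system prompt and reply are reserved.
function messageBudget(contextTokens: number, sysTokens: number, responseBudget: number): number {
  return Math.max(0, contextTokens - sysTokens - responseBudget);
}

// Keeps the most recent messages that fit within the budget.
function sketchTrimMessages(messages: ChatMessage[], budget: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk newest to oldest so the latest turns survive.
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg === undefined) continue;
    const cost = estimateTokens(msg.content);
    if (used + cost > budget) break;
    kept.unshift(msg);
    used += cost;
  }
  return kept;
}
```

With the defaults of 9000 and 512 and a system prompt of, say, 400 tokens, the history budget under this formula would be 9000 - 400 - 512 = 8088 tokens.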

Producer 4: WebSocket (optional, unwired)

app/lib/command-ws.ts. A reconnecting WebSocket listener that accepts three message shapes:

{ "id": "abc", "command": { "type": "click", "target": "save-button" } }
{ "id": "abc", "script":  [ {"type":"navigate","path":"/x"},  ] }
{ "id": "abc", "dsl":     "navigate /x\nclick save-button" }

Replies with { id, ok: true } or { id, ok: false, error: "…" }.
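
The three message shapes and the reply can be expressed as types plus a small handler. This is a sketch, not the code in command-ws.ts: handleSocketMessage and the Executor interface are hypothetical, and the executor is synchronous here, whereas real dispatch is async.

```typescript
type BusCommand = { type: string; [key: string]: unknown };

type SocketMessage =
  | { id: string; command: BusCommand }
  | { id: string; script: BusCommand[] }
  | { id: string; dsl: string };

type SocketReply =
  | { id: string; ok: true }
  | { id: string; ok: false; error: string };

// Hypothetical hooks into the bus and the DSL runner.
interface Executor {
  runCommand: (cmd: BusCommand) => void;
  runDsl: (dsl: string) => void;
}

// Parses one incoming frame, runs it, and builds the reply to send back.
function handleSocketMessage(raw: string, exec: Executor): SocketReply {
  let id = "";
  try {
    const msg = JSON.parse(raw) as SocketMessage;
    id = msg.id;
    if ("command" in msg) {
      exec.runCommand(msg.command);
    } else if ("script" in msg) {
      msg.script.forEach((cmd) => exec.runCommand(cmd));
    } else if ("dsl" in msg) {
      exec.runDsl(msg.dsl);
    } else {
      throw new Error("unrecognised message shape");
    }
    return { id, ok: true };
  } catch (err) {
    return { id, ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Echoing the id in every reply lets the external process match responses to requests when several are in flight.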

Not wired up by default. To enable for a session:

import { connectCommandSocket } from "~/lib/command-ws"
const sock = connectCommandSocket("ws://localhost:9229/ui")
// ... later ...
sock.close()

The intended use is screen-share / observer / CI scenarios where an external process drives the UI. Origin checks and an opt-in toggle in /settings are sketched but not built — don't auto-connect in production.

Safety (sketch, not built)

When the UI gains destructive surfaces, mark them with data-action-danger:

<Button data-action="delete-account" data-action-danger>Delete</Button>

The bus's beforeCommand hook can refuse danger-marked targets unless the caller passes a confirmation token. This is not implemented yet — flag and design when you have a real destructive action to gate.
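
As a starting point for that design, the gate could look roughly like this. Everything here is hypothetical: allowCommand, the confirm field on the command, and the way the expected token is supplied are placeholders for decisions that have not been made.

```typescript
// Structural stand-in for the one Element method the guard needs.
interface GuardTarget {
  hasAttribute: (attrName: string) => boolean;
}

interface GuardedCommand {
  type: string;
  target?: string;
  confirm?: string; // hypothetical confirmation token supplied by the caller
}

// Returns true when the command may proceed.
function allowCommand(cmd: GuardedCommand, el: GuardTarget | null, expectedToken: string): boolean {
  // Targets without the danger marker are never gated.
  if (el === null || !el.hasAttribute("data-action-danger")) return true;
  // Danger-marked targets need a matching token.
  return cmd.confirm === expectedToken;
}
```

A beforeCommand hook would call this and reject the dispatch on false. How the token is issued (for example, only after a user clicks a confirmation dialog) is the part that needs real design.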

Files at a glance

| File | Role |
| --- | --- |
| app/lib/command-bus.ts | JSON layer, dispatch, handlers, vars, history, listActions, readState |
| app/lib/command-parser.ts | DSL ↔ JSON |
| app/lib/command-script.ts | Script runner: load /scripts/*.script, run with cursor speed |
| app/lib/virtual-cursor.ts | Visible cursor + ripple |
| app/lib/command-provider.tsx | React glue: registers navigate, mounts cursor, exposes window.* |
| app/lib/command-ws.ts | Optional WebSocket producer |
| app/lib/llm-tools.ts | buildSystemPrompt, extractActionBlocks, runActionBlocks, token utils |
| app/lib/llm-settings.ts | Persisted base URL / context / response cap |
| app/components/scripts-dialog.tsx | ⌘⇧P script runner UI |
| app/components/assistant/message-body.tsx | Markdown bubble + action-block pill |
| public/scripts/demo-*.script | Examples |

Quick recipes

Make a new component scriptable — add data-action="<id>". Done.

Test a flow — write a .script file with expect assertions:

navigate /settings
fill settings-base-url "http://localhost:1234/v1"
click settings-test
wait 1500
expect settings-test to_contain "models available"

Run it via the dialog or runScript("…").

Drive a custom widget — register a handler:

commandBus.register("highlight", async (cmd) => {
  /* ...your logic... */
})

Then dispatch { type: "highlight", target: "…" } from anywhere.

Send the LLM a different system prompt. buildSystemPrompt accepts a preface override:

buildSystemPrompt({
  preface: "You are the support agent for Comfy Cloud's billing team.",
  path: window.location.pathname,
})