ai: per-config reasoning_effort + composer THINK chip

Two layers for thinking-mode control:

1. Per-config default (Settings → LLM)
   New "Reasoning effort" Select in the Add/Edit dialog with
   off/low/medium/high/max, plus a budget hint for each non-off
   option (~2k, ~8k, ~24k, ~64k thinking tokens). The saved row's
   meta line shows the level inline, so it is visible without
   opening the editor.

2. Per-message override (composer chip)
   New ReasoningChip next to the model picker. Clicking it cycles
   through the same five levels. When off, the chrome is hidden
   (muted "think" pill); when set, it uses the sodium-amber active
   style and shows the level label.

   The level is persisted to crema.ai.reasoning so a refresh keeps
   the operator's intent, and it is wiped together with the
   conversation on Clear.
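
   The click-to-cycle behaviour of the chip can be sketched as below.
   This is an illustration only: nextEffort is a hypothetical helper
   name, and wrapping from "max" back to "off" is an assumption about
   what "cycles" means here.

```typescript
// Sketch only: nextEffort is a hypothetical helper, not taken from this commit.
type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"

const REASONING_EFFORTS: ReasoningEffort[] = ["off", "low", "medium", "high", "max"]

// Return the level that follows `current` in REASONING_EFFORTS.
// Assumption: the cycle wraps, so "max" is followed by "off".
function nextEffort(current: ReasoningEffort): ReasoningEffort {
  const i = REASONING_EFFORTS.indexOf(current)
  return REASONING_EFFORTS[(i + 1) % REASONING_EFFORTS.length] ?? "off"
}
```

   Under that assumption, five clicks starting from "off" return the
   chip to "off".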

When sending, withReasoning() merges reasoning_effort into the request
body as a top-level field. The proxy forwards it untouched to OpenAI /
DeepSeek, which accept it as a native field, and translates it into
Anthropic's thinking block server-side.
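
A minimal sketch of both steps, under stated assumptions. The
ChatRequestBody shape, the toAnthropicThinking name, and the exact
token budgets are hypothetical; the budgets simply pair the "~2k /
~8k / ~24k / ~64k" hints with low/medium/high/max in order, which
the message does not confirm. The { type: "enabled", budget_tokens }
shape is Anthropic's documented thinking parameter; the proxy's
actual translation may differ.

```typescript
type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"

// Hypothetical request shape; the real body likely carries more fields.
interface ChatRequestBody {
  model: string
  messages: { role: string; content: string }[]
  reasoning_effort?: ReasoningEffort
}

// Merge the effort into the body as a top-level field.
// "off" and null return the body unchanged, so the field is never sent.
function withReasoning(
  body: ChatRequestBody,
  effort: ReasoningEffort | null,
): ChatRequestBody {
  if (effort === null || effort === "off") return body
  return { ...body, reasoning_effort: effort }
}

// Assumed exact budgets behind the approximate hints in the Select.
const ASSUMED_BUDGET_TOKENS: Record<Exclude<ReasoningEffort, "off">, number> = {
  low: 2048,
  medium: 8192,
  high: 24576,
  max: 65536,
}

interface AnthropicThinking {
  type: "enabled"
  budget_tokens: number
}

// Sketch of the server-side translation for Anthropic-backed configs.
// Returns undefined when no thinking block should be attached.
function toAnthropicThinking(
  effort: ReasoningEffort | null,
): AnthropicThinking | undefined {
  if (effort === null || effort === "off") return undefined
  return { type: "enabled", budget_tokens: ASSUMED_BUDGET_TOKENS[effort] }
}
```

Returning the original body for "off" keeps requests to providers
without reasoning support byte-identical to what they were before.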

reasoningEffortRef sidesteps a useCallback ordering issue:
regenerateLast and continueLast are declared before the state hook,
so they read the ref instead of a stale closure.
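
The difference between a captured value and a ref read can be shown
without React. The sketch below is an analogy, not the component
code: Ref stands in for the object returned by useRef, and makeSender,
sendStale, and sendFresh are hypothetical names.

```typescript
type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"

// Stand-in for a React ref: a stable object with a mutable `current`.
interface Ref<T> {
  current: T
}

function makeSender(initial: ReasoningEffort) {
  const effortRef: Ref<ReasoningEffort> = { current: initial }

  // Captures the value once, the way a memoized callback keeps
  // whatever value was in scope when it was created.
  const captured = initial
  const sendStale = (): ReasoningEffort => captured

  // Reads through the ref at call time, so it always sees the latest value.
  const sendFresh = (): ReasoningEffort => effortRef.current

  // Stand-in for the state setter keeping the ref in sync.
  const setEffort = (next: ReasoningEffort): void => {
    effortRef.current = next
  }

  return { sendStale, sendFresh, setEffort }
}

const sender = makeSender("off")
sender.setEffort("high")
console.log(sender.sendStale()) // prints "off"
console.log(sender.sendFresh()) // prints "high"
```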

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jules
2026-05-02 20:15:13 +10:00
parent 20494d1620
commit c379ebc37a
3 changed files with 179 additions and 4 deletions

@@ -12,6 +12,20 @@ import type { ArcadiaClient } from "@crema/arcadia-client"
export type LlmProvider = "openai" | "anthropic" | "deepseek" | "qwen" | "lmstudio"

/**
 * Reasoning effort. Sent verbatim to OpenAI / DeepSeek (which take
 * `reasoning_effort` natively). Translated server-side into Anthropic's
 * thinking block. `off` (or null) skips the field entirely.
 */
export type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"

export const REASONING_EFFORTS: ReasoningEffort[] = [
  "off",
  "low",
  "medium",
  "high",
  "max",
]

export interface LlmConfiguration {
  id: string
  tenant_id: string | null
@@ -23,6 +37,7 @@ export interface LlmConfiguration {
  input_cost_per_million: number | null
  output_cost_per_million: number | null
  enabled: boolean
  reasoning_effort: ReasoningEffort | null
  metadata: Record<string, unknown>
  inserted_at: string
  updated_at: string
@@ -39,6 +54,7 @@ export interface LlmConfigurationInput {
  input_cost_per_million?: number | null
  output_cost_per_million?: number | null
  enabled?: boolean
  reasoning_effort?: ReasoningEffort | null
  metadata?: Record<string, unknown>
}