ai: per-config reasoning_effort + composer THINK chip

Two layers for thinking-mode control:

1. Per-config default (Settings → LLM)
   New "Reasoning effort" Select in the Add/Edit dialog with
   off/low/medium/high/max + a budget hint per option (~2k, ~8k, ~24k,
   ~64k thinking tokens). Saved row meta line surfaces the level inline
   so it's visible without opening the editor.

2. Per-message override (composer chip)
   New ReasoningChip next to the model picker. Click cycles through the
   same five levels. Hidden chrome when off (muted "think" pill);
   sodium-amber active style with the level label when set. Persisted to
   crema.ai.reasoning so a refresh keeps the operator's intent, wiped
   together with the conversation on Clear.

When sending, withReasoning() merges reasoning_effort into the request
body as a top-level field. The proxy forwards it untouched to OpenAI /
DeepSeek (native field) and translates to Anthropic's thinking block
server-side.

reasoningEffortRef sidesteps a useCallback ordering issue:
regenerateLast/continueLast are declared before the state hook, so they
read the ref instead of a stale closure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:15:13 +10:00
parent 20494d1620
commit c379ebc37a
3 changed files with 179 additions and 4 deletions
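
The request-side behaviour described in the commit message can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the body of `withReasoning()` is not shown in this part of the diff, `toAnthropicThinking` is a hypothetical name for the proxy-side translation, and the pairing of each level with a token budget is an assumption read off the hint text (~2k, ~8k, ~24k, ~64k).

```typescript
type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"

// Assumed mapping of levels to thinking-token budgets, based on the hint
// text in the commit message. The real values may differ.
const ASSUMED_BUDGET_TOKENS: Record<Exclude<ReasoningEffort, "off">, number> = {
  low: 2000,
  medium: 8000,
  high: 24000,
  max: 64000,
}

// Sketch of withReasoning(): adds reasoning_effort as a top-level field,
// and leaves the body untouched when the level is off, null, or undefined.
function withReasoning(
  body: Record<string, unknown>,
  effort: ReasoningEffort | null | undefined,
): Record<string, unknown> {
  if (!effort || effort === "off") return body
  return { ...body, reasoning_effort: effort }
}

// Hypothetical proxy-side translation into an Anthropic-style thinking
// block ({ type: "enabled", budget_tokens }). Returns undefined when off.
function toAnthropicThinking(
  effort: ReasoningEffort | null | undefined,
): { type: "enabled"; budget_tokens: number } | undefined {
  if (!effort || effort === "off") return undefined
  return { type: "enabled", budget_tokens: ASSUMED_BUDGET_TOKENS[effort] }
}
```

Under these assumptions, `withReasoning({ model: "m" }, "high")` yields a body carrying `reasoning_effort: "high"`, while passing `"off"` or `null` returns the body without the field, matching the doc comment on `ReasoningEffort` in the diff.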
--- a/app/lib/arcadia/llm-configs.ts
+++ b/app/lib/arcadia/llm-configs.ts
@@ -12,6 +12,20 @@ import type { ArcadiaClient } from "@crema/arcadia-client"

 export type LlmProvider = "openai" | "anthropic" | "deepseek" | "qwen" | "lmstudio"

+/**
+ * Reasoning effort. Sent verbatim to OpenAI / DeepSeek (which take
+ * `reasoning_effort` natively). Translated server-side into Anthropic's
+ * thinking block. `off` (or null) skips the field entirely.
+ */
+export type ReasoningEffort = "off" | "low" | "medium" | "high" | "max"
+export const REASONING_EFFORTS: ReasoningEffort[] = [
+  "off",
+  "low",
+  "medium",
+  "high",
+  "max",
+]
+
 export interface LlmConfiguration {
  id: string
  tenant_id: string | null
@@ -23,6 +37,7 @@ export interface LlmConfiguration {
  input_cost_per_million: number | null
  output_cost_per_million: number | null
  enabled: boolean
+  reasoning_effort: ReasoningEffort | null
  metadata: Record<string, unknown>
  inserted_at: string
  updated_at: string
@@ -39,6 +54,7 @@ export interface LlmConfigurationInput {
  input_cost_per_million?: number | null
  output_cost_per_million?: number | null
  enabled?: boolean
+  reasoning_effort?: ReasoningEffort | null
  metadata?: Record<string, unknown>
 }