Sessions & harnesses

This is the default way Cial runs the agent. Like everything in Cial, you can change it — a different model, a new harness, a custom workflow. See Extending Cial.

A session is one continuous conversation, and it is Cial's core loop. Each session is two durable layers: a chat thread (the ordered messages you see and search, private to you) and a harness session (the engine's own reasoning state, keyed by a stable id minted once and resumed every turn). Both persist to your instance, so the agent builds on earlier turns and you can close the tab and pick up days later exactly where you left off.

This page covers the turn loop, the two engines and their controls, connecting the account each one needs, and watching the background work a turn can spin off.

Sessions & turns

A turn is one prompt-and-response cycle. Your message rides a WebSocket to your server, which persists it, spawns the engine, and streams the engine's output back as normalized events — that's why you watch files read and commands run live instead of waiting for a final block.

Turn request flow

  • Composer (prompt + attachments): Your message plus any uploads; carries the session's harness/model/thinking/effort choices.
  • Server (persist → prepare → spawn): Saves your message first (durable before any work), resolves the engine + credential + enabled tools, then spawns the engine.
  • Live stream (text · tools · result): Each engine output line is normalized to one event and pushed to your browser over the socket.

Your message rides the socket to the server, which persists it, spawns the engine, and streams normalized events back.
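
The order of operations in this flow can be sketched roughly as follows. This is an illustration only, not Cial's code: runTurn, TurnDeps, and TurnEvent are hypothetical names, and the engine is simplified to a synchronous function that returns its output lines, where a real server would stream them asynchronously.

```typescript
// Hypothetical sketch of the persist → prepare → spawn → stream order.
// None of these names are Cial's real API.

type TurnEvent = { kind: "text" | "tool_use" | "tool_result" | "result"; payload: string };

interface TurnDeps {
  persist(sessionId: string, message: string): void; // durable write of the user message
  prepare(sessionId: string): { engine: string };    // resolve engine, credential, tools
  spawn(engine: string, message: string): string[];  // raw engine output lines (simplified)
  normalize(rawLine: string): TurnEvent;             // one raw line becomes one event
}

function runTurn(
  deps: TurnDeps,
  sessionId: string,
  message: string,
  push: (event: TurnEvent) => void,
): void {
  deps.persist(sessionId, message);            // 1. save first, before any work
  const { engine } = deps.prepare(sessionId);  // 2. resolve what the turn needs
  for (const line of deps.spawn(engine, message)) { // 3. spawn the engine
    push(deps.normalize(line));                // 4. push each normalized event
  }
}
```

Persisting before anything else is the point of the ordering: if preparation or the spawn fails, the message is already part of the thread.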

Two properties of how the engine is launched matter in practice. It runs detached, with output streamed by tailing a per-turn log rather than held in memory — so an in-flight turn survives an in-place restart (e.g. a self-edit deploy) and re-attaches instead of dying. And because the streaming state is saved as it goes, a mid-turn page refresh reconnects you to live work, partial text and all, instead of a blank screen.
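
Why tailing a log makes re-attachment possible can be shown with a small in-memory model. This is a sketch under stated assumptions: TurnLog and Tailer are hypothetical names, and an array stands in for the per-turn log file. The idea it illustrates is that output lives outside the reader, so a fresh reader can replay from the start and then keep following.

```typescript
// Hypothetical model: the turn's output lives in an append-only log,
// not in the memory of whoever is reading it.

class TurnLog {
  private lines: string[] = [];

  // The detached engine appends output as it produces it.
  append(line: string): void {
    this.lines.push(line);
  }

  // Return every line at or after `offset`, plus the offset to resume from.
  readFrom(offset: number): { lines: string[]; next: number } {
    return { lines: this.lines.slice(offset), next: this.lines.length };
  }
}

class Tailer {
  private offset = 0;
  private log: TurnLog;

  constructor(log: TurnLog) {
    this.log = log;
  }

  // Poll once: collect whatever was appended since the last poll.
  poll(): string[] {
    const { lines, next } = this.log.readFrom(this.offset);
    this.offset = next;
    return lines;
  }
}
```

A reader created after a restart or page refresh starts at offset 0, so its first poll returns the partial output produced so far, and later polls return only new lines.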

A session runs one turn at a time, enforced server-side: a session that is already running refuses a second concurrent turn. A session is always either idle (finished, awaiting you) or running (streaming), and its state is visible in both the chat view and the sidebar.

Session lifecycle

  • Idle (awaiting your prompt): The resting state, and the only one a new turn can begin from.
  • Running (streaming text · tools · thinking): Actively working your latest message; shows live activity and an elapsed timer.
  • Result / cancel (back to idle): Turn end or a cancel returns the session to idle, ready for the next prompt.

One turn at a time. Sending moves the session to running; the result, or a cancel, returns it to idle.
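
The lifecycle amounts to a two-state machine with a guard on entry. A minimal sketch, assuming hypothetical names (Session, startTurn, endTurn) rather than Cial's actual server code:

```typescript
// Hypothetical two-state session guard: idle <-> running.
type SessionState = "idle" | "running";

class Session {
  state: SessionState = "idle";

  // Begin a turn. Returns false (refused) if one is already in flight.
  startTurn(): boolean {
    if (this.state === "running") return false;
    this.state = "running";
    return true;
  }

  // A finished result and a cancel both land back on idle.
  endTurn(_reason: "result" | "cancel"): void {
    this.state = "idle";
  }
}
```

Because the check lives on the server side of this guard, a second browser tab or a double-click cannot start an overlapping turn.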

Day-to-day mechanics:

  • Create — a session is created the moment you send the first message; nothing to configure in advance. The thread is written, the harness id is minted, and it appears in the sidebar.
  • Cancel — cancelling kills the engine's entire process group (the agent and anything it spawned stop at once) and returns the session to idle. It halts only the current turn; everything already streamed stays in your history. Cancel early — re-prompting beats waiting out a turn gone off course.
  • Resume — pick any session; the server replays its thread and your next turn resumes the same harness id, so the agent still has its context. Works across reloads and across days.
  • Rename / delete — both from a session's row actions in the sidebar. The name is just a label; deleting removes the history and can't be undone.

Keep one topic per session. The engine's context is anchored to a single harness session, so mixing unrelated work dilutes that context and clutters your history. For something new, start a new session — that's also how you switch engine or model, since both are bound at creation (see below).

A long turn doesn't need watching: Cial can send a Web Push notification when a turn finishes or the agent needs input, enabled per device under Settings → Notifications. See Notifications.

Harnesses & the model picker

Every turn runs on a harness — the agent engine. Cial ships two, selected per session via the composer's harness pill:

Harness | Model | How it's driven
Claude Code (default) | Anthropic Claude | non-interactive claude emitting streaming JSON
Kimi Code | Moonshot K2.x | JSON-RPC --wire mode, driven by a small bridge process

They aren't hardcoded branches: each is a provider behind one interface, resolved per turn through a registry. The spawn pipeline is engine-agnostic, and both CLIs normalize into one HarnessEvent stream — init, text, thinking, tool_use/tool_result, usage, result (plus background/approval/rate-limit events from Claude today). That single vocabulary is why the UI doesn't care which engine ran, and why a Kimi turn and a Claude turn look identical.

One pipeline, two engines

  • getProvider(session) (registry → HarnessProvider): The runtime never names an engine; it resolves the session's provider and uses it uniformly. Adding an engine is additive: a provider module plus a registry row.
  • Claude Code (claude -p · stream-json): Streams JSON over stdout; each line maps to one normalized event.
  • Kimi Code (--wire · bridge pump): A bridge owns the JSON-RPC wire (init → prompt → auto-approve) and forwards raw lines to the per-turn log a parser tails.
  • Shared event stream (one HarnessEvent vocabulary): Both engines collapse to the same events, so the UI renders either identically.

The session's provider is resolved from a registry; the spawn, env builder, and event stream are shared.
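
The registry-plus-normalization idea can be sketched as below. Only getProvider, HarnessProvider, HarnessEvent, and the event names come from the text; the interface shape, the parseLine method, and both raw line formats are invented for illustration and are not the real Claude Code or Kimi wire formats.

```typescript
// Hypothetical shapes. A subset of the event vocabulary is modelled here.
type HarnessEvent =
  | { type: "init"; model: string }
  | { type: "text"; text: string }
  | { type: "result"; ok: boolean };

interface HarnessProvider {
  id: string;
  // Turn one raw engine output line into a normalized event (null = skip).
  parseLine(raw: string): HarnessEvent | null;
}

// Made-up line format standing in for one engine's JSON output.
const claudeProvider: HarnessProvider = {
  id: "claude",
  parseLine(raw) {
    const msg = JSON.parse(raw) as { kind: string; body?: string };
    if (msg.kind === "assistant") return { type: "text", text: msg.body ?? "" };
    if (msg.kind === "done") return { type: "result", ok: true };
    return null;
  },
};

// A different made-up line format standing in for the other engine.
const kimiProvider: HarnessProvider = {
  id: "kimi",
  parseLine(raw) {
    const msg = JSON.parse(raw) as { method: string; params?: { delta?: string } };
    if (msg.method === "delta") return { type: "text", text: msg.params?.delta ?? "" };
    if (msg.method === "finished") return { type: "result", ok: true };
    return null;
  },
};

// Adding an engine means adding one provider and one row here.
const registry = new Map<string, HarnessProvider>([
  [claudeProvider.id, claudeProvider],
  [kimiProvider.id, kimiProvider],
]);

function getProvider(session: { harness: string }): HarnessProvider {
  const provider = registry.get(session.harness);
  if (!provider) throw new Error(`unknown harness: ${session.harness}`);
  return provider;
}
```

Callers only ever touch getProvider and parseLine, so two differently shaped raw lines end up as the same event object by the time the UI sees them.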

Model picker. Within a harness, the composer's model picker sets the exact model; leave it alone for the engine's default. Claude Code starts on a capable default and lets you switch Claude models; Kimi offers a few K2 variants including a high-speed option. The chosen model passes straight through at spawn, and the init event reports the model the turn actually started on (Kimi names are config keys the provider maps — you never type a model id).

Thinking & effort. Two more composer dials feed the spawn:

  • Thinking: internal reasoning before answering. On by default, and most useful on hard work; turn it off for a faster reply on simple asks.
  • Effort — how much work the agent puts into one turn. Higher = more reasoning time for involved tasks; lower keeps things snappy. Omitted when unset (engine default).
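
How these dials might feed the spawn can be sketched as a small options builder. The names (ComposerChoices, buildSpawnOptions) and the three effort levels are assumptions for illustration; the behaviour shown is the one described above, where an unset model or effort is left out entirely so the engine's default applies.

```typescript
// Hypothetical shape of the per-session composer choices.
interface ComposerChoices {
  model?: string;                      // unset = engine default
  thinking: boolean;                   // on by default
  effort?: "low" | "medium" | "high";  // assumed levels; unset = engine default
}

function buildSpawnOptions(choices: ComposerChoices): Record<string, string | boolean> {
  const options: Record<string, string | boolean> = { thinking: choices.thinking };
  // Omit unset dials instead of sending a placeholder value.
  if (choices.model !== undefined) options["model"] = choices.model;
  if (choices.effort !== undefined) options["effort"] = choices.effort;
  return options;
}
```

Omitting the key, rather than passing an empty value, keeps the decision with the engine and avoids pinning a default that may change.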

Engine options. Each provider exposes its own composer toggles, so you only see what the connected engine supports. Two notable examples: Claude's ultracode prepends a keyword that runs a dynamic agentic workflow, which is kept alive well past the lead agent's reply so a multi-hour Workflow isn't cut off (see Background & multi-agent tasks); Kimi offers plan and AFK modes (planning, and away-from-keyboard auto-approve).

Engine, model, thinking, and options are bound when a session begins — switching mid-conversation would orphan the harness session holding the agent's context. To try a different harness or model, start a fresh session. Earlier sessions keep their own settings.

To compare engines head-to-head, run the same prompt in two sessions, one per harness. Both normalize to the same stream, so any difference you see comes from the engine and its model rather than from how the turn is rendered.

Connecting your AI account

Each engine acts on your behalf, so each needs its own credential, managed under Settings → Harnesses. A connected credential is what makes its engine appear in the harness pill. Credentials are never baked into a request — they're injected into the agent's environment at spawn time, through a shared allowlisted env builder (clean PATH/HOME, Cial Agent git identity, no ambient leakage) every provider reuses.

Credential injection at spawn

  • Stored credential (Settings → Harnesses): Anthropic OAuth or API key; Kimi API key. Connect only the engines you'll use.
  • Resolved per turn (picked for the session's provider): Whichever credential the session's engine needs is selected at spawn time.
  • Injected into the engine (Anthropic → env credential · Kimi → KIMI_API_KEY + per-turn config): Written into the allowlisted spawn env so the engine can authenticate for the turn.

A stored credential is resolved for the session's provider and injected into a clean, allowlisted spawn environment.

  • Claude (Anthropic). Settings → Harnesses → Anthropic, two mutually exclusive ways: Sign in (OAuth), a PKCE paste-code flow, same as claude auth login; or API key, where you paste a key. Cial injects whichever you connected.
  • Kimi (Moonshot). Add your Kimi API key in the same area. It's written into a self-contained per-turn config and set as KIMI_API_KEY. Once saved, Kimi appears in the harness pill.
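
An allowlisted env builder of the kind described above might look like the sketch below. It is an assumption-heavy illustration: buildSpawnEnv, StoredCredential, the allowlist contents, the PATH and HOME values, and the use of ANTHROPIC_API_KEY are all hypothetical choices (the text names only KIMI_API_KEY), and the OAuth path and Kimi's per-turn config file are left out.

```typescript
// Hypothetical credential shape; only the API-key case is modelled.
type StoredCredential =
  | { provider: "anthropic"; apiKey: string }
  | { provider: "kimi"; apiKey: string };

// Assumed allowlist: only these ambient variables are copied through.
const ENV_ALLOWLIST = ["LANG", "TZ"];

function buildSpawnEnv(
  ambient: Record<string, string | undefined>,
  credential: StoredCredential,
): Record<string, string> {
  // Start from fixed values instead of inheriting the server's environment.
  const env: Record<string, string> = {
    PATH: "/usr/local/bin:/usr/bin:/bin", // assumed clean PATH
    HOME: "/home/agent",                  // assumed home directory
    GIT_AUTHOR_NAME: "Cial Agent",
    GIT_COMMITTER_NAME: "Cial Agent",
  };

  for (const name of ENV_ALLOWLIST) {
    const value = ambient[name];
    if (value !== undefined) env[name] = value;
  }

  // Inject only the credential the session's engine needs.
  if (credential.provider === "anthropic") {
    env["ANTHROPIC_API_KEY"] = credential.apiKey; // variable name is an assumption
  } else {
    env["KIMI_API_KEY"] = credential.apiKey;
  }
  return env;
}
```

Building the environment up from an empty base, instead of deleting known-sensitive variables from the ambient one, means anything not explicitly allowed never reaches the engine.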

The Settings → Harnesses screen shows each engine's connection status — check it first if a session won't run on a given engine. Disconnect (or remove the key) there to stop using an account; new sessions on that engine can't run until it's reconnected.

Background & multi-agent tasks

A single turn can spin off work that keeps running on its own. The background activity tray follows that detached work even after the launching message is done:

  • Subagents — helper agents the main agent runs in parallel.
  • Workflows — multi-step jobs it orchestrates (e.g. an ultracode Workflow).
  • Background shells — long-running commands it leaves running.

Each session has its own expandable tray, appearing under the session when there's background work. Tasks show Running or Completed and update on their own — because tracking is turn-independent, a job still running when the reply ends flips to completed without you doing anything, and the list survives a refresh or restart. Open a task for its live detail view.
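
Turn-independent tracking can be modelled as a task list that is updated by task events alone and can be serialized and restored. The sketch below uses hypothetical names (BackgroundTray, started, completed, snapshot, restore) and is not how Cial stores this state.

```typescript
type TaskKind = "subagent" | "workflow" | "shell";
type TaskStatus = "running" | "completed";

interface BackgroundTask {
  id: string;
  kind: TaskKind;
  status: TaskStatus;
}

// Hypothetical per-session tray. Nothing here refers to the turn that
// launched a task, so a task can complete after that turn has ended.
class BackgroundTray {
  private tasks = new Map<string, BackgroundTask>();

  started(id: string, kind: TaskKind): void {
    this.tasks.set(id, { id, kind, status: "running" });
  }

  completed(id: string): void {
    const task = this.tasks.get(id);
    if (task) task.status = "completed";
  }

  list(): BackgroundTask[] {
    return Array.from(this.tasks.values());
  }

  // Serialize the list so it can be reloaded after a refresh or restart.
  snapshot(): string {
    return JSON.stringify(this.list());
  }

  static restore(json: string): BackgroundTray {
    const tray = new BackgroundTray();
    for (const task of JSON.parse(json) as BackgroundTask[]) {
      tray.tasks.set(task.id, task);
    }
    return tray;
  }
}
```

A tray restored from a snapshot still accepts completion events for tasks that were running when the snapshot was taken, which is the behaviour the Running to Completed flip relies on.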

The tray fills in only for Claude Code sessions, since Claude Code is the engine that reports this telemetry today; it stays empty for Kimi. The tray is on by default; toggle background-activity under Settings → Discovery to hide or show it. Hiding only changes the view, and the agent's background work runs the same either way. Pair the tray with push notifications to be pinged when work finishes.