v2.4.1: A New Streaming Chat Mode for a Smoother, More Reliable Experience

Share
v2.4.1: A New Streaming Chat Mode for a Smoother, More Reliable Experience

There is a specific kind of small betrayal that breaks trust with software faster than almost anything else: you have a long, useful conversation with your AI employee, you refresh the page, and it is gone. v2.4.1 is the release that kills that bug for good, along with a handful of others that were quietly making chat feel less solid than it should. This one is about the chat experience finally being dependable on every instance, not just the dev box.

Your conversation now survives a refresh, on Claude and Codex

This is the headline. Chat-mode Claude and Codex run headless behind a bridge that streams tokens, thinking, and tool activity into the chat in real time. The live experience was fine. The problem was persistence: history was funneled into a shared side-store that worked on long-lived development containers but came back empty on a freshly provisioned instance. Refresh the page on a new deployment and your assistant's replies had vanished.

We fixed it by making persistence follow the engine instead of the transport. Each engine now reads from its own native transcript, the file it writes itself, which is the actual source of truth.

  • Claude reads its native transcript in the employee's history/chat/claude/ folder. That folder is always mounted, so it exists on any instance. Refresh restores the text, the tool calls, and the thinking.
  • Codex reads its native rollout in history/chat/codex/. Because Codex names each rollout by its own internal id, the bridge keeps a small pointer so the right rollout is found on reconnect, then replays the whole conversation.

The result is the same conversation after a reload, with no parallel bookkeeping drifting out of sync behind the scenes. Reopen the tab, refresh, redeploy: the history is there.

Codex stops answering twice

If you run Codex, you may have noticed replies showing up doubled. The cause was two independent paths finalizing the same turn at once: the streaming bridge and a legacy transcript poll-watcher. Both rendered the answer, so you saw it twice.

In streaming mode the poll-watcher is now switched off. The bridge is the single source for live output, and the native replay handles refresh. One answer, once.

Codex thinking sticks around

Codex's readable reasoning lives in its rollout's reasoning events. Previously the collapsible "Thought for Xs" box disappeared the moment you reloaded. The refresh path now extracts that reasoning and re-attaches it to the matching assistant turn, so the thinking box comes back after a reload, exactly like it does for Claude.

Stop means stop, not disconnect

The Stop button had a nasty side effect. In bridge mode it terminated the underlying CLI process, which ended the entire session. You meant to cut one runaway answer and instead you killed the whole conversation, and the next prompt got no response.

Stop now sends a proper interrupt to the running turn. The current answer is cut, the spinner clears, and the session stays alive and ready for your next message. It does the one thing the button always implied it did.

Connecting a Claude account, rebuilt from scratch

This one is worth dwelling on, because the old flow was genuinely fragile. Connecting or renewing a workspace Claude account used to spin up a throwaway container and scrape the OAuth login URL off a rendered terminal. That broke the moment the long URL wrapped across two lines, which it often did.

The new flow drives Claude's own setup-token process directly. The OAuth URL is captured cleanly and shown in the UI, you authorize and paste the returned code, and that is it. No transient container, no terminal parsing, no guessing where the URL ended.

There is an important detail in what gets stored. The result is a long-lived subscription token, kept encrypted at rest with AES-256-GCM in the same secret store as every other credential, and injected into employees as CLAUDE_CODE_OAUTH_TOKEN. This authenticates against your Claude Pro or Max subscription. It is not an ANTHROPIC_API_KEY, and it is not billed per prompt. If you are on a plan you already pay for, your employees use it directly rather than racking up metered API charges on the side.

The small print

Persistence is now per engine: Claude reads from history/chat/claude/, Codex from history/chat/codex/, and harness-based employees (Pi and custom LLMs) from history/chat/harness/, which is now created and mounted so it persists on fresh instances too. No database migration is required for any of this.

None of these are glamorous features. They are the kind of fixes that make the difference between software you trust and software you keep one eye on. A chat that survives a refresh, a Stop button that stops, and a sign-in that just works are the foundation everything else sits on. v2.4.1 makes that foundation solid.

Want to test the most advanced AI employees? Try it here: https://Geta.Team

Read more