Bridging the semi-async valley of death
Faster Inference Won't Save You: Part 4
swyx coined the term "semi-async valley of death" for the latency range where coding agents spend most of their time. Under five seconds, the developer stays in flow — the agent feels instant. Over several minutes, the developer walks away and comes back when it's done — the agent is a background job. Between those two? The developer just sits there. Watching a spinner. Too slow to stay focused, too fast to context-switch to something else. The valley.
The industry's response has been to make agents faster. Faster models, parallel tool calls, RL-trained search, prefix caching. Parts 1, 2, and 3 of this series covered all of that — reducing turns, eliminating context rot, optimizing the plumbing. Push everything under the five-second flow window.
But some work can't be made fast. A full type check on a large TypeScript project takes 30 seconds. Running the test suite takes minutes. A refactor that touches 40 files needs time to think. You can optimize the agent all you want — the underlying compute has a floor.
The other escape route is up, not left. Make the slow work truly async. Let the agent outlive your terminal session, outlive your laptop's lid closing, outlive your attention span. When you come back, it's done.
The valley exists because agents are pinned to your machine
Your laptop has 8 cores, maybe 16. The coding agent needs cores for type checking, LSP queries, builds, test runs. So does your IDE. So does your browser. So does Slack. The agent competes with everything else on the machine, and the machine is a laptop you bought for portability, not compute density.
When the agent kicks off a build that saturates your CPU, everything slows down. Your editor lags. The agent itself slows down because it's waiting for its own type checker. You can't speed it up — you're out of cores. You can't work on something else locally — the machine is pegged. You're stuck watching.
And if you close the laptop, the agent dies. The process terminates. The WebSocket drops. Whatever the agent was doing is gone. Most agents can't outlive the session that started them. The agent is a child process of your terminal, and when the terminal closes, the child dies.
This is the shape of the valley. The agent is CPU-bound on your hardware, so it can't get fast enough for flow. The agent is session-bound to your machine, so it can't go truly async. It's stuck in between.
Cloud agents exist. Nobody uses them.
Fully async cloud agents have been around for a while. Devin, Factory, Codegen, various open-source runners. They spin up a remote machine, clone the repo, work for hours, open a PR when they're done. The agent outlives your session because it was never on your machine to begin with.
But they're a different product. You interact with them through a web dashboard, not your terminal. They have their own conversation threads, their own file browser, their own diff viewer. There's no connection to your local environment — your working branch, your uncommitted changes, the file you had open, the context you'd built up. You can't start exploring a problem locally and then say "actually, run this part in the cloud." You either use the local agent or the cloud agent. Two separate tools for two separate workflows.
The result: developers use local agents for interactive work and cloud agents for batch jobs they can fully specify upfront. The valley persists for interactive work because the bridge between local and cloud doesn't exist. You can't cross it mid-task.
Event sourcing makes the bridge invisible
What if the agent's state wasn't on your machine at all?
In our architecture, agent state is an append-only event log stored in a Durable Object on Cloudflare. Every action the agent takes — every LLM call, every tool execution, every file edit, every task status change — is an event in this log. The log lives in the cloud. Your local machine is just one possible host for running the agent loop against that log.
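As a rough sketch of what an append-only log like this might look like — the event names and fields here are illustrative assumptions, not the real protocol:

```typescript
// Hypothetical event shapes; the actual protocol will differ.
type AgentEventBody =
  | { type: "llm_call"; prompt: string; response: string }
  | { type: "tool_exec"; tool: string; result: string }
  | { type: "file_edit"; path: string; patch: string }
  | { type: "task_status"; status: "running" | "blocked" | "done" };

type AgentEvent = AgentEventBody & { seq: number };

// Append-only: events are only ever added, never mutated or deleted.
class EventLog {
  private events: AgentEvent[] = [];

  append(body: AgentEventBody): number {
    const seq = this.events.length;
    this.events.push({ ...body, seq });
    return seq;
  }

  // Any host can ask for events from a sequence number onward
  // and reconstruct state by replaying them.
  since(seq: number): AgentEvent[] {
    return this.events.slice(seq);
  }
}
```

The key property is that the log, not any host's memory, is the source of truth: a host that can read the log can become the agent.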
Migration looks like this: the local machine finishes whatever the agent is currently doing and disconnects from the Durable Object. A cloud sandbox connects to the same DO, replays the event log, and reconstructs the full workspace from commit events embedded in the log. No disk snapshots shipped over the wire — the file state is encoded in the events themselves. The agent picks up exactly where it left off.
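Workspace reconstruction can be sketched as a fold over the log — commit events act as full snapshots and edits apply on top. This is a simplified model with assumed event shapes, not the production format:

```typescript
// Assumed shapes: a "commit" carries a full file snapshot,
// a "file_edit" replaces one file's contents.
type WorkspaceEvent =
  | { seq: number; type: "commit"; files: Record<string, string> }
  | { seq: number; type: "file_edit"; path: string; content: string };

// A fresh host rebuilds the workspace purely from events —
// no disk snapshot is transferred.
function replay(events: WorkspaceEvent[]): Map<string, string> {
  const files = new Map<string, string>();
  for (const e of events) {
    if (e.type === "commit") {
      files.clear(); // snapshot resets the workspace
      for (const [path, content] of Object.entries(e.files)) {
        files.set(path, content);
      }
    } else {
      files.set(e.path, e.content); // edit applies on top
    }
  }
  return files;
}
```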
Your laptop can shut down. The cloud sandbox keeps working. The Durable Object keeps accumulating events — tool results, file edits, LLM responses. When you reconnect — from the same laptop, from your phone, from a different computer — you attach to the DO, replay the events you missed, and you're caught up. The agent might already be done.
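Catching up after a disconnect reduces to tracking the last sequence number a client has seen and pulling everything after it. A minimal sketch, assuming the DO exposes a "fetch events after seq" query (the name is hypothetical):

```typescript
type LoggedEvent = { seq: number; type: string };

// Client-side cursor over the Durable Object's event log.
class ReplayClient {
  private lastSeq = -1;
  constructor(private fetchSince: (afterSeq: number) => LoggedEvent[]) {}

  // On reconnect, pull only the events this client hasn't seen yet
  // and advance the cursor past them.
  catchUp(): LoggedEvent[] {
    const missed = this.fetchSince(this.lastSeq);
    if (missed.length > 0) {
      this.lastSeq = missed[missed.length - 1].seq;
    }
    return missed;
  }
}
```

Because the cursor is just a number, "reconnecting from your phone" and "reconnecting from the same laptop" are the same operation: replay from wherever you left off.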
The local machine isn't special. It's not the "home" of the agent. It's a node that was temporarily running the agent loop, and now a different node is running it. The event log is the state, and any node that can replay it can be the host.
Unlimited cores, zero contention
The cloud sandbox isn't a shared machine. Each agent gets its own sandbox with its own filesystem, its own CPU cores, its own LSP server, its own type checker. When you spawn a subtree agent to handle a refactor, it gets dedicated compute that doesn't compete with anything.
Spawn ten agents and they each get their own cores. Type checking in one sandbox doesn't slow down builds in another. None of them contend with your laptop. The subtree model — each agent works on a subtree of the codebase in an overlay filesystem — means they don't conflict on disk either. When an agent finishes its work, it merges back through the event protocol. No filesystem locks, no git conflicts from concurrent edits to the same branch.
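The fan-out pattern can be sketched as follows — the `Sandbox` interface here is an assumption for illustration, not the real API:

```typescript
// Hypothetical sandbox handle: each one has its own cores and
// overlay filesystem, so runs don't contend with each other.
interface Sandbox {
  run(task: string): Promise<{ subtree: string; patch: string }>;
}

// Fan one task out across per-subtree sandboxes in parallel;
// each result merges back through the event protocol as a patch.
async function fanOut(
  spawn: (subtree: string) => Sandbox,
  subtrees: string[],
  task: string,
): Promise<{ subtree: string; patch: string }[]> {
  return Promise.all(subtrees.map((s) => spawn(s).run(task)));
}
```

The wall-clock win comes from `Promise.all`: ten subtrees finish in roughly the time of the slowest one, not the sum of all ten.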
This is the vertical escape from the valley. Work that takes 30 seconds on your 8-core laptop takes 30 seconds on a cloud machine too, but your laptop is free. Run it on ten cloud machines in parallel and the wall-clock time for a large refactor drops by an order of magnitude — while you keep working locally on something else, or close your laptop entirely.
Bridging, not choosing
The point isn't that cloud agents are better than local agents. Local is better for interactive exploration, quick edits, anything where the feedback loop needs to be tight. The point is that the transition between local and cloud should be invisible. You shouldn't have to choose upfront. You shouldn't have to re-specify the task. You shouldn't have to copy files or export state.
Start locally. Explore the problem. When the work gets heavy — large refactor, full test suite, multi-file migration — the agent migrates to the cloud. It doesn't know it moved. The DO is the same. The event log is the same. The workspace was reconstructed from the same events.
Parts 1 through 3 made each turn fast. Part 4 makes it so the turns you can't make fast enough don't have to block you. They run somewhere else. Your laptop is free. And when you open it back up, the work is done.