Scheduler's Self-Rescue: From Reactive Patching to Proactive Defense
What Happened
A seemingly minor fix landed in the codebase today: the scheduler now detects whether a workspace still exists before dispatching an agent. If the worktree was physically removed (e.g., during branch cleanup or pruning), the system fails fast instead of letting the agent run a full cycle only to find nothing there.
Two changes:
- Added an early workspace existence check in executeIssueWithEmployee
- Marked employee_executor_workspace_missing as non-retryable
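The two changes above might look roughly like the following TypeScript sketch. Only executeIssueWithEmployee and the error code employee_executor_workspace_missing come from the actual fix. The result shape, the workspacePath field, the injected workspaceExists and runAgent callbacks, and the NON_RETRYABLE_CODES set are assumptions made for illustration, not the project's real API.

```typescript
// Hypothetical result shape; the real scheduler's types are not shown in this post.
type ExecutionResult =
  | { ok: true; output: string }
  | { ok: false; code: string; retryable: boolean };

// Assumed registry of error codes that must never be retried.
const NON_RETRYABLE_CODES = new Set<string>([
  "employee_executor_workspace_missing",
]);

interface Issue {
  id: string;
  workspacePath: string; // assumed field: where the worktree lives on disk
}

// Dependencies are injected so the sketch stays self-contained.
// In Node.js, workspaceExists could simply be fs.existsSync.
// runAgent is synchronous here for brevity; a real agent run would be async.
interface Deps {
  workspaceExists: (path: string) => boolean;
  runAgent: (issue: Issue) => string;
}

function executeIssueWithEmployee(issue: Issue, deps: Deps): ExecutionResult {
  // Early check: fail fast before the agent is dispatched at all.
  if (!deps.workspaceExists(issue.workspacePath)) {
    const code = "employee_executor_workspace_missing";
    return { ok: false, code, retryable: !NON_RETRYABLE_CODES.has(code) };
  }
  const output = deps.runAgent(issue);
  return { ok: true, output };
}
```

The point of checking before dispatch is that the expensive step, the agent run, never starts when the worktree is already gone.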
How Bad Was It
Not catastrophic, but annoying.
Issue-141 was stuck in exactly this loop: the agent gets dispatched, runs, fails, retries, and fails again. What gets wasted? Time. Confidence. The patience of everyone watching the retry loop go round and round.
It's like your mom sending you to grab something from the fridge, you walk to the kitchen, and the fridge is gone. Then she says "try again."
Now? Check first, and say no right away if the workspace is gone. Simple.
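To see why the non-retryable marking matters, here is a minimal sketch of a retry loop that honors it. The Attempt type, runWithRetries, and maxAttempts are hypothetical names, and the scheduler's actual retry policy is not shown in this post. The sketch also assumes the error was previously treated as retryable, which is what the loop described above implies.

```typescript
// Hypothetical outcome of one dispatch attempt.
type Attempt =
  | { ok: true }
  | { ok: false; code: string; retryable: boolean };

// Retries a task up to maxAttempts times, but stops immediately
// when a failure is flagged as non-retryable.
function runWithRetries(
  task: () => Attempt,
  maxAttempts: number,
): { attempts: number; last: Attempt } {
  let last: Attempt = { ok: false, code: "not_started", retryable: true };
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    last = task();
    if (last.ok || !last.retryable) break; // success or permanent failure: stop
  }
  return { attempts, last };
}

// A missing workspace will not reappear on its own, so retrying is pointless.
const missingWorkspace = (retryable: boolean) => (): Attempt => ({
  ok: false,
  code: "employee_executor_workspace_missing",
  retryable,
});

const before = runWithRetries(missingWorkspace(true), 3); // assumed old behavior: every attempt is burned
const after = runWithRetries(missingWorkspace(false), 3); // new behavior: one attempt, then stop
console.log(before.attempts, after.attempts); // 3 1
```

With the flag set, the loop gives up after the first attempt instead of exhausting its budget on a failure that cannot resolve itself.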
Reflection: The Reactive Defense Habit
This fix made me think: why did this bug survive so long?
The scheduler was built on an optimistic assumption: workspaces always exist, and if one doesn't, the runtime will report it. That's fine in development. But production is a different beast: branches get cleaned up, worktrees get pruned, things disappear.
We keep adding patches after things break, rather than designing for failure from day one.
That's technical debt. But also a mindset issue.
中文版
发生了什么
今天的代码仓库里躺着一个看似微小的修复:scheduler 现在能在 agent 出发前检测到工作区是否还存在。如果工作区被物理删除了(比如分支清理时 worktree 被 prune 掉),系统不再让 agent 白跑一圈才报错,而是直接 fail-fast,并标记为不可重试的错误。
两个改动:
- 在 executeIssueWithEmployee 里加了早期存在性检查
- 把 employee_executor_workspace_missing 标记为 non-retryable
这事有多严重
说大不大,说小不小。
issue-141 之前一直带着这个 bug 跑,agent 每次都被派出去,跑了一圈才发现工作区没了,然后重试,再跑一圈,再发现没了。消耗的是时间,磨损的是信心。
像极了你妈让你去冰箱拿东西,你走到厨房才发现冰箱被搬走了——然后你妈说"再去一次试试看"。
现在好了,出发前检查一下,不行就直说。
反思:被动防御的惯性
这次修复让我想到一个问题:为什么这个问题存在了这么久?
scheduler 之前的设计逻辑是:相信工作区一直在,有问题 runtime 再报。这种"乐观假设"在开发期没问题,但进入生产环境后,branch 会被清理,worktree 会被 prune,意外会发生。
我们总在问题发生后才加补丁,而不是在设计阶段就把"可能不存在"这个 case 考虑进去。
这是技术债,也是思维惰性。

