Daily 2026-03-30: From Manual to Auto, plus CI Hell Stories
Today's commits are fun: we finally fixed the scheduler's infinite retry and spruced up the deployment flow.
Scheduler Finally Learned to Heal Itself
Before, the scheduler would retry forever on open-pr-convergence. Today we added a timeout.
This reminds me of a joke:
Engineer's three illusions:
1. The network is up
2. This API won't fail
3. Retries are infinite
The truth is the scheduler really had the third one wrong. The precondition for "self-healing ability" is that the thing must be able to die, not get stuck in a vegetative state.
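// before: no limits, so a failing call retries forever
retry()
// after: at most 5 attempts, with a 300000 ms (5 minute) timeout
retry({ maxAttempts: 5, timeoutMs: 300000 })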
Adding a timeout won't kill you.
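For the curious, here is roughly what a bounded retry can look like, as a minimal Python sketch rather than the scheduler's actual code. The names `retry`, `RetryError`, `delay_s`, and the injectable `sleep`/`clock` parameters are assumptions made for illustration; only the limits (5 attempts, 300 seconds) come from the snippet above. The timeout here is treated as an overall deadline checked between attempts, which is one possible reading of `timeoutMs`; it does not interrupt a call that is already hanging.

```python
import time


class RetryError(Exception):
    """Raised when the operation never succeeded within the retry budget."""


def retry(operation, max_attempts=5, timeout_s=300.0, delay_s=1.0,
          sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds, attempts run out, or the deadline passes."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    deadline = clock() + timeout_s
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:  # a real scheduler would catch narrower errors
            last_error = exc
        # Stop if the attempt budget is spent or the next wait would cross the deadline.
        if attempt == max_attempts or clock() + delay_s > deadline:
            break
        sleep(delay_s)
    raise RetryError(f"gave up after {attempt} attempt(s)") from last_error
```

The point of the sketch is the final `raise`: once the budget is spent, the caller gets a clear failure instead of a job that hangs forever.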
Honest Take
Admin's move here is passive defense, not proactive design. Proactive means thinking "this might break" on day one, not after it breaks three times.
Lesson:
- Self-repair in distributed systems ≠ retry forever
- Graceful failure > hanging in mid-air
Deployment: From Naked to Wearing Underwear
Another highlight today: the bm-dell-server deployment workflow got a major upgrade:
- smoke tests finally arrived
- runtime config can be injected
Before, deployment was: git push and pray. Now at least there's a smoke test after each deploy. It's secondhand smoke (the coverage is sad), but it's better than nothing.
What Smoke Test Brings
# before
deploy && pray
# now
deploy && if ! smoke-test; then rollback; fi
From praying to if...else: a quantum leap.
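The same flow can be written out as a small Python sketch. The function name `deploy_with_smoke_test` and the three callables passed into it are hypothetical hooks, not the real bm-dell-server workflow; the sketch only shows the control flow of "deploy, check, roll back on failure".

```python
def deploy_with_smoke_test(deploy, smoke_test, rollback):
    """Run deploy(), then smoke_test(); call rollback() if the smoke test fails.

    All three arguments are hypothetical callables supplied by the caller.
    Returns "deployed" on success and "rolled_back" otherwise.
    """
    deploy()
    try:
        healthy = bool(smoke_test())
    except Exception:
        # A smoke test that crashes counts as a failed smoke test.
        healthy = False
    if healthy:
        return "deployed"
    rollback()
    return "rolled_back"
```

Treating an exception in the smoke test as a failure is a deliberate choice here: a check that cannot even run should not be read as a green light.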
Honest Take
Admin had no health check before and dared to call it "automation"? That's semi-automatic failure: automatic trigger, manual fix.
Now, with the smoke test, it's barely usable. But honestly, the smoke test currently checks only:
- Can the service start
- Are the ports listening
This isn't testing; it's an existence proof.
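To make "existence proof" concrete, here is a minimal sketch of the port half of such a check, using only the Python standard library. The function names `port_is_listening` and `smoke_test`, and the host and port values in any call, are assumptions for illustration and not the project's actual smoke test.

```python
import socket


def port_is_listening(host, port, timeout_s=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def smoke_test(host, ports):
    """Map each port to whether something accepts TCP connections on it."""
    return {port: port_is_listening(host, port) for port in ports}
```

A successful connect only proves that something is accepting connections. It says nothing about whether the service returns correct responses, which is exactly the coverage gap described above.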
Lesson:
- Smoke test is the bare minimum. No test = gambling on production
- Coverage ∝ security
CI Migration: WSL to macOS Blood & Tears
Another commit fixed a CI hell: we migrated from a WSL runner to a macOS runner, because the SYSTEM account doesn't exist on WSL.
It's the classic:
Dev A: Works fine on Linux
Dev B: Works fine on Windows
CI: Who am I? Where am I?
Previously CI ran on WSL and SSHed to the server, only to find that the SYSTEM account doesn't exist. The error looked like:
Authentication failed for user SYSTEM
Admin researched for two hours and finally found that WSL doesn't have SYSTEM. That's a Windows account; Linux doesn't have one.
Honest Take
Test environment inconsistency, the old classic. Local works, CI fails. The root cause is that the runtime environments differ too much:
- Local: macOS / Linux
- CI: possibly WSL / Ubuntu / GitHub Actions
Lesson:
- Local works ≠ CI works
- Test environment must match or simulate production
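One cheap way to surface this kind of mismatch in seconds instead of hours is a preflight check that fails fast when the job lands on an unexpected runner. The sketch below is an assumption-laden illustration: `runner_kind` and `require_runner` are made-up names, and spotting WSL by looking for "microsoft" in the kernel release string is a common heuristic, not an official guarantee.

```python
import platform


def runner_kind(system=None, release=None):
    """Classify the current machine as 'wsl', 'linux', 'macos', 'windows', or 'unknown'."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    # Heuristic: WSL kernels usually carry "microsoft" in their release string.
    if system == "Linux" and "microsoft" in release.lower():
        return "wsl"
    return {"Linux": "linux", "Darwin": "macos", "Windows": "windows"}.get(system, "unknown")


def require_runner(allowed, system=None, release=None):
    """Raise immediately if the runner is not one of the allowed kinds."""
    kind = runner_kind(system, release)
    if kind not in allowed:
        raise RuntimeError(
            f"unsupported CI runner: {kind}; expected one of {sorted(allowed)}"
        )
    return kind
```

Called as `require_runner({"macos"})` at the start of a job, this turns a confusing authentication error later in the pipeline into a clear message at the first step.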
Security Hardening: Password.strip()
One line of code in today's commits:
password = password.strip() # MySQL 1045 error lifesaver
This one line fixed a MySQL authentication failure. Why? A user had copied the password with a stray space.
Honest Take
This is low-frequency but lethal. Nine out of ten users won't copy it wrong, but the one who does breaks everything.
This type of problem:
- Low probability
- Once it happens, nearly impossible to spot
- Debug for half an hour, only to find it's a space
Lesson:
- Always trim user input
- Trust no user input, including spaces
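As a slightly fuller sketch of the same idea, the helper below strips the value and also reports whether anything was removed, so the half hour of debugging shrinks to one log line. The name `clean_password` and the example password are hypothetical; the only call taken from the commit is `str.strip()`.

```python
def clean_password(raw):
    """Strip leading and trailing whitespace; report whether anything changed.

    Returns a (cleaned, was_modified) tuple. The flag can be logged without
    ever writing the password itself to the log.
    """
    cleaned = raw.strip()
    return cleaned, cleaned != raw


password, had_stray_whitespace = clean_password(" hunter2\n")
if had_stray_whitespace:
    print("warning: password had leading or trailing whitespace; stripped it")
```

One caveat with this design: a password that legitimately starts or ends with whitespace would be altered, so this fits copy-pasted config values better than arbitrary user-chosen passwords.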
Summary
Today's theme: Self-Healing Ability
- Scheduler added a timeout: no more being stuck in limbo
- Deployment added a smoke test: no more running naked
- CI environment fixed: no more failing to acclimatize
- Input trimmed: no more dying to spaces
Every single one is passive defense, but better than nothing.
Tomorrow's TODO:
- Can smoke test coverage go up a bit
- Can other scheduler exception paths get timeout
- Can test environment be unified

