PR Babysitter
Keeps one pull request green: fixes failing CI and answers review comments until it's mergeable
Drive a single PR from 'open' to 'ready to merge' without you hand-holding each CI run and review round.
Done when. CI is green AND all review threads are resolved AND no new comments have arrived for one full cycle.
Cadence. Tight while active, longer once the PR goes quiet.
Autonomy. Semi-autonomous.
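The stop condition and cadence above can be sketched as two small functions. This is a minimal illustration, not a reference implementation: the `PRStatus` fields, the 300-second base interval (borrowed from the shell loop later in this page), the doubling rule, and the one-hour cap are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PRStatus:
    """Hypothetical snapshot of a PR taken at the start of a pass."""
    ci_green: bool
    open_threads: int
    new_comments: int  # comments that arrived since the previous pass


def is_done(status: PRStatus) -> bool:
    """All three conditions must hold: green CI, no open threads, one quiet cycle."""
    return status.ci_green and status.open_threads == 0 and status.new_comments == 0


def next_sleep_seconds(quiet_passes: int, tight: int = 300, longest: int = 3600) -> int:
    """Tight interval while active; double per consecutive quiet pass (assumed rule), capped."""
    return min(tight * (2 ** quiet_passes), longest)
```

Resetting `quiet_passes` to zero whenever a pass finds a red job or a new comment brings the loop back to the tight interval.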
How one iteration works
discover → plan → execute → verify → escalate
- 1. Discover
Read the PR: CI status, failing job logs, new review comments, merge conflicts.
- 2. Plan
Pick the single most important thing to fix this iteration (red CI > unresolved comment > conflict).
- 3. Execute
Make a minimal change addressing it; push; reply to / resolve the comment thread.
- 4. Verify
Re-read CI after the push and confirm the specific failure is gone, not just that a run started.
- 5. Escalate
If a failure needs a product decision or the fix isn't minimal, stop and ask rather than guess.
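The Plan step's priority order (red CI > unresolved comment > conflict) could be expressed roughly as follows. The `Snapshot` type and the action names are hypothetical, chosen only for illustration; how the snapshot gets filled in depends on your CI and code host.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Snapshot:
    """Hypothetical result of the Discover step."""
    failing_jobs: list[str] = field(default_factory=list)
    unresolved_threads: list[str] = field(default_factory=list)
    has_conflict: bool = False


def plan(snapshot: Snapshot) -> Optional[tuple[str, str]]:
    """Return the single (action, target) for this iteration, or None if nothing is left."""
    if snapshot.failing_jobs:            # red CI always wins
        return ("fix_ci", snapshot.failing_jobs[0])
    if snapshot.unresolved_threads:      # then review comments
        return ("address_comment", snapshot.unresolved_threads[0])
    if snapshot.has_conflict:            # then merge conflicts
        return ("resolve_conflict", "")
    return None
```

Returning exactly one action per call keeps each pass small, so the Verify step has a single, specific failure to check.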
The prompt
The tool-agnostic spec the loop runs each pass — copy it, then wire it to your tool below.
Tend this pull request until it's ready to merge. Each pass: if CI is red, pull the failing job log, diagnose, and push the smallest fix that makes that job pass. If there are new review comments, address each one with a minimal change and resolve the thread. If there's a merge conflict, resolve it. If a fix would change product behavior or you're unsure, stop and ask me. When CI is green and there are no open threads or new comments, say so in one line and stop. Never merge — I'll do that.
/loop check whether CI passed and address any review comments
Check the current branch's PR. If CI is red, pull the failing job log, diagnose, and push a minimal fix. If new review comments have arrived, address each one and resolve the thread. If a fix would change behavior or you're unsure, stop and ask. If everything is green and quiet, say so in one line and stop. Never merge.
until pr_mergeable; do agent -p "$(cat loop.md)"; sleep 300; done
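For readers who would rather drive the loop from a script, the shell loop above has roughly the following shape in Python. This is a sketch under assumptions: `is_mergeable` and `run_pass` are hypothetical callables you supply (standing in for `pr_mergeable` and the `agent -p` call), and the `max_passes` budget is an addition that the shell version does not have.

```python
import time
from typing import Callable


def run_loop(
    is_mergeable: Callable[[], bool],
    run_pass: Callable[[], None],
    interval_seconds: float = 300,
    max_passes: int = 50,  # assumed safety budget, not part of the original loop
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run passes until the PR is mergeable or the budget is spent; return passes used."""
    passes = 0
    while passes < max_passes and not is_mergeable():
        run_pass()               # one agent pass over the prompt
        passes += 1
        sleep(interval_seconds)  # wait for CI and reviewers before checking again
    return passes
```

Passing `sleep` as a parameter makes the loop easy to exercise in a test without waiting, and the pass budget gives it a hard stop if the PR never becomes mergeable.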
Memory contract
The PR itself is the state: commits = work done, resolved threads = handled comments, CI status = current goal distance.
Verification & guardrails
How it checks itself. After each push, re-reads the specific failing job to confirm it now passes; resolves a thread only after the requested change is in.
- Pushes to the PR branch only — never to main
- Does not merge; a human approves and merges
- Escalates instead of guessing when a fix would change product behavior
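Two of the checks above lend themselves to small guard functions. The sketch below assumes CI results arrive as a mapping from job name to a status string; the `"success"` value and the `protected` default are illustrative assumptions, not any particular CI system's vocabulary.

```python
def failure_is_fixed(job_name: str, runs_after_push: dict[str, str]) -> bool:
    """True only if the named job finished successfully on the new commit.

    A job that is missing or still running does not count as fixed.
    """
    return runs_after_push.get(job_name) == "success"


def check_push_allowed(target_branch: str, pr_branch: str,
                       protected: tuple[str, ...] = ("main",)) -> None:
    """Raise unless the push goes to the PR's own, non-protected branch."""
    if target_branch in protected:
        raise PermissionError(f"refusing to push to protected branch {target_branch!r}")
    if target_branch != pr_branch:
        raise PermissionError(f"refusing to push outside the PR branch {pr_branch!r}")
```

Checking the specific job name, rather than the overall run state, is what separates "the failure is gone" from "a run started".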
Failure modes
- Fights the same flaky test forever — cap retries and escalate
- Marks a comment resolved without actually addressing it when it skips re-reading the thread against the change it pushed
- Force-pushes over a teammate's commit — restrict to fast-forward / its own branch
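The retry cap for the flaky-test case might look like the sketch below. The `RetryBudget` class and the default of three attempts are assumptions for illustration; pick whatever limit suits your CI.

```python
from collections import Counter


class RetryBudget:
    """Counts fix attempts per failing job and signals when to escalate."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self.attempts: Counter[str] = Counter()

    def should_escalate(self, job_name: str) -> bool:
        """Record one more attempt at this job; True once the cap is exceeded."""
        self.attempts[job_name] += 1
        return self.attempts[job_name] > self.max_attempts
```

For the force-push case, one option is to have the loop use a plain `git push` with no `--force` flag: by default the remote rejects an update that is not a fast-forward, so a teammate's commit cannot be overwritten silently.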
Variations
- Overnight cloud run. Move to a cloud routine so it keeps driving the PR after you close the laptop; humans review the diff in the morning.
- Strict supervised. Drop autonomy to supervised: it proposes the fix as a draft commit and waits for your ok before pushing.
Example run
CI red: 'test_auth' failing on a missing fixture. Pushed fixture. CI green. Addressed review comment about naming in 2 files, resolved thread. No new comments. PR is mergeable — over to you.