Ralph Build Loop

The canonical Ralph: feed the agent one spec file over and over until the whole thing is built and every checklist item is checked off.

L2 · Verify-until-done · Ralph · Medium risk · Semi-autonomous · Run until goal · production

What it does

Build out a single, well-specified body of work that just needs many passes, without you re-prompting between each one.

Stops when

Every item in the spec's checklist is implemented, tested, and checked off.

Runs

back-to-back passes until the checklist is empty · Semi-autonomous

How one iteration works

discover → plan → execute → verify → escalate

1. Discover. Read the spec/checklist file and find the highest-priority unchecked item.
2. Plan. Decide the smallest next step that completes that one item.
3. Execute. Implement it, write/extend its test, and check the item off in the spec file.
4. Verify. Run the test suite; if the new test (and the rest) pass, keep the checkmark; otherwise revert and leave the item open with a note.
5. Escalate. If an item is underspecified or blocked, write a question into the spec file and move to the next item.
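
The Discover step can be sketched in a few lines of shell. This is a minimal sketch, not the loop's actual implementation: it assumes checklist items are markdown checkboxes (`- [ ] text`) and that priority is simply top-to-bottom order in the file. The function name `next_unchecked` is hypothetical.

```shell
# Print the text of the first unchecked item in a spec file.
# Assumption: items look like "- [ ] text" and the topmost open item
# is the highest-priority one. Returns 1 when nothing is left open.
next_unchecked() {
  spec_file="$1"
  line=$(grep '^[[:space:]]*- \[ ] ' "$spec_file" | head -n 1) || true
  [ -n "$line" ] || return 1
  # Strip the "- [ ] " marker so only the item text is printed.
  printf '%s\n' "$line" | sed 's/^[[:space:]]*- \[ ] //'
}
```

Called as `next_unchecked SPEC.md`, it prints one item per call, which keeps each pass scoped to a single checklist entry.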

The prompt

The tool-agnostic prompt the loop runs each pass: copy it, then wire it to your tool below.

Read the spec file and its checklist. Take the single highest-priority unchecked item. Implement the smallest change that completes it and write or extend a test for it. Run the full test suite. If everything passes, check the item off in the spec file. If it fails, revert your change and leave the item unchecked with a one-line note on why. If an item is unclear or blocked, write your question inline in the spec and move on. Do not weaken existing tests to make them pass. Repeat until every item is checked. When the checklist is empty, stop and summarize what was built.

Generic

while grep -q '\[ \]' SPEC.md; do agent -p "$(cat ralph-prompt.md)"; done

ralph-prompt.md

Read SPEC.md. Take the highest-priority unchecked item. Implement the smallest change that completes it + a test. Run the full suite. If green, check the item off in SPEC.md; if red, revert and leave a note. If unclear, write the question inline and move on. Never weaken tests to pass.

Claude Code

/loop work the next unchecked item in SPEC.md until the checklist is empty

Memory contract

The spec/checklist file IS the memory: unchecked items = remaining work, checkmarks = done, inline notes = blockers/questions. Re-read fresh each pass.
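
Because all state lives in the spec file, progress can be read back from it without asking the agent. The sketch below assumes `- [ ]` marks open items, `- [x]` marks done items, and inline questions are written as lines starting with `> Q:`. That note format and the name `spec_status` are hypothetical; the source does not define them.

```shell
# Summarize loop state from the spec file alone.
# Assumptions: "- [ ]" = open, "- [x]" = done, "> Q:" = an inline question.
spec_status() {
  spec_file="$1"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  open=$(grep -c '^[[:space:]]*- \[ ] ' "$spec_file" || true)
  done_count=$(grep -c '^[[:space:]]*- \[[xX]] ' "$spec_file" || true)
  questions=$(grep -c '^[[:space:]]*> Q:' "$spec_file" || true)
  printf 'open=%s done=%s questions=%s\n' "$open" "$done_count" "$questions"
}
```

Running `spec_status SPEC.md` between passes gives a one-line view of remaining work, finished work, and blockers.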

Verification & guardrails

How it checks itself. The test suite is the gate every pass: an item is 'done' only when its test and the full suite pass; failures revert the checkmark.

One checklist item per pass — keeps changes bounded and reviewable
A checkmark only stays if tests pass; otherwise the change is reverted
Blocked/underspecified items become written questions, not guesses
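
The gate can also be enforced outside the agent, so a red suite reverts the pass even if the agent forgets to. This is a sketch under stated assumptions: `TEST_CMD` and `REVERT_CMD` are hypothetical placeholders, and the defaults assume a Makefile `test` target and a git repository in which every green pass has been committed.

```shell
# Gate for one pass: run the suite; on failure, undo the pass.
# TEST_CMD / REVERT_CMD are placeholders; override them for your project.
# Warning: the default revert discards ALL uncommitted and untracked changes.
TEST_CMD="${TEST_CMD:-make test}"
REVERT_CMD="${REVERT_CMD:-git reset --hard HEAD && git clean -fd}"

verify_or_revert() {
  if sh -c "$TEST_CMD"; then
    echo "suite green: keep the checkmark"
    return 0
  fi
  echo "suite red: reverting this pass" >&2
  sh -c "$REVERT_CMD"
  return 1
}
```

With a revert like the default one, edits to the spec file are discarded too, so the one-line failure note has to be written after the revert, not before it.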

Failure modes

Loops forever with no verification or memory — the checklist + tests are what make it terminate
Repeats the same item if it doesn't actually check it off
Drifts from the spec on long runs — keep the spec the single source of truth and re-read it each pass
Silently lowers the bar by weakening tests to pass — forbid editing tests to make them pass
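
The first two failure modes can be bounded from the outer loop. The sketch below adds two stop conditions to a grep-driven loop: a cap on the number of passes, and a stall check that stops when a pass leaves the spec file byte-for-byte unchanged (no checkmark, note, or question was written). `AGENT_CMD`, `ralph_loop`, and the default cap of 50 are assumptions for illustration, not part of the original recipe.

```shell
# Run passes until no unchecked items remain, with two extra stop conditions.
# AGENT_CMD is a placeholder for the command that runs one pass of your agent.
ralph_loop() {
  spec_file="$1"
  max_passes="${2:-50}"
  pass=0
  while grep -q '^[[:space:]]*- \[ ] ' "$spec_file"; do
    if [ "$pass" -ge "$max_passes" ]; then
      echo "stopped: hit pass cap ($max_passes)" >&2
      return 2
    fi
    before=$(cksum < "$spec_file")
    sh -c "$AGENT_CMD" || echo "agent exited non-zero" >&2
    pass=$((pass + 1))
    after=$(cksum < "$spec_file")
    # A pass that changes nothing in the spec made no recorded progress.
    if [ "$before" = "$after" ]; then
      echo "stopped: pass $pass left $spec_file unchanged" >&2
      return 3
    fi
  done
  echo "done after $pass passes"
}
```

To use it with the Generic wiring, set `AGENT_CMD='agent -p "$(cat ralph-prompt.md)"'` and call `ralph_loop SPEC.md 30`. The pass cap also covers the case where blocked items stay unchecked with a question attached, which would otherwise keep the grep condition true indefinitely.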

Variations

Parallel Ralph. Run several Ralph loops in separate worktrees over disjoint sections of the spec, then merge: an early step toward an orchestrated (level 5) system.
Plan-first. Begin with a pass that expands a terse spec into a detailed checklist, then Ralph the checklist.

Example run

Pass 7: item 'add rate-limit middleware' -> implemented + test, suite green, checked off. 4 items left.
Pass 8: item 'cache layer' underspecified -> wrote question inline, moved on.