Ralph Build Loop

The canonical Ralph: feed the agent one spec file over and over until the whole thing is built and no unchecked items remain in the checklist.

L2 · Verify-until-done · Ralph · Medium risk · Semi-autonomous · Run until goal · production

What it does

Build out a single, well-specified body of work that just needs many passes, without you re-prompting between each one.

Stops when

Every item in the spec's checklist is implemented, tested, and checked off.

Runs

back-to-back passes until the checklist is empty · Semi-autonomous

How one iteration works

discover → plan → execute → verify → escalate

  1. Discover

     Read the spec/checklist file and find the highest-priority unchecked item.

  2. Plan

     Decide the smallest next step that completes that one item.

  3. Execute

     Implement it, write/extend its test, and check the item off in the spec file.

  4. Verify

     Run the test suite; if the new test and the rest of the suite pass, keep the checkmark; otherwise revert and leave the item open with a note.

  5. Escalate

     If an item is underspecified or blocked, write a question into the spec file and move to the next item.
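
The Discover step can be sketched in a few lines of shell. This is a minimal illustration under stated assumptions, not part of the pattern itself: it assumes checklist items are Markdown task-list lines (`- [ ] item` for open, `- [x] item` for done) and that file order encodes priority, most important first. The function name `next_item` is hypothetical.

```shell
# Print the first unchecked checklist line of a spec file.
# Exits non-zero when no unchecked item remains.
# Assumption: items look like "- [ ] text" and the topmost one has the highest priority.
next_item() {
  spec_file=$1
  grep -m 1 -e '^[[:space:]]*- \[ ] ' "$spec_file"
}
```

Called as `next_item SPEC.md`, it prints the line the pass should work on; the non-zero exit status doubles as the "nothing left to do" signal.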

The prompt

The tool-agnostic prompt the loop runs each pass. Copy it, then wire it to your tool below.

Read the spec file and its checklist. Take the single highest-priority unchecked item. Implement the smallest change that completes it and write or extend a test for it. Run the full test suite. If everything passes, check the item off in the spec file. If it fails, revert your change and leave the item unchecked with a one-line note on why. If an item is unclear or blocked, write your question inline in the spec and move on. Do not weaken existing tests to make them pass. Repeat until every item is checked. When the checklist is empty, stop and summarize what was built.

Generic

    while grep -q '\[ \]' SPEC.md; do agent -p "$(cat ralph-prompt.md)"; done

ralph-prompt.md

    Read SPEC.md. Take the highest-priority unchecked item. Implement the smallest change that completes it + a test. Run the full suite. If green, check the item off in SPEC.md; if red, revert and leave a note. If unclear, write the question inline and move on. Never weaken tests to pass.

Claude Code

    /loop work the next unchecked item in SPEC.md until the checklist is empty
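
The Generic one-liner keeps looping while any `[ ]` remains in SPEC.md, so an item that was parked with a question would keep it running. A slightly more defensive wiring might look like the sketch below. Everything beyond the original one-liner is an assumption: `agent -p` is the same placeholder CLI as in the Generic line, the `QUESTION:` tag on the item's own line is a hypothetical convention for blocked items, and the pass cap of 50 is arbitrary.

```shell
# Count unchecked items that are not parked with a question.
# Assumption: a blocked item carries "QUESTION:" on the same line as its "[ ]".
open_items() {
  grep -e '\[ ]' "$1" | grep -c -v -e 'QUESTION:'
}

# Run passes until no workable item remains or the pass cap is hit.
# `agent -p` is the placeholder CLI from the Generic one-liner, not a real tool.
run_ralph() {
  spec_file=$1
  prompt_file=$2
  max_passes=${3:-50}   # arbitrary default cap (assumption)
  pass=0
  while [ "$(open_items "$spec_file")" -gt 0 ] && [ "$pass" -lt "$max_passes" ]; do
    pass=$((pass + 1))
    agent -p "$(cat "$prompt_file")" || break   # stop if the tool itself fails
  done
  echo "stopped after $pass passes"
}
```

A call such as `run_ralph SPEC.md ralph-prompt.md 30` then ends either when only blocked items are left or after 30 passes, whichever comes first.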

Memory contract

The spec/checklist file IS the memory: unchecked items = remaining work, checkmarks = done, inline notes = blockers/questions. Re-read fresh each pass.
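
Because the spec file carries all state, a status report is just a count over its lines. The sketch below assumes Markdown task-list syntax (`- [x]` done, `- [ ]` remaining) and a hypothetical `QUESTION:` tag for blocker notes; neither format is prescribed by the pattern.

```shell
# Summarize the spec file, which is the loop's only memory.
# Assumptions: "- [x]" = done, "- [ ]" = remaining, "QUESTION:" marks a blocker note.
spec_status() {
  spec_file=$1
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  done_n=$(grep -c -e '^[[:space:]]*- \[x] ' "$spec_file" || true)
  open_n=$(grep -c -e '^[[:space:]]*- \[ ] ' "$spec_file" || true)
  blocked_n=$(grep -c -e 'QUESTION:' "$spec_file" || true)
  echo "done=$done_n open=$open_n blocked=$blocked_n"
}
```

Running it between passes gives a cheap progress line without keeping any state outside the spec.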

Verification & guardrails

How it checks itself. The test suite is the gate every pass: an item is 'done' only when its test and the full suite pass; failures revert the checkmark.

  • One checklist item per pass — keeps changes bounded and reviewable
  • A checkmark only stays if tests pass; otherwise the change is reverted
  • Blocked/underspecified items become written questions, not guesses
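
One way to enforce the gate outside the agent is a small wrapper around the test command. This is a sketch under assumptions the pattern does not state: the work happens in a git repository, every pass starts from a clean commit, and a green pass is committed so the next pass has a clean baseline. The function name `verify_or_revert` and the commit message format are hypothetical.

```shell
# Keep the pass only if the full suite is green; otherwise restore the last commit.
# Assumptions: git repo, clean commit at the start of each pass.
# Warning: the red path discards ALL uncommitted work, including untracked files.
verify_or_revert() {
  test_cmd=$1   # command that runs the full suite
  item=$2       # checklist item text, used in the commit message
  if sh -c "$test_cmd"; then
    git add -A && git commit -q -m "ralph: $item"
  else
    git reset -q --hard HEAD && git clean -q -fd
    return 1
  fi
}
```

Since the revert also undoes edits to the spec file, the one-line failure note would have to be written after this function returns non-zero.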

Failure modes

  • Loops forever with no verification or memory — the checklist + tests are what make it terminate
  • Repeats the same item if it doesn't actually check it off
  • Drifts from the spec on long runs — keep the spec the single source of truth and re-read it each pass
  • Silently lowers the bar by weakening tests to pass — forbid editing tests to make them pass
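
The last failure mode is hard to forbid by prompt alone, so a mechanical check can help. The heuristic below flags test files in which the current pass removed lines; extending a test only adds lines, so removals are worth a human look. It assumes a git repository and a test directory named `tests`, both of which are illustrative choices, and it will not catch every form of weakening (for example, an assertion rewritten in place shows up as a removal, but a newly added skip marker does not).

```shell
# List test files where the uncommitted changes removed at least one line.
# Assumptions: git repo; tests live under the given directory (default "tests").
weakened_tests() {
  test_dir=${1:-tests}
  # --numstat prints "added<TAB>deleted<TAB>path"; keep paths with deleted != 0.
  git diff --numstat HEAD -- "$test_dir" | awk -F '\t' '$2 != "0" { print $3 }'
}
```

An empty result means no tracked test file lost lines in this pass; any output is a list of files to review before keeping the checkmark.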

Variations

  • Parallel Ralph. Run several Ralph loops in separate worktrees over disjoint sections of the spec, then merge — early step toward an orchestrated (level 5) system.
  • Plan-first. Begin with a pass that expands a terse spec into a detailed checklist, then Ralph the checklist.
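
For the Parallel Ralph variation, the worktree setup could be scripted roughly as follows. This is a sketch assuming it runs from the repository root and that each argument names one disjoint section of the spec; the `ralph/<section>` branch names and `../ralph-<section>` paths are hypothetical conventions.

```shell
# Create one worktree and branch per spec section so parallel loops
# never share a working directory.
# Assumptions: run from the repo root; section names are valid in branch names.
start_worktrees() {
  for section in "$@"; do
    git worktree add -b "ralph/$section" "../ralph-$section" HEAD || return 1
  done
}
```

After `start_worktrees api ui`, one loop would run inside `../ralph-api` and another inside `../ralph-ui`, each restricted to its own section of the spec, and the branches are merged once both finish.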

Example run

Pass 7: item 'add rate-limit middleware' -> implemented + test, suite green, checked off. 4 items left.
Pass 8: item 'cache layer' underspecified -> wrote question inline, moved on.