tumf

Posted on Feb 3 • Originally published at blog.tumf.dev

ralph-claude-code: The Technology to "Stop" AI Agents — How the Circuit Breaker Pattern Prevents Runaway Processes

#ai #llm #devops

Originally published on 2026-01-11
Original article (Japanese): ralph-claude-code: AIエージェントを"止める"技術——サーキットブレーカーパターンが暴走を防ぐ仕組み

"Claude Code created six repositories overnight when run in an infinite loop" — this experiment from a YCombinator hackathon has garnered attention. The implementation behind this is ralph-claude-code.

While the autonomous loops of AI agents are powerful, determining "when to stop" is the biggest challenge. If you misjudge the stopping point, you could face exploding API costs or get stuck in an endless loop of the same error. ralph-claude-code addresses this issue with a circuit breaker pattern and an end detection algorithm.

In this article, we will focus on the internal logic of the implementation that hasn't been covered in existing articles, explaining how the "stopping technology" is realized.

What is ralph-claude-code?

Origin of the Ralph Technique

This method was proposed by Geoffrey Huntley, and the simplest form is the following Bash loop:

while :; do cat PROMPT.md | claude-code ; done

This loop is based on the idea of "producing deterministically bad results in a nondeterministic world." In other words, since the failure patterns are clear, they can be improved through tuning.

Role of ralph-claude-code

frankbria/ralph-claude-code is an implementation that adds the following features to this simple loop:

Intelligent End Detection: Automatically determines task completion
Circuit Breaker: Prevents infinite loops
Rate Limit Management: Controls API usage
Session Continuity: Maintains context

As of version 0.9.8, all 276 tests pass, which suggests the project is approaching production quality.

The Core of "Stopping Technology": Circuit Breaker Pattern

Three States of the Circuit Breaker

The circuit breaker in ralph-claude-code applies the circuit breaker pattern used in microservice architectures to AI agents.

State	Description	Action
Closed (Normal)	Operating normally	Continues the loop
Open (Tripped)	Anomaly detected	Stops the loop
Half-Open (Recovering)	Recovery monitoring	Gradually resumes
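
To make the three states concrete, here is a minimal Bash sketch of the transitions in the table. The names CB_STATE, cb_transition, cb_can_execute and the event names are hypothetical and chosen for illustration; they are not taken from the ralph-claude-code source.

```bash
# Hypothetical sketch of a three-state circuit breaker (names are illustrative).
CB_STATE="CLOSED"   # one of CLOSED, OPEN, HALF_OPEN

# cb_transition EVENT
# EVENT is "anomaly" (a trigger fired), "retry" (a trial run is allowed),
# or "progress" (the loop produced real changes).
cb_transition() {
  local event=$1
  case "$CB_STATE:$event" in
    CLOSED:anomaly)     CB_STATE="OPEN" ;;       # stop the loop
    OPEN:retry)         CB_STATE="HALF_OPEN" ;;  # allow a trial iteration
    HALF_OPEN:progress) CB_STATE="CLOSED" ;;     # recovery confirmed
    HALF_OPEN:anomaly)  CB_STATE="OPEN" ;;       # trial failed, stop again
    *) ;;                                        # any other pair keeps the state
  esac
}

# Succeeds (returns 0) when the loop may run another iteration.
cb_can_execute() {
  [[ $CB_STATE != "OPEN" ]]
}
```

The main loop would call cb_can_execute before each iteration and cb_transition after it, so that an Open circuit blocks further API calls until something explicitly allows a retry.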

Trigger Conditions for Circuit Open

The circuit breaker opens under the following conditions:

1. No Progress Loop (3 times)

CB_NO_PROGRESS_THRESHOLD=3

If no files change for three consecutive loops, the loop is deemed to be making no progress.
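
As a rough illustration, the counter behind this rule could look like the following Bash sketch. The function record_progress and the variable no_progress_count are hypothetical names; only CB_NO_PROGRESS_THRESHOLD comes from the text above. How the number of changed files is obtained is also an assumption (for example, counting the lines of `git diff --name-only`).

```bash
CB_NO_PROGRESS_THRESHOLD=3
no_progress_count=0   # hypothetical counter of consecutive idle loops

# record_progress FILES_CHANGED
# Returns non-zero once the number of consecutive loops without file
# changes reaches the threshold, meaning the circuit should open.
record_progress() {
  local files_changed=$1
  if (( files_changed > 0 )); then
    no_progress_count=0
  else
    no_progress_count=$(( no_progress_count + 1 ))
  fi
  (( no_progress_count < CB_NO_PROGRESS_THRESHOLD ))
}
```

Any loop that changes at least one file resets the counter, so only an unbroken run of idle loops trips the breaker.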

2. Repeated Same Error (5 times)

CB_SAME_ERROR_THRESHOLD=5

If the same error message is output for five consecutive loops, it is determined to be a stuck loop.
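
A minimal sketch of this check, assuming each loop yields at most one normalized error message: record_error, last_error, and same_error_count are hypothetical names, and the real implementation may compare errors differently.

```bash
CB_SAME_ERROR_THRESHOLD=5
last_error=""         # hypothetical: error message seen in the previous loop
same_error_count=0    # hypothetical: length of the current streak

# record_error MESSAGE   (an empty MESSAGE means the loop had no error)
# Returns non-zero once the same message has repeated threshold times in a row.
record_error() {
  local message=$1
  if [[ -z $message ]]; then
    last_error=""
    same_error_count=0
  elif [[ $message == "$last_error" ]]; then
    same_error_count=$(( same_error_count + 1 ))
  else
    last_error=$message
    same_error_count=1
  fi
  (( same_error_count < CB_SAME_ERROR_THRESHOLD ))
}
```

A different error, or a clean loop, restarts the streak, which keeps ordinary trial and error from tripping the breaker.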

3. Significant Decline in Output (Over 70% reduction)

CB_OUTPUT_DECLINE_THRESHOLD=70

If the output decreases by over 70% compared to the previous loop, it is deemed anomalous. This indicates that the AI agent may be in a "nothing to do" state.
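
The comparison can be sketched as follows, assuming output is measured in bytes (the unit is an assumption, and output_declined is a hypothetical name). The check multiplies instead of dividing so that integer truncation cannot hide a decline of, say, 70.1%.

```bash
CB_OUTPUT_DECLINE_THRESHOLD=70

# output_declined PREV_BYTES CURR_BYTES
# Succeeds (returns 0) when the current output shrank by more than the
# threshold percentage relative to the previous loop.
output_declined() {
  local prev=$1 curr=$2
  (( prev > 0 )) || return 1   # no baseline yet, nothing to compare
  # Equivalent to (prev - curr) / prev * 100 > threshold, without rounding.
  (( (prev - curr) * 100 > CB_OUTPUT_DECLINE_THRESHOLD * prev ))
}
```

For example, going from 1000 bytes to 200 bytes is an 80% decline and counts as anomalous, while 1000 to 300 is exactly 70% and does not.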

Two-Stage Error Filtering

A notable aspect of the circuit breaker implementation is the two-stage filtering to prevent false positives.

Problem: A naive search for the word "error" misidentifies the JSON field "is_error": false as an error.

Solution: Filtering in the following two stages:

First Stage: Recognizes the JSON structure and excludes field names.
Second Stage: Matches error messages that span multiple lines.

# First Stage: Exclude JSON fields
if [[ $line =~ \"is_error\".*false ]]; then
  continue
fi

# Second Stage: Multi-line support
if [[ $current_block =~ $error_pattern ]]; then
  error_count=$((error_count + 1))
fi

This mechanism significantly reduces the false positive rate.
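
To see both stages working together, here is a self-contained sketch built around the two fragments above. It is an illustration under assumptions, not the project's actual code: count_errors is a hypothetical function, the error pattern is a simplified stand-in, and treating blank lines as block separators is our own choice.

```bash
# count_errors reads agent output on stdin and prints the number of
# blocks (runs of lines separated by blank lines) that contain an error.
count_errors() {
  local json_false_field='"is_error"[[:space:]]*:[[:space:]]*false'
  local error_pattern='([Ee]rror|[Ee]xception|FATAL)'   # simplified, assumed
  local line current_block="" error_count=0

  while IFS= read -r line || [[ -n $line ]]; do
    # First stage: skip the JSON field "is_error": false entirely.
    if [[ $line =~ $json_false_field ]]; then
      continue
    fi
    if [[ -n $line ]]; then
      current_block+="$line"$'\n'
      continue
    fi
    # Second stage: a blank line closes the block; match across its lines.
    if [[ $current_block =~ $error_pattern ]]; then
      error_count=$(( error_count + 1 ))
    fi
    current_block=""
  done

  # Flush the last block when the input does not end with a blank line.
  if [[ $current_block =~ $error_pattern ]]; then
    error_count=$(( error_count + 1 ))
  fi
  printf '%s\n' "$error_count"
}
```

Without the first stage, a JSON response containing "is_error": false would be counted as an error block, even though it reports success.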

End Detection Algorithm

In addition to the circuit breaker, ralph-claude-code also has a mechanism for detecting normal termination.

Types of End Signals

1. Task Completion Detection

If all tasks in @fix_plan.md are marked as complete, it will terminate.
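
One way to sketch this check, assuming the plan file tracks tasks as Markdown checkboxes ("- [ ]" and "- [x]"). That format and the function name all_tasks_complete are assumptions made for this example.

```bash
# all_tasks_complete PLAN_FILE
# Succeeds when the file contains at least one checkbox and none of
# them is still unchecked.
all_tasks_complete() {
  local plan_file=$1 total unchecked
  [[ -f $plan_file ]] || return 1
  # grep -c exits non-zero when the count is 0, so ignore its status.
  total=$(grep -cE '^[[:space:]]*- \[[ xX]\]' "$plan_file" || true)
  unchecked=$(grep -cE '^[[:space:]]*- \[ \]' "$plan_file" || true)
  (( total > 0 && unchecked == 0 ))
}
```

Requiring at least one checkbox avoids declaring victory on an empty or missing plan.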

2. Consecutive Done Signals (2 times)

MAX_CONSECUTIVE_DONE_SIGNALS=2

If Claude Code returns a "done" response twice in a row, it will terminate.

3. Detection of Test-Only Loops (3 times)

MAX_CONSECUTIVE_TEST_LOOPS=3

If three consecutive loops only run tests without implementing any functionality, feature development is deemed complete.
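
The two consecutive-count signals can be sketched with a pair of counters. In this illustration, update_exit_signals and the counter variables are hypothetical, and how a loop is classified as "done" or "test-only" is left out; only the two MAX_ settings come from the text above.

```bash
MAX_CONSECUTIVE_DONE_SIGNALS=2
MAX_CONSECUTIVE_TEST_LOOPS=3
done_signals=0       # hypothetical streak counters
test_only_loops=0

# update_exit_signals SAID_DONE TEST_ONLY   (each argument is 1 or 0)
# Prints the exit reason and succeeds when the loop should end.
update_exit_signals() {
  local said_done=$1 test_only=$2
  if (( said_done )); then
    done_signals=$(( done_signals + 1 ))
  else
    done_signals=0
  fi
  if (( test_only )); then
    test_only_loops=$(( test_only_loops + 1 ))
  else
    test_only_loops=0
  fi
  if (( done_signals >= MAX_CONSECUTIVE_DONE_SIGNALS )); then
    echo "done_signals"
    return 0
  fi
  if (( test_only_loops >= MAX_CONSECUTIVE_TEST_LOOPS )); then
    echo "test_only_loops"
    return 0
  fi
  return 1
}
```

Call the function directly rather than inside $(...), because a command substitution runs in a subshell and the counters would not persist between loops.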

Handling Claude's 5-Hour API Limit

Claude enforces a usage limit over each five-hour window. When the limit is reached, ralph-claude-code detects it automatically and offers the following options:

Wait 60 minutes: Wait with a countdown timer.
Terminate: Automatically terminate after 30 seconds.

This prevents unnecessary retry loops.
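
A prompt with a timeout like this can be built on Bash's read -t. The sketch below assumes that "no answer within the timeout" means "terminate"; ask_limit_action and the menu wording are hypothetical and not copied from the tool.

```bash
# ask_limit_action [TIMEOUT_SECONDS]
# Prints "wait" or "exit" on stdout. No complete answer within the
# timeout is treated as "exit" so an unattended run does not hang.
ask_limit_action() {
  local timeout=${1:-30} choice=""
  echo "API limit reached. 1) wait 60 minutes  2) exit (default after ${timeout}s)" >&2
  # read returns non-zero on timeout or end of input; discard partial input.
  read -r -t "$timeout" choice || choice=""
  if [[ $choice == "1" ]]; then
    echo "wait"
  else
    echo "exit"
  fi
}
```

Printing the menu to stderr keeps stdout clean, so the caller can capture the decision with action=$(ask_limit_action 30).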

Mechanism of Session Continuity

ralph-claude-code has supported session continuity since version 0.9.6.

Structure of Session File

# .ralph_session
SESSION_ID="ses_abc123def456"
CREATED_AT="1704585600"  # Unix timestamp
EXPIRES_AT="1704672000"  # 24 hours later
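
Given a file in this format, an expiry check could be sketched as below. session_is_valid is a hypothetical helper. Sourcing the file is an assumption that is only reasonable because the tool writes the file itself; never source files from untrusted places.

```bash
# session_is_valid SESSION_FILE [NOW]
# Succeeds when the file exists, has a session ID, and has not expired.
# NOW defaults to the current Unix time and is a parameter for testing.
session_is_valid() {
  local session_file=$1 now=${2:-$(date +%s)}
  [[ -f $session_file ]] || return 1
  # Declared local so the sourced assignments do not leak to the caller.
  local SESSION_ID="" CREATED_AT="" EXPIRES_AT=""
  source "$session_file"
  [[ -n $SESSION_ID && $EXPIRES_AT =~ ^[0-9]+$ ]] || return 1
  (( now < EXPIRES_AT ))
}
```

With the sample values above, EXPIRES_AT minus CREATED_AT is 86,400 seconds, which matches the 24-hour lifetime noted in the comment.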

Automatic Reset Triggers

The session is automatically reset under the following events:

When the circuit breaker opens: Stagnation detected.
Manual interruption (Ctrl+C): User's explicit stop.
When the project is complete: Normal termination.

Session history is recorded in .ralph_session_history with a maximum of 50 entries.
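
Capping a history file at a fixed number of entries is straightforward with tail. This sketch assumes one entry per line, which may not match the real file format; append_history and MAX_HISTORY_ENTRIES are hypothetical names.

```bash
MAX_HISTORY_ENTRIES=50

# append_history HISTORY_FILE ENTRY
# Appends one line, then trims the file to the newest entries.
append_history() {
  local history_file=$1 entry=$2 tmp
  printf '%s\n' "$entry" >> "$history_file"
  tmp=$(mktemp) || return 1
  # Write to a temporary file first; redirecting tail onto its own
  # input would truncate the file before it is read.
  tail -n "$MAX_HISTORY_ENTRIES" "$history_file" > "$tmp" && mv "$tmp" "$history_file"
}
```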

Considerations for Practical Use

Reality of Token Consumption

Existing articles mention "extremely high token consumption," but actual figures from the Y Combinator hackathon have been published:

During the Y Combinator hackathon, porting six repositories overnight cost about $800 and generated over 1,100 commits. The cost per Sonnet agent was approximately $10.50/hour.

Reference: We Put a Coding Agent in a While Loop and It Shipped 6 Repos Overnight

Adjusting Rate Limit Settings

The default of 100 calls/hour may be excessive for some projects. Adjustment examples:

# For small projects
ralph --calls 50

# In a hurry (watch the cost)
ralph --calls 150

# Budget-conscious
ralph --calls 30
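
To show what a calls-per-hour cap means in practice, here is a minimal counter that resets at each clock hour. It is a sketch under assumptions: can_make_call, call_count, and window_hour are hypothetical names, and the real tool may track the window differently.

```bash
MAX_CALLS_PER_HOUR=100
call_count=0
window_hour=-1   # hypothetical: index of the clock hour being counted

# can_make_call [NOW]
# Succeeds and records one call while the hourly budget lasts.
# NOW defaults to the current Unix time and is a parameter for testing.
can_make_call() {
  local now=${1:-$(date +%s)}
  local hour=$(( now / 3600 ))
  if (( hour != window_hour )); then
    window_hour=$hour   # a new hour started, so reset the budget
    call_count=0
  fi
  if (( call_count >= MAX_CALLS_PER_HOUR )); then
    return 1
  fi
  call_count=$(( call_count + 1 ))
}
```

When can_make_call fails, the loop should sleep until the next hour instead of retrying, which is what keeps the cap from turning into a busy loop.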

Tips for Prompt Tuning

Geoffrey Huntley's "signboard tuning" method is effective:

First Loop: Let Ralph do its thing freely.
Observe Failure Patterns: Identify issues from logs.
Add Signboard to PROMPT.md: "If this error occurs, do X instead."
Re-run: Ralph will see the signboard and improve.

Example:

# PROMPT.md

## ⚠️ Notes

- If a test fails, please read the error message before fixing it.
- If the same error occurs three times, please change your approach.
- Check dependencies before deleting files.

Differences from Official Plugins

The official "Ralph Wiggum" plugin for Claude Code and the frankbria version of ralph-claude-code have similar names but different implementations.

Feature	Official Plugin	frankbria Version
Circuit Breaker	❌ None	✅ Present
Two-Stage Error Filtering	❌ None	✅ Present
Session Continuity	❌ None	✅ Present
Test Coverage	❓ Unknown	✅ 276 tests 100% pass
tmux Integration	❌ None	✅ Present

The official plugin is positioned as a "simple loop," while the frankbria version is a "production-quality autonomous loop."

Conclusion

The "stopping technology" of ralph-claude-code is realized through the following three mechanisms:

Circuit Breaker: Detects no progress, repeated errors, and output decline to stop.
End Detection Algorithm: Determines task completion, done signals, and test-only loops.
Session Management: Maintains context while automatically resetting during anomalies.

These mechanisms enable long-duration autonomous execution, such as "completing six repositories overnight."

Personally, I find the two-stage error filtering approach particularly interesting. The implementation that understands JSON structure and excludes field names is a valuable reference for quality control in LLM tools.

ralph-claude-code is still at v0.9.8 and is under development towards v1.0. If you're interested, check out the GitHub repository.
