Stuck Detection

Agents can get stuck — repeating the same action, cycling between two tools, or failing to make progress toward the objective. Stuck detection identifies these patterns and triggers a recovery strategy.


Configuration

json
{
  "on_stuck": {
    "iterations": 3,
    "action": "escalate",
    "hint": "Try a different approach or use __complete__ with your best answer."
  }
}
Property Type Required Description
iterations number No Number of repeated iterations before triggering (default: 3)
action string No Recovery strategy: "fail", "escalate", or "retry_with_hint" (default: "fail")
hint string No Guidance text injected into the prompt (only used with "retry_with_hint")

How Detection Works

The engine monitors the agent's action history across iterations. It detects stuck patterns when:

  • The agent calls the same action with the same parameters for N consecutive iterations
  • The agent alternates between two actions without producing new information
  • The agent produces repeated identical thoughts suggesting it's not incorporating new observations

When any pattern is detected for the configured number of iterations, the recovery strategy fires.


Recovery Strategies

"fail" — Terminate

The agent loop ends immediately with a failure status. Use when you'd rather fail fast than risk incorrect results.

json
{ "on_stuck": { "iterations": 3, "action": "fail" } }

The run status becomes failed with a clear error message indicating the agent was stuck.

"escalate" — Pause for Human

The agent pauses and the run is routed to a human reviewer, just like __pause_for_human__. The human sees the agent's full reasoning trace up to the point it got stuck, plus context about the stuck pattern.

json
{ "on_stuck": { "iterations": 3, "action": "escalate" } }

The human can provide guidance via POST /agents/:id/resume, and the agent continues with the human's input as additional context. This is the recommended strategy for production agents in critical workflows.

"retry_with_hint" — Redirect

The engine injects a hint into the agent's next prompt, nudging it toward a different approach. The agent gets one more chance to make progress.

json
{
  "on_stuck": {
    "iterations": 3,
    "action": "retry_with_hint",
    "hint": "You seem to be repeating yourself. Try using __complete__ to provide your best answer with the information you have, or try a different tool."
  }
}

If the agent gets stuck again after the hint, it fails. The hint is a one-shot redirect, not an infinite retry.


Setting Thresholds

iterations: 2 — Aggressive. Triggers quickly but may false-positive on agents that legitimately need to retry a tool (e.g., polling for an async result).

iterations: 3 — Balanced. The default recommendation. Catches genuine loops while tolerating one legitimate retry.

iterations: 5 — Conservative. Use for agents with tools that may legitimately return different results on repeated calls (e.g., searching with refined queries).


Example: Stuck Detection in Practice

An agent investigating an unmatched invoice:

text
Iteration 1:
  Thought: "Let me search for related payments."
  Action: search_payments { "invoice_id": "INV-099" }
  Observation: { "results": [] }

Iteration 2:
  Thought: "No results. Let me try searching again with a broader query."
  Action: search_payments { "invoice_id": "INV-099" }   ← same action, same params
  Observation: { "results": [] }

Iteration 3:
  Thought: "Still nothing. Let me search once more."
  Action: search_payments { "invoice_id": "INV-099" }   ← stuck detected!

With "action": "escalate", the agent pauses here and a human reviews the trace. The human might respond: "This invoice is from a new vendor not yet in the payment system. Mark as pending vendor setup."

With "action": "retry_with_hint" and hint "No payments exist for this invoice. Consider whether it might be from a new vendor or a different system.", the agent gets one more iteration with this additional context.


Best Practices

  • Always configure on_stuck for production agents. The default "fail" is safe but gives you no recovery path. "escalate" is almost always better.

  • Pair stuck detection with max_iterations as a hard ceiling. Stuck detection catches loops; max_iterations catches everything else.

  • Write hints that suggest alternatives, not just "try again." Good: "If the search returns no results, use complete to report that no match was found." Bad: "Try harder."

  • Monitor stuck rates via the hyphen_agent_stuck_total metric. High stuck rates indicate the objective is unclear, tools are insufficient, or the model needs more guidance.

→ Next: Reasoning Traces