Human-free Issue Handling

Evaluates the agent's ability to handle conversations without unnecessary handoffs to humans.

What it measures

This score measures whether the agent can handle the conversation without unnecessary handoffs to a human. It rewards autonomy when the agent should be able to solve the request, and it rewards escalation only when a handoff is truly needed.

What "good" looks like

  • The agent can handle common issues end-to-end.
  • It only escalates when truly needed (and does it smoothly).
  • The customer gets usable next steps without extra friction.

Common reasons for lower scores

  • The agent hands off too early or too often.
  • The user asks for a human because the agent isn't helping.
  • The agent hits avoidable dead ends.

Examples

High (9–10): "Agent resolves the issue fully on its own, or hands off only at the final step (e.g., 'connecting you to finalize purchase')."

Mid (6–7): "One unnecessary handoff happens, but the agent still helps and the customer gets a useful resolution."

Low (1–3): "The agent quickly gives up or repeatedly pushes the user to a human due to avoidable failures."

How to read the scale

Score  Description
10     Fully autonomous; no unnecessary handoffs; user satisfied.
9      Almost fully autonomous; tiny stumble recovered.
8      High autonomy; one minor limitation.
7      Mostly autonomous; minor reliance on handoff.
6      One clear unnecessary handoff or limitation.
5      Mixed; handoff used as a common escape hatch.
4      Frequent handoffs; weak autonomy.
3      Major autonomy failures; user repeatedly blocked.
2      Nearly always requires human help.
1      Immediate failure/instant handoff with no progress.
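
If you post-process scores in your own tooling, the scale above can be kept as a small lookup table. The sketch below is illustrative only: the names SCALE, EXAMPLE_BANDS, describe, and example_band are hypothetical and are not part of any product API. The band labels come from the Examples section, which names only 9–10, 6–7, and 1–3, so the sketch returns None for 4, 5, and 8 rather than guessing a label.

```python
from typing import Optional

# Descriptions copied from the scale above.
SCALE = {
    10: "Fully autonomous; no unnecessary handoffs; user satisfied.",
    9: "Almost fully autonomous; tiny stumble recovered.",
    8: "High autonomy; one minor limitation.",
    7: "Mostly autonomous; minor reliance on handoff.",
    6: "One clear unnecessary handoff or limitation.",
    5: "Mixed; handoff used as a common escape hatch.",
    4: "Frequent handoffs; weak autonomy.",
    3: "Major autonomy failures; user repeatedly blocked.",
    2: "Nearly always requires human help.",
    1: "Immediate failure/instant handoff with no progress.",
}

# Bands named in the Examples section. Scores 4, 5, and 8 are not
# assigned a band there, so they are deliberately left out.
EXAMPLE_BANDS = {
    "High": range(9, 11),  # 9-10
    "Mid": range(6, 8),    # 6-7
    "Low": range(1, 4),    # 1-3
}


def describe(score: int) -> str:
    """Return the scale description for a score from 1 to 10."""
    if score not in SCALE:
        raise ValueError(f"score must be an integer from 1 to 10, got {score!r}")
    return SCALE[score]


def example_band(score: int) -> Optional[str]:
    """Return the example band for a score, or None if the page names none."""
    describe(score)  # reuse the range check
    for name, scores in EXAMPLE_BANDS.items():
        if score in scores:
            return name
    return None
```

For instance, example_band(9) yields "High", while example_band(8) yields None because the page gives no example band for that score.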
