HakiRecommendedSimulationCausal

Customer Service Agents

The Problem

Agents resolve the stated issue without probing whether it matches the actual need. What the customer says and what the customer needs diverge predictably, but the model resolves the tractable interpretation because probing the real need requires adversarial self-questioning that the training objective does not reward.

How Ejentum Solves It

One API call forces your model to compare stated intent against revealed behavior and detect the moment the conversation drifts from the original problem.

The Failures

  • 01

    The Pattern

    Stated issue taken at face value without comparing against revealed context clues

    Why It Happens

    The model resolves the tractable interpretation. "I can't log in" has a clear solution path. The actual problem, a shared team account with a departed admin, requires probing that the model has no incentive to perform.

    The Resolution

    SI-020Identity Coherence Auditor

    Traces the gap between stated intent and revealed behavior, scoring objective coherence across the conversation to surface the real underlying need.

  • 02

    The Pattern

    Concept drift across conversation turns goes undetected, allowing the issue to silently shift

    Why It Happens

    Each response is generated with attention over the full context, but there is no explicit mechanism to track how the definition of "the problem" has changed between turn 1 and turn 12.

    The Resolution

    SI-015Semantic Drift Detector

    Monitors concept definitions across conversation turns, catching the moment when "the issue" silently shifts from the original problem to something else entirely.

  • 03

    The Pattern

    Emotional escalation signals missed: the agent continues troubleshooting while the customer has shifted from frustrated to angry

    Why It Happens

    Sentiment analysis is available, but the model has no mechanism to change its resolution strategy based on emotional state transitions. The same troubleshooting script runs regardless of whether the customer is calm or irate.

    The Resolution

    MC-016Cognitive Mode Switcher

    Detects transitions in conversational context that require a strategy change, switching from technical resolution to de-escalation when emotional signals cross a threshold.

The Evidence

+16.4pp on simulation tasks

EjBench, 30 simulation tasks

Customer interactions span intent detection, emotional state tracking, and resolution verification across multiple turns. Four synergized abilities prevent the agent from settling on the tractable interpretation instead of the real need.

CA-V2-180.2860.833 Haki

Task revealed the gap between stated cause and actual mechanism. Baseline identified the correct answer but could not explain why the intervention failed. Haki traced the reverse causal chain and named the latent variable driving both metrics.

Inject the API into your next support agent. See how the scaffold surfaces the real problem behind the stated issue.