Documentation

Everything you need to build with Ejentum. From quickstart guides to advanced patterns.

Concepts

How the Logic API works and why it improves agent reasoning. If you just want to start using it, go to Quickstart.

The Core Idea

Most AI tools give agents more facts to work with (documents, data, context). Ejentum harnesses the reasoning power agents already have: structured injections that channel the model's existing capability into disciplined execution, preventing the shortcuts and decay that degrade output over long chains.

RAG (Retrieval Augmented Generation) retrieves information. The agent still decides how to reason about it using whatever patterns it learned during training.

RA²R (Reasoning Ability-Augmented Retrieval) retrieves reasoning abilities. A structured injection that governs how the agent thinks: what to focus on, what failure modes to block, and how to verify its own output.

Put simply: where RAG retrieves facts, RA²R retrieves ways of thinking. dynamically, based on the context of each query.

"Isn't this just prompt engineering?"

No. Prompt engineering is writing natural language instructions and hoping the model follows them. Ejentum injects a structured payload with four independent control surfaces: amplification, suppression, cognitive_style, and reasoning_elasticity. The suppression signals make this categorically different. they constrain the failure space, which natural language instructions cannot do reliably. A system prompt says "be careful." A suppression signal says "reject any output that exhibits symptom_treatment_bias." These produce structurally different model behavior in our testing.

Ejentum replaces a 5,000-token system prompt with a compact reasoning harness injected at the top of the context. Less content, higher attention density, structured constraints instead of prose. The harness does not add intelligence. It prevents the model's existing intelligence from being wasted.

The Four Harnesses

Ejentum houses 679 abilities across four harnesses. Each harness captures and directs a different dimension of the model's existing power. The model already has these capabilities. The harness prevents them from degrading under pressure.

Reasoning Harness (311 abilities) channels the model's analytical power across six cognitive domains. It prevents the shortcuts that turn careful analysis into surface-level pattern matching over long execution chains. Skill file.

Code Harness (128 abilities) channels the model's engineering discipline. It prevents hallucinated APIs, lost safety guards, ignored edge cases, and the subtle bugs that appear when the model generates plausible-looking code that fails in production. Skill file.

Anti-Deception Harness (139 abilities) channels the model's capacity for honesty. It prevents sycophancy, hallucination, prompt injection, and the tendency to tell people what they want to hear instead of what the evidence shows. Skill file.

Memory Harness (101 abilities) channels the model's observational depth. It prevents missed emotional shifts, ignored context drift, stale assumptions, and the tendency to treat every person and every turn the same way regardless of what changed. Skill file.

Reasoning Harness Dimensions

The Reasoning Harness spans six cognitive domains, each addressing a specific class of analytical failure:

1. Causality

Domain: Why things happen. Injects: Deductive rules, root-cause chains, falsification protocols. Prevents: Correlation-causation confusion, post-hoc reasoning, causal reversal.

2. Time

Domain: When things happen. Injects: Lag variables, decay rates, precedent logic, chronological strictness. Prevents: Temporal hallucination, the agent treating past as future or losing event sequence.

3. Space

Domain: Where things are. Injects: Boundary enforcement, topology validation, dimensional constraints. Prevents: Physical impossibilities like routes through walls, overlapping objects, broken continuity. Tested directly on ARC-AGI-3: a spatial navigation game where the injection forced intermediate path validation, preventing the agent from committing to blocked routes.

4. Simulation

Domain: What would happen if. Injects: Feedback loops, domino-effect tracking, systems archetypes. Prevents: Single-step thinking, agents that cannot model downstream consequences.

5. Abstraction

Domain: What things mean. Injects: Category enforcement, ontological boundaries, dimensionality control. Prevents: Concept conflation, treating metaphors as mechanisms, merging unrelated categories.

6. Metacognition

Domain: How the agent is thinking. Injects: Self-monitoring, contradiction detection, loop termination. Prevents: Hallucination spirals, infinite regression, cross-pillar contradictions.

The Failure Taxonomy: Why These Six

These dimensions were not chosen by analogy to human cognition. They were reverse-engineered from production agent failures.

We analyzed thousands of agent failures across domains and found they cluster into exactly six categories:

Failure ModeDimensionWhat Goes Wrong
Causal reversalCausalityAgent says A causes B when B causes A
Temporal hallucinationTimeAgent confuses past and future, loses event sequence
Physical impossibilitySpaceAgent violates boundaries, topology, conservation laws
Single-step myopiaSimulationAgent cannot model downstream consequences
Category errorAbstractionAgent conflates metaphor with mechanism
Hallucination spiralMetacognitionAgent makes a mistake, then uses that mistake to make more mistakes, without noticing

Every ability in the 679-node graph maps to one or more of these failure modes. The dimensions are not a taxonomy of knowledge. They are a taxonomy of breakdown.

What the API Returns

Every API response includes a structured cognitive injection with four control surfaces:

amplification: What to Amplify

An array of 2 to 4 reasoning signals the model must weight heavily.

Think of these as positive attractors. When the model generates its next tokens, these signals pull the output toward specific reasoning patterns.

["depth_first_root_search", "n_whys_traversal"]

suppression: What to Block

An array of 1 to 3 failure modes the model must actively penalize.

In our testing, suppression signals are more effective than amplification alone. Telling a model "don't treat symptoms as root causes" produces sharper reasoning than telling it "find the root cause." See the deep-dive below.

["symptom_treatment_bias", "surface_level_stop"]

cognitive_style: How to Reason

A single semantic persona anchor. This sets the tone and methodology of the agent's reasoning process.

"root_cause_isolation"

reasoning_elasticity: How Far to Explore

ValueBehavior
zero_driftRefuse to explore beyond immediate evidence. Maximum constraint.
conservativeEvidence-bound. Cautious extrapolation only when directly supported.
adaptiveBalanced exploration within logical constraints.
high_varianceBroad hypothesis generation. Accepts uncertainty.
max_entropyUnconstrained creative exploration. No guardrails.

Why Suppression Matters More Than Amplification

This is the empirical insight that drives the architecture.

When you tell GPT-4 "find the root cause," it generates a plausible root cause. When you tell GPT-4 "do NOT treat symptoms as root causes, do NOT stop at the first plausible explanation," it generates a deeper root cause. In our testing across thousands of queries, suppression-only payloads consistently outperformed amplification-only payloads on reasoning depth and failure avoidance. The combined payload (both amplification and suppression) performs best.

Why does this work? Our working hypothesis: negative constraints are more specific than positive instructions. "Find the root cause" is broad. the model can satisfy it shallowly. "Do NOT stop at the first plausible explanation" is narrow. it blocks a specific failure mode and forces the model to continue reasoning. We believe this relates to how instruction-following models process negation in context, though the full theoretical picture is still emerging.

Worked Example: Root Cause Analysis

Amplification signals tell the model: go deep, ask why repeatedly, extract systemic fixes.

Suppression signals tell the model: do NOT treat symptoms as causes, do NOT stop at the first plausible answer.

Remove the suppression signals and the model still performs root-cause analysis. But it stops earlier. It accepts the first plausible explanation. It treats correlated symptoms as causes. The suppression signals force the model past its natural stopping point, the point where probability says "good enough" but engineering says "not yet."

The Asymmetry Principle

Amplification is additive. It says "also consider X." Suppression is multiplicative. It says "reject every output that exhibits Y."

One suppression signal eliminates an entire class of failure modes. One amplification signal adds one more consideration to an already-noisy generation process. This asymmetry is why every ability contains suppression signals, even when amplification alone would seem sufficient.

The evidence: across two independent benchmarks (250 single-turn tasks), the factor lift ranking is identical. Self-monitoring improves most (+132%), followed by verification (+85%), then alternative consideration, epistemic honesty, reasoning depth, and audit trail. This ranking holds across published academic tasks and custom professional scenarios. A third benchmark (ARC-AGI-3, 50 interactive reasoning steps) extends this to multi-step execution: the same suppression signals that improve single-turn quality also prevent reasoning decay over 25-step chains, with injection persistence measured at a half-life of 24 steps. Three benchmarks, three task types, consistent directional effects. This is evidence for a mechanism, not luck.

Anatomy of a Cognitive Ability

Each of the 679 abilities is a complete cognitive injection with multiple control surfaces:

ComponentWhat It Does
Negative GateNames the specific failure pattern the agent must avoid
Reasoning ProcedureStep-by-step instructions the agent follows
Reasoning TopologyExecution structure: steps, decision gates, loops, exit conditions
Target PatternWhat correct reasoning looks like for this task type
Suppression VectorsFailure modes to actively block during generation
Amplification VectorsReasoning signals to prioritize
Cognitive StyleThe methodology anchor (e.g., root cause isolation, bayesian inference)
Falsification TestA verification criterion the agent checks its output against

Abilities are not independent. They form a directed graph of reasoning dependencies. One ability may require another as a prerequisite, enhance a third, or serve as a fallback if execution fails. In multi modes (Haki), the API retrieves the primary ability plus 3 cross-domain abilities, extracting failure guards and suppression signals that the primary alone wouldn't provide.

Choosing Reasoning Elasticity: When to Use What

Each ability sets a default elasticity. But you can override it when injecting. Here is the decision framework:

If your agent needs to...UseBecause
Audit financial data, verify compliancezero_driftNo hallucination tolerated. Facts only.
Debug a production incidentzero_drift or conservativeStay on the evidence trail.
Analyze quarterly trendsconservativeSome extrapolation, but evidence-bound.
Build a product strategyadaptiveBalance exploration with logical constraints.
Generate creative hypotheseshigh_varianceBroad idea generation, accepts uncertainty.
Brainstorm novel research directionsmax_entropyUnconstrained exploration. Use with Metacognitive fallback.

Warning: max_entropy without a Metacognitive suppression vector is dangerous. The agent will explore freely with no self-monitoring. Always pair max_entropy with a Metacognitive ability as a fallback node in multi-node injection (Haki tier).

How Retrieval Works

When you send a query, the API matches it against the abilities within your chosen mode using hybrid search. No LLM call, zero inference cost. The retrieval engine evaluates both the meaning and the specific terminology of your query to find the ability that best matches your task.

Multi-Ability Composition

Abilities are connected in a directed reasoning graph. Each ability has synergy relationships to others: prerequisites, enhancers, and fallbacks.

This means agents using the Haki tier don't just get random abilities. They get a primary ability — the best match for the task — plus failure guards extracted from 3 additional abilities in different cognitive domains. Each guard blocks a specific reasoning failure that the primary alone wouldn't catch. Before the agent commits to output, a self-check verifies it didn't fall into any of the guarded failure modes. If a check fails, the agent gets permission to break out of structured reasoning, think freely about what went wrong, and re-enter the structured path.

Worked Example: Supply Chain Risk

Query: "Identify why our Southeast Asia supplier is delivering late and predict when the disruption will resolve." with mode: "reasoning-multi"

The primary ability (Root Cause Miner, Causal) traces the mechanism of the delay. Suppresses symptom-treatment bias so the agent doesn't stop at "shipping delays."

From the 3 synergy abilities, the multi mode adds failure guards:

  • "Don't accept optimistic timelines as planning targets" — blocks the agent from giving you the timeline you want to hear
  • "Don't stop at first-order consequences" — forces modeling what happens downstream if the disruption continues
  • "Don't let contradictions stand unresolved" — catches inconsistencies between the causal analysis and the timeline prediction

Before output, a self-check fires: "Did I accept an optimistic timeline? Did I ignore downstream effects? Did I let contradictions stand?" If any check fails, the agent pauses, identifies the broken assumption, and re-approaches.

Where It Applies

The Logic API is domain-agnostic. The cognitive dimensions across all product layers map to failure patterns that occur in every industry. Here are the most common applications:

Software Engineering

Agents debugging production incidents, reviewing code, or planning migrations. Common failures: treating symptoms as root causes, missing cascading dependencies, stopping investigation at the first plausible fix. The API activates causal and metacognitive abilities that force deeper investigation.

Financial Services

Agents analyzing risk, forecasting, or compliance checking. Common failures: anchoring to best-case timelines, ignoring base rates, treating correlation as causation in market data. The API activates causal and temporal abilities that enforce evidence-based reasoning.

Legal Tech

Agents reviewing contracts, analyzing case law, or drafting compliance assessments. Common failures: conflating precedent with prediction, missing jurisdiction-specific constraints, accepting circular legal reasoning. The API activates abstraction and metacognitive abilities that separate fact from interpretation.

Healthcare

Agents supporting clinical reasoning, protocol selection, or risk assessment. Common failures: premature diagnosis fixation, ignoring constraint interactions, failing to flag uncertainty in ambiguous presentations. The API activates simulation and metacognitive abilities that enforce systematic evaluation.

Multi-Agent Orchestration

Systems where multiple agents collaborate on complex tasks. Common failures: contradictions between agents, duplication of reasoning without cross-validation, failure to synthesize competing outputs. Multi modes (Haki) are designed for this — cross-domain failure guards prevent the tunnel vision that single-mode reasoning produces.

In our benchmarks, multi-ability mode showed +10.1pp composite improvement on complex cross-domain tasks (EjBench). Single mode showed +8.0pp on focused tasks (BBH/CausalBench/MuSR). On multi-step execution (ARC-AGI-3), injection persistence across 25-step chains adds a third dimension. On hard competitive programming (LiveCodeBench Hard), the harness improved Opus 4.6 from 85.7% to 100% pass rate — rescuing reasoning spirals and preventing premature convergence on algorithm selection. See Benchmarks for the full methodology.

See all 13 industry applications with specific failure patterns, connected abilities, and benchmark evidence per vertical. Browse 29 real benchmark tasks with verbatim baseline vs harnessed outputs.


What Ejentum Does Not Do

Honest boundaries matter more than inflated claims.

  • Ejentum does not add domain knowledge. If your agent fails because it lacks information (e.g., it doesn't have access to your database), use RAG. Ejentum improves how the agent reasons about information it already has.
  • Ejentum operates at the prompt level, not the model level. It does not modify model weights, activations, or fine-tuning. It structures the prompt in ways that measurably improve reasoning output.
  • Retrieval precision depends on query quality. Highly ambiguous or very short queries may not retrieve the optimal ability. Query specificity directly impacts retrieval quality. send the full task description, not a summary.
  • Suppression is not absolute constraint enforcement. The model is steered toward avoiding specific failure modes, but LLMs are probabilistic systems. Suppression reduces failure rates significantly; it does not guarantee zero failures.

Next