FAQ

Straight answers from the builder who designed this system. If your question is not here, it should be. Reach out and we will add it.

Getting Started

Do I need to write code?

No. If you use n8n or Make.com, you can integrate with HTTP request nodes and zero code. If you use LangChain, CrewAI, or other frameworks, see the Integrations guide for copy-pasteable code examples.

How much does it cost?

Free trial: 1,000 dynamic calls, no credit card required. Go: €5/month for 1,000 dynamic + 250 adaptive calls. Super: €25/month for 5,000 dynamic + 1,500 adaptive calls. Every tier reaches all four harnesses; adaptive modes require Go or Super. See pricing.

How long does setup take?

Under 10 minutes. Create an account, generate an API key from your dashboard, and make your first API call. The Quickstart walks through it step by step.

Is there an MCP server?

Yes. ejentum-mcp exposes the four harnesses as eight MCP tools (a dynamic and an adaptive variant each) any agentic client can call (Claude Desktop, Cursor, Windsurf, Claude Code, n8n's MCP Client node, custom Python or TS agents). Two install paths: connect to the hosted endpoint at https://api.ejentum.com/mcp (Bearer auth via your EJENTUM_API_KEY), or run the stdio package directly with npx -y ejentum-mcp configured in your client's MCP server settings. Also listed on Glama, mcp.so, and the Official MCP Registry. For richer autonomous routing in Claude Code specifically, install the skill files alongside the MCP server. See the MCP guide for the full setup.

Understanding Ejentum

How is this different from just writing a better system prompt?

System prompts are static, monolithic, and degrade at scale. Ejentum replaces a 5,000-token system prompt with a compact, structured payload injected dynamically at runtime.

The key difference: the payload contains both amplification signals (what to think about) and suppression signals (what failure modes to block). A system prompt says "be careful." A suppression signal says "reject any output that exhibits symptom_treatment_bias." In our testing, this produces measurably sharper reasoning.

Is this just RAG?

No. RAG retrieves information: documents, facts, data chunks. The agent still decides how to reason about that information using whatever patterns it learned during training.

RA²R (Reasoning Ability-Augmented Retrieval) retrieves reasoning abilities: a structured payload that governs how the agent thinks. See Concepts for the full explanation.

Does this replace my LLM?

No. We are not a model. We are not a model provider.

You bring the engine (OpenAI, Anthropic, Mistral, Llama, or any instruction-following LLM). Ejentum provides the reasoning abilities that govern how that engine thinks. what patterns to follow, what failure modes to avoid. The system is model-agnostic. The cognitive operations (suppression, amplification, falsification checkpoints) work with any instruction-following model.

How does this interact with chain-of-thought prompting?

They are complementary, not competing. Chain-of-thought tells the model to show its reasoning steps. Ejentum tells the model which reasoning steps to take and which failure modes to avoid during those steps.

Without Ejentum: "Let's think step by step" and the model picks whatever steps feel statistically probable.

With Ejentum: "Let's think step by step" + suppression of surface_level_stop + amplification of depth_first_root_search. The model's steps go deeper and avoid premature conclusions.

Use both. Ejentum harnesses the reasoning. Chain-of-thought makes it visible.

Does this work with fine-tuned models?

Yes. The injection operates at the prompt level, not the model level. Any model that accepts a system prompt. including fine-tuned GPT-4, Llama variants, Mistral, and domain-specific models. is expected to respond to the structured reasoning payloads. Effectiveness scales with instruction-following capability; models with stronger instruction compliance produce sharper suppression response.

Fine-tuned models may respond more effectively because their base weights already encode domain knowledge. Ejentum adds the reasoning abilities on top.

Compatibility

Does this work with my framework?

If your framework can make an HTTP POST request and inject text into a prompt, it works with Ejentum. We have specific guides for n8n, LangChain, CrewAI, Claude Code, and agentic IDEs. No SDK required. For agent environments, we provide ready-to-use skill files for each product layer.

What about latency?

The ability retrieval happens before your agent generates a single token. In dynamic mode the pipeline runs in under one second with no LLM call and zero inference cost; adaptive mode adds an adapter-model pass (roughly 2-3 seconds). The injection adds approximately 400-600 tokens in dynamic mode; adaptive is larger because the procedure and topology are rewritten with task-specific detail.

What if my agent doesn't need all four harnesses?

All four harnesses are available on every tier, including the free trial: reasoning (311), anti-deception (139), code (128), and memory (101) abilities. You pay for call volume, not for which harness you use. Start with the Go plan (€5/month: 1,000 dynamic + 250 adaptive calls). For production volume, Super (€25/month) gives 5,000 dynamic + 1,500 adaptive.

What is the difference between dynamic and adaptive?

Dynamic returns the best-matching cognitive operation as-is: one operation per query, the highest-scoring match. Adaptive returns the same operation with its procedure and reasoning topology rewritten by an adapter model to name your task's specifics; the failure guard, suppression signals, and verification check stay identical. Adaptive costs more compute per call and draws from a separate monthly pool, so it requires the Go or Super tier.

Spanning two dimensions? Call two harnesses (e.g. anti-deception + code), inject both responses, and let the answer be shaped by both. See skill_unified for the stacking pattern.

Using It Well

How do I know which ability my query will trigger?

You don't need to. The hybrid search engine retrieves the optimal ability automatically within your chosen mode. You send a natural language task, and the API returns the structured payload your agent needs.

The response includes metadata that tells you which ability was selected. Log it to help debug routing over time.

What if the router picks the wrong ability?

Three approaches:

Improve your query. The retrieval engine uses hybrid semantic + lexical search. Vague queries produce vague results. Send the full task description, not a summary. Include domain-specific keywords that signal the dimension you need.
Inspect the response. The response metadata shows which ability was selected. If a causal ability fires when you expected a temporal one, your query emphasizes causal language over temporal language.
Rewrite and compare. Send two versions of your query and compare the returned abilities. The delta tells you which keywords are driving the routing.

The router does not have a manual override. This is by design. If you already know which ability you need, you don't need automatic routing. The value is in the automatic matching for queries where the right reasoning mode is not obvious.

See the query optimization guide for concrete examples.

How accurate is the retrieval?

Retrieval precision scales with query specificity. Explicit queries ("why did X cause Y after Z?") match with high confidence across Causal and Temporal dimensions simultaneously. Vague one-liners may return a general-purpose ability.

Best practice: send the full task description, not a condensed summary. The hybrid search engine matches your query both semantically and lexically.

What happens when my query is ambiguous or spans multiple dimensions?

The hybrid search evaluates your query within the selected harness. For ambiguous queries, the highest-scoring ability wins. Sending your agent's full task description rather than raw user input produces significantly sharper retrieval.

How do I handle multi-turn conversations?

Re-inject per turn. Each turn in a multi-turn conversation may require different reasoning abilities. Turn 1 might need root-cause analysis. Turn 2 might need temporal forecasting. Turn 3 might need metacognitive self-monitoring.

Call the Ejentum API with the current turn's task description, not the conversation history. Per-turn injection keeps the reasoning context fresh.

For long conversations (10+ turns), re-injection keeps the injection fresh. However, injections resist decay more than plain instructions: on ARC-AGI-3 (25 sequential game actions), injection language persisted with a half-life of 24 steps and reasoning quality improved over time instead of degrading. Re-injection is still recommended per turn for optimal results, but a single injection does not vanish after 5 turns.

What is the "lost in the middle" problem and how does Ejentum address it?

LLMs attend most strongly to content at the beginning and end of their context window (Liu et al., 2023). Ejentum addresses this with a compressed structured payload (less middle to get lost in), structured [REASONING CONTEXT] delimiters (distinct attention anchor), and injection at the START of the system message. The Cognitive Scaffolding Thesis models why: the injection's DAG notation occupies a structurally unique register in token space, receiving disproportionate attention weight that resists decay as task-specific tokens accumulate.

Reliability and Trust

Can I see all 679 abilities?

Yes. The Abilities catalog contains the full reference: every ability, its cognitive operation, its dimension scores, and its synergy connections.

What happens if the API is down?

Your agent continues functioning on native LLM reasoning. it just loses the reasoning injection. Ejentum is an enhancement layer, not a critical-path dependency. We architect for graceful degradation. The API runs on a global edge network, targeting 99.9% availability.

Is my query data stored?

All traffic is logged for audit and debugging purposes at the gateway level. Query content is not used for training or shared with third parties.

How do I measure if Ejentum is actually improving my agent?

Follow our Evaluate guide. In short: run 20+ tasks with and without injection, score on 4 signals (correctness, self-monitoring, verification, depth). In our benchmarks across 250 reasoning tasks, self-monitoring improved +132%, verification +85%, and correctness held steady or improved across 10 professional domains. On hard competitive programming (LiveCodeBench Hard), the harness improved Opus 4.6 from 85.7% to 100% pass rate on 28 hard AtCoder tasks, with zero regressions. On interactive multi-step tasks (ARC-AGI-3), injection persistence was measurable across 25-step chains. Your results will vary, but the methodology is reproducible in one afternoon. If you want the automated version: the open-source n8n eval workflow does this end-to-end with four cross-lab blind judges and a deterministic aggregator.

Does it work on coding tasks?

Yes. On 28 hard competitive programming tasks from LiveCodeBench, the harness improved Claude Opus 4.6 (max effort extended thinking) from 85.7% to 100% pass rate. It rescued two reasoning spirals and prevented one premature algorithm commitment. A blind evaluator confirmed the injection never loses on correctness or robustness, and when it matters, it matters 3.5x more. The harness works best on reasoning-bound coding tasks where the model has the domain knowledge but struggles to organize its thinking.

What can't Ejentum do?

Honest boundaries:

It does not add domain knowledge. If your agent fails because it lacks information (no access to your database, missing context), use RAG. Ejentum improves how the agent reasons about information it already has.
It operates at the prompt level. It does not modify model weights, activations, or fine-tuning. It structures the prompt in ways that measurably improve reasoning output.
Retrieval precision depends on query quality. Very short or ambiguous queries may not retrieve the optimal ability. Send the full task description for best results.
Suppression reduces failure rates; it does not guarantee zero failures. LLMs are probabilistic. Ejentum steers the model away from specific failure modes, but cannot enforce absolute constraints.

More resources: Use Cases (13 industry verticals) · Glossary · Benchmarks · Builder's Playbook

Documentation

FAQ