Zing Forum

Aegis: An LLM Intelligent Routing and Hallucination Detection Gateway Based on Causal Inference

Aegis is a production-grade LLM gateway that automatically routes prompts to the most cost-effective model via a complexity classifier and uses causal inference technology to detect hallucinations without requiring ground truth labels. The system integrates semantic caching, multi-level risk detection, and real-time cost monitoring, providing a safe and cost-effective LLM invocation solution for high-stakes scenarios.

Tags: LLM Gateway · Causal Inference · Hallucination Detection · Intelligent Routing · Cost Optimization · Semantic Caching · Production Systems · DoWhy · Security Gateway
Published 2026-04-10 04:39 · Recent activity 2026-04-10 04:51 · Estimated read 8 min
Section 01

Introduction / Main Floor

Section 02

Dual Challenges of LLM Applications in Production Environments

When enterprises use large language models in production environments, they face two intertwined challenges. First is cost waste: simple queries are often routed to high-end models like GPT-4o, while in reality, Llama 3.1 (free) or Gemini Flash ($0.075 per 1M tokens) can provide answers of the same quality. Second is silent hallucination: LLMs produce confident, fluent but incorrect answers, which can lead to serious consequences in high-risk scenarios such as healthcare, law, and finance.

The Aegis project is designed to address these two issues. It is an end-to-end production system that not only implements intelligent routing but also, more importantly, introduces causal inference technology to detect hallucinations—without requiring ground truth labels as a reference.

Section 03

Core Architecture: More Than Just Routing

Routing solutions on the market (such as OpenRouter, LiteLLM) are already quite mature, and by 2026 this will become a commoditized feature. Aegis's differentiating value lies in its causal hallucination detection mechanism.

Traditional fact-checking methods require knowing the "correct answer" to judge whether a model's output is accurate, but in production we often have no such reference. Aegis instead asks an ingenious causal question: if only the wording of the question is changed, does the model's factual claim change?

If the model gives different answers to different formulations of the same question, this is a causal signal—the statement is not based on knowledge but on the surface features of the prompt. This is called a do(X) intervention in causal inference. This method does not require labels, ground truth answers, or external knowledge bases.

Section 04

Five-Level Routing and Cost Optimization

Aegis implements a five-level model routing based on complexity scoring:

| Level | Model | Cost per 1M tokens | Applicable Scenarios |
|---|---|---|---|
| Free | Llama 3.1 8B (local Ollama) | $0.00 | Simple fact queries, conversations |
| Economic | Gemini 1.5 Flash | $0.075 | Low-to-medium complexity |
| Standard | GPT-4o-mini | $0.150 | Medium complexity |
| Premium | Claude 3.5 Haiku | $0.250 | Medium-to-high complexity, detailed reasoning |
| High-end | GPT-4o | $2.500 | Complex reasoning, high-risk scenarios |

The complexity classifier uses a four-factor weighted score: semantic embedding norm (30%), text structure score (25%), question type score (25%), and domain keyword density (20%). The score ranges from 0.0 to 1.0, automatically routing to the most cost-effective capable model.

For the legal, medical, and financial domains, the system enforces a hard gate: regardless of the complexity score, queries are routed to GPT-4o, and this rule cannot be overridden by the classifier.
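The weighted score and hard gate above can be sketched in a few lines of Python. This is a minimal illustration, not Aegis's actual API: the function names, the tier boundaries, and the 0.0–1.0 scaling of each factor are assumptions, since the article gives only the weights, the model table, and the domain override.

```python
HARD_GATE_DOMAINS = {"legal", "medical", "financial"}  # always routed to GPT-4o

# (upper complexity bound, model) pairs, cheapest first; boundaries are assumed
TIERS = [
    (0.2, "llama-3.1-8b"),      # Free
    (0.4, "gemini-1.5-flash"),  # Economic
    (0.6, "gpt-4o-mini"),       # Standard
    (0.8, "claude-3.5-haiku"),  # Premium
    (1.0, "gpt-4o"),            # High-end
]

def complexity_score(embedding_norm, structure, question_type, keyword_density):
    """Four-factor weighted score in [0, 1]; each input is assumed pre-scaled to [0, 1].

    Weights follow the article: 30% / 25% / 25% / 20%.
    """
    score = (0.30 * embedding_norm
             + 0.25 * structure
             + 0.25 * question_type
             + 0.20 * keyword_density)
    return min(max(score, 0.0), 1.0)

def route(score, domain=None):
    """Pick the cheapest capable tier; hard-gated domains override the score."""
    if domain in HARD_GATE_DOMAINS:
        return "gpt-4o"
    for upper, model in TIERS:
        if score <= upper:
            return model
    return "gpt-4o"
```

For example, a simple query scoring 0.1 lands on the free local model, while the same query tagged as medical is forced to GPT-4o before the tier table is ever consulted.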

Section 05

Level 1: Hedge Phrase Detection (Free, Full-Scale Operation)

The system scans responses for 25 confidence-weakening phrases, such as "I'm not sure", "I think", "maybe", and "as far as I know". Detecting three or more marks the response as a potential hallucination (medium risk). This check is zero-cost and runs on every request from every provider.
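The check above is a simple substring count. A minimal sketch, using only a subset of the 25 phrases the article mentions (the full list is not given, and the function name is illustrative):

```python
# Subset of the confidence-weakening phrase list; Aegis's full list has 25 entries.
HEDGE_PHRASES = [
    "i'm not sure", "i think", "maybe", "as far as i know",
    "it's possible", "i believe", "probably", "it seems",
]

def hedge_risk(response: str) -> str:
    """Count hedge-phrase occurrences; 3+ hits flag the response as medium risk."""
    text = response.lower()
    hits = sum(text.count(phrase) for phrase in HEDGE_PHRASES)
    return "medium" if hits >= 3 else "low"
```

Because it is pure string matching with no model calls, this level can run on every response at effectively zero cost.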

Section 06

Level 3: Rewrite Variance Detection (Conditionally Triggered)

When the query belongs to the legal/medical/financial domain, or the complexity score exceeds 0.7, the system triggers deep detection:

  1. Use GPT-4o-mini to generate two different formulations of the same question
  2. Send three versions of the question (original + two rewrites) to the target model in parallel
  3. Calculate the average cosine similarity of the embedding vectors of the three responses
  4. Variance = 1 - average similarity

If the variance exceeds the threshold θ = 0.35, the response is marked as a high-risk hallucination. This threshold is not arbitrary: it is calibrated offline with the DoWhy library, and its causal validity is confirmed through placebo-treatment refutation tests.
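The variance computation in steps 3–4 can be sketched as follows. The embedding step is stubbed out here (in Aegis it comes from the shared sentence-transformers model); the function names are illustrative, and only the variance formula and the θ = 0.35 threshold come from the article:

```python
import math
from itertools import combinations

THETA = 0.35  # calibrated offline via DoWhy, per the article

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rewrite_variance(embeddings):
    """variance = 1 - mean pairwise cosine similarity of the response embeddings
    (original question's response + two rewrites' responses)."""
    sims = [cosine(a, b) for a, b in combinations(embeddings, 2)]
    return 1.0 - sum(sims) / len(sims)

def is_high_risk(embeddings):
    return rewrite_variance(embeddings) > THETA
```

Three near-identical responses give a variance near 0; if one rewrite elicits a semantically different answer, the mean similarity drops and the variance crosses the threshold.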

Section 07

Risk Level Merging

The final risk level is the maximum of the domain risk and the detection risk: the legal and medical domains are inherently high risk and the financial domain is medium risk, while rewrite-variance detection triggers high risk and hedge-phrase detection triggers medium risk.
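The merge rule above is just a maximum over an ordered scale. A minimal sketch (the names and the three-level ordering are assumptions drawn from the rule as stated):

```python
LEVELS = ["low", "medium", "high"]  # ordered from least to most severe

# Domain baselines per the article: legal/medical are high, financial is medium.
DOMAIN_RISK = {"legal": "high", "medical": "high", "financial": "medium"}

def merge_risk(domain, detection_risk="low"):
    """Final risk = max(domain risk, detection risk) on the ordered scale."""
    domain_risk = DOMAIN_RISK.get(domain, "low")
    return max(domain_risk, detection_risk, key=LEVELS.index)
```

So a financial query flagged only by hedge phrases stays medium, while the same query flagged by rewrite variance is escalated to high.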

Section 08

Semantic Caching: Zero-Cost Hits

Aegis implements an in-memory semantic cache based on sentence-transformers/all-MiniLM-L6-v2. The similarity threshold is set to 0.85 rather than 0.95, which yields a hit rate below 1% in practice. On a cache hit, the response time is about 5 milliseconds, at zero cost.

The embedding model instance is shared between the cache and the hallucination detector, avoiding repeated loading of approximately 90MB of model weights. The cache resets with server restart (a design choice for the demo environment).
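A semantic cache of this shape can be sketched as below. The class and method names are illustrative; the embedder is passed in as a callable so the same shared model instance the section describes can be injected (in Aegis this is the sentence-transformers model, here it is left abstract):

```python
import math

class SemanticCache:
    """In-memory cache keyed by embedding similarity rather than exact text."""

    def __init__(self, embed, threshold=0.85):
        self.embed = embed          # shared embedding callable (~90MB model in Aegis)
        self.threshold = threshold  # 0.85 rather than 0.95, per the article
        self.entries = []           # (embedding, response) pairs; reset on restart

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    def get(self, query):
        """Return the cached response whose query embedding is most similar,
        if that similarity clears the threshold; otherwise None (cache miss)."""
        if not self.entries:
            return None
        q = self.embed(query)
        best = max(self.entries, key=lambda e: self._cosine(q, e[0]))
        if self._cosine(q, best[0]) >= self.threshold:
            return best[1]  # zero-cost hit
        return None

    def put(self, query, response):
        self.entries.append((self.embed(query), response))
```

A linear scan over in-memory entries is enough for a demo-scale cache; a production deployment would swap in an approximate-nearest-neighbor index and a persistent store.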