# Agentic RAG in Practice: Building an Intelligent Retrieval System Integrating Semantic Search and Lexical Ranking

> This article examines the design and implementation of a production-grade RAG system that combines agentic decision-making, vector semantic retrieval, and BM25 lexical ranking. The system fuses the two rankings with Reciprocal Rank Fusion, aiming for high-precision retrieval over complex multi-domain documents.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-04-20T16:53:33.000Z
- Last activity: 2026-04-20T17:19:17.909Z
- Popularity: 154.6
- Keywords: RAG, Agentic AI, semantic search, BM25, hybrid retrieval, Reciprocal Rank Fusion, vector database, Claude, VoyageAI, agents
- Page link: https://www.zingnex.cn/en/forum/thread/agentic-rag
- Canonical: https://www.zingnex.cn/forum/thread/agentic-rag

---

## Agentic RAG in Practice: A Guide to an Intelligent Retrieval System Integrating Semantic Search and Lexical Ranking

This article introduces the design and implementation of a production-grade RAG system that integrates agentic decision-making, vector semantic retrieval, and BM25 lexical ranking. It merges the two result lists through Reciprocal Rank Fusion (RRF), which addresses the single-strategy limitation of traditional RAG and targets high-precision retrieval over complex multi-domain documents. The core architecture consists of an intelligent decision layer, a dual-path retrieval layer, and a fusion ranking layer. Together they let the Claude model decide on its own when to retrieve and which strategy to use, and they adapt to cross-domain queries in areas such as medicine and finance.

## Background: Limitations of Traditional RAG

In LLM applications, traditional RAG is the standard way to address stale knowledge and hallucinations, but it falls short in complex scenarios. A single retrieval strategy struggles to balance exact matching against semantic understanding, the retrieval and generation stages do not interact dynamically, and recall is insufficient for queries that span documents from several domains. A more flexible, integrated RAG architecture is therefore needed.

## System Architecture: Three-Layer Intelligent Design

The system's core architecture is divided into three layers:
1. **Intelligent Decision Layer**: Driven by Claude Sonnet 4.6. The model itself judges whether to retrieve, which strategy to use, and whether to refine the query over multiple rounds, which avoids unnecessary retrieval overhead.
2. **Dual-Path Retrieval Layer**: The semantic path uses VoyageAI's voyage-3-large embedding model and matches by cosine similarity or Euclidean distance. The lexical path uses the BM25 algorithm for exact keyword matching. The two paths complement each other.
3. **Fusion Ranking Layer**: Merges the results with the RRF algorithm, which reconciles the ranking differences between the strategies and yields a more robust final ranking.
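
To make the fusion ranking layer concrete, here is a minimal sketch of Reciprocal Rank Fusion. It assumes each retrieval path returns an ordered list of document IDs and that each document's fused score is the sum of `1 / (k + rank)` over the lists it appears in, with `k = 60`. The function name `reciprocal_rank_fusion` and the document IDs are hypothetical and not taken from the project.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document IDs (best first, ranks start at 1)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes the reciprocal of (k + rank) for the document.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrieval paths.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
lexical_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that rank well on both paths (`doc_b` and `doc_a` here) rise above documents that appear on only one path. Raw similarity and BM25 scores never enter the computation.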

## In-Depth Analysis of Technical Implementation

Technical details include:
- **Vector Index and Semantic Retrieval**: A custom VectorIndex class with adjustable parameters, built on the voyage-3-large embedding model, with support for batch embedding and dimension verification. Three chunking strategies are available: fixed length, semantic boundary, and recursive character.
- **BM25 Lexical Retrieval**: The k1 parameter (term-frequency saturation) and the b parameter (document-length normalization) are adjustable, and custom tokenizers are supported, for example for Chinese text or source code.
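- **RRF Mathematical Principle**: A document's fused score is the sum, over all retrieval strategies, of the reciprocal of its offset rank: score(d) = Σ 1/(k + rank_i(d)), where rank_i(d) is the document's rank under strategy i and k is a smoothing constant that usually takes the value 60. Because only ranks are used, the raw scores of the strategies need no normalization, which makes the fusion robust.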
- **Agentic Query Flow**: Claude analyzes the query → determines the retrieval strategy → executes retrieval → evaluates the results → refines over further rounds if needed. This improves answer quality for complex questions.
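
The role of k1 and b in the BM25 bullet above can be illustrated with a small self-contained scorer. This is a sketch rather than the project's code: the class name `BM25Index`, the defaults `k1=1.5` and `b=0.75` (commonly used values, not stated in the article), the whitespace tokenizer, and the sample documents are all assumptions. The IDF term uses the smoothed, non-negative variant `log(1 + (N - df + 0.5) / (df + 0.5))`, and the index expects a non-empty document list.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def default_tokenizer(text: str) -> List[str]:
    """Lowercase whitespace tokenizer; swap in another one per language."""
    return text.lower().split()


class BM25Index:
    def __init__(
        self,
        documents: List[str],
        k1: float = 1.5,   # higher k1 lets repeated terms keep adding score
        b: float = 0.75,   # b=0 disables length normalization, b=1 applies it fully
        tokenizer: Callable[[str], List[str]] = default_tokenizer,
    ) -> None:
        self.k1 = k1
        self.b = b
        self.tokenizer = tokenizer
        self.term_counts = [Counter(tokenizer(doc)) for doc in documents]
        self.doc_lengths = [sum(counts.values()) for counts in self.term_counts]
        self.avg_length = sum(self.doc_lengths) / len(documents)
        # Document frequency: in how many documents each term occurs.
        self.doc_freq: Counter = Counter()
        for counts in self.term_counts:
            self.doc_freq.update(counts.keys())

    def idf(self, term: str) -> float:
        n = len(self.term_counts)
        df = self.doc_freq.get(term, 0)
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def score(self, query: str, doc_index: int) -> float:
        counts = self.term_counts[doc_index]
        length_norm = 1 - self.b + self.b * self.doc_lengths[doc_index] / self.avg_length
        total = 0.0
        for term in self.tokenizer(query):
            tf = counts.get(term, 0)
            if tf:
                total += self.idf(term) * tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
        return total

    def rank(self, query: str) -> List[Tuple[int, float]]:
        scored = [(i, self.score(query, i)) for i in range(len(self.term_counts))]
        return sorted(scored, key=lambda pair: -pair[1])


documents = [
    "BM25 ranks documents by keyword overlap",
    "Vector embeddings capture semantic similarity",
    "Hybrid retrieval fuses BM25 and vector rankings",
]
index = BM25Index(documents)
ranked = index.rank("bm25 keyword")
print(ranked)
```

Passing a different `tokenizer` is the extension point for other content types. A character-level tokenizer such as `list` is one crude option for Chinese text, though a proper word segmenter would likely work better.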

## Application Scenarios and Practical Effects

The project's test documents cover 10 domains, including medicine, software engineering, and finance, to simulate an enterprise knowledge base with many document types. For example, the cross-domain query "Financial impact and security risks of the XDR-471 project" requires combining knowledge from several domains. The Streamlit interface shows the system's decision-making process (whether to retrieve, which path to use, how results are ranked, and so on), which makes debugging more transparent.
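
The decision process that the interface exposes can be pictured as a loop that records a trace of each step. The sketch below is illustrative only: `Decision`, `run_agentic_query`, and `rule_based_decide` are hypothetical names, the rule-based function stands in for the call to Claude, and the retrievers return made-up passages instead of querying a real index.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Decision:
    action: str          # "retrieve" or "answer"
    strategy: str = ""   # which retrieval path to use when retrieving
    query: str = ""      # the (possibly refined) query text


def run_agentic_query(
    question: str,
    decide: Callable[[str, List[str]], Decision],
    retrievers: Dict[str, Callable[[str], List[str]]],
    max_rounds: int = 3,
) -> Tuple[List[str], List[str]]:
    """Alternate between deciding and retrieving; return evidence and a trace."""
    evidence: List[str] = []
    trace: List[str] = []
    for round_number in range(1, max_rounds + 1):
        decision = decide(question, evidence)
        if decision.action == "answer":
            trace.append(f"round {round_number}: answer with {len(evidence)} passages")
            break
        hits = retrievers[decision.strategy](decision.query)
        for hit in hits:
            if hit not in evidence:  # skip passages already collected
                evidence.append(hit)
        trace.append(
            f"round {round_number}: {decision.strategy} retrieval "
            f"for '{decision.query}' returned {len(hits)} hits"
        )
    return evidence, trace


def rule_based_decide(question: str, evidence: List[str]) -> Decision:
    """Stand-in for the model: exact match on the project code first, then semantics."""
    if not evidence:
        return Decision("retrieve", "lexical", "XDR-471")
    if len(evidence) < 2:
        return Decision("retrieve", "semantic", question)
    return Decision("answer")


# Made-up passages; a real system would query the BM25 and vector indexes.
retrievers = {
    "lexical": lambda query: ["finance: XDR-471 budget memo"],
    "semantic": lambda query: ["security: XDR-471 risk review"],
}

evidence, trace = run_agentic_query(
    "Financial impact and security risks of the XDR-471 project",
    rule_based_decide,
    retrievers,
)
for line in trace:
    print(line)
```

A trace like this is the kind of record a frontend could render step by step, so that each retrieval decision is visible during debugging.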

## Deployment and Expansion Recommendations

Deployment dependencies are lightweight (Python 3.9+, the Chroma vector database, and a Streamlit frontend), so the system is easy to deploy on a single server or workstation. Possible directions for expansion:
1. Retrieval path expansion (knowledge graph structured retrieval, metadata filtering);
2. Agentic strategy evolution (decomposing complex problems into sub-queries for parallel retrieval);
3. Introducing caching mechanisms (caching results for high-frequency queries to reduce API costs).
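
The caching idea in the third item could look like the following sketch: an in-memory cache keyed on a normalized query string, with a time-to-live so stale results expire. The class name `QueryCache`, the 300-second TTL, and the `fake_search` function are assumptions for illustration. A real deployment might prefer a shared store, and `fake_search` stands in for the embedding and retrieval calls that incur API costs.

```python
import hashlib
import time
from typing import Callable, Dict, List, Tuple


class QueryCache:
    def __init__(
        self,
        fetch: Callable[[str], List[str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different queries share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str) -> List[str]:
        key = self._key(query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Missing or expired: call the expensive function and store the result.
        self.misses += 1
        result = self._fetch(query)
        self._entries[key] = (now, result)
        return result


calls: List[str] = []


def fake_search(query: str) -> List[str]:
    """Stand-in for the real retrieval pipeline; records each invocation."""
    calls.append(query)
    return [f"result for {query.lower()}"]


cache = QueryCache(fake_search)
first = cache.get("BM25   tuning")
second = cache.get("bm25 tuning")   # same key after normalization
print(len(calls), cache.hits, cache.misses)
```

Normalizing before hashing trades a small risk of conflating distinct queries for a higher hit rate. Whether that trade is acceptable depends on how case-sensitive the underlying documents are.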

## Summary and Outlook

This project demonstrates the key features of next-generation RAG: a shift from passive retrieval to active decision-making, from a single strategy to the fusion of several, and from a black-box process to a transparent, observable one. It is a useful reference for enterprise knowledge base question-answering systems. Future directions include deeper agentic capabilities, multi-modal retrieval (text, images, and tables), and real-time learning updates.