# Legal AI Assistant: A Legal Intelligent Q&A System Based on Agentic RAG

> An intelligent Q&A system for legal professionals that integrates the LangGraph agent architecture, FAISS vector retrieval, and the Llama 3.3 70B large language model to enable accurate legal document retrieval and structured answer generation.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-16T02:19:58.000Z
- Last activity: 2026-05-16T02:34:24.617Z
- Popularity: 159.8
- Keywords: RAG, LangGraph, Legal AI, FAISS, Llama 3.3, Intelligent Q&A, Agentic RAG, vector retrieval
- Page URL: https://www.zingnex.cn/en/forum/thread/legal-ai-assistant-agentic-rag
- Canonical: https://www.zingnex.cn/forum/thread/legal-ai-assistant-agentic-rag
- Markdown source: floors_fallback

---

## [Main Floor] Legal AI Assistant: Introduction to the Agentic RAG-Based Legal Intelligent Q&A System

Legal AI Assistant is an intelligent Q&A system for legal professionals. It integrates the LangGraph agent architecture, FAISS vector retrieval, and the Llama 3.3 70B large language model to address the weaknesses of general-purpose AI in legal scenarios, such as frequent hallucinations and insufficient accuracy. Built as a graduation project for ITI's Generative AI course, it demonstrates a complete tech stack from prompt engineering to agent evaluation, giving legal practitioners a practical intelligent research assistant.

## Project Background and Motivation

In legal practice, lawyers and law students need to quickly retrieve case precedents and analyze contract clauses. However, general AI lacks in-depth understanding of specific legal documents and is prone to hallucinations or inaccurate suggestions. This project aims to address this pain point by building a truly usable legal intelligent assistant through a complete tech stack.

## System Architecture and Workflow

**Core Components**: Llama 3.3 70B (Groq API), BAAI/bge-base-en-v1.5 embedding model, FAISS vector database, LangGraph agent framework, PyMuPDF+LangChain document loading, Gradio UI.
**Workflow**: User query → input cleaning → vectorization → FAISS retrieves the top-4 documents → sufficiency check → generate a structured answer if the information is sufficient; otherwise expand the query, or flag the question as outside the knowledge base.
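The workflow above can be sketched as a minimal agent loop in plain Python. This is a hypothetical illustration of the control flow only: `retrieve`, `is_sufficient`, `expand`, and `generate` are toy stand-ins for the project's LangGraph nodes, FAISS search, and Llama 3.3 calls.

```python
def answer_query(query, retrieve, is_sufficient, expand, generate, max_expansions=2):
    """Minimal agentic-RAG loop: retrieve, check sufficiency, expand or answer."""
    cleaned = query.strip()                      # stands in for input cleaning
    for _ in range(max_expansions + 1):
        docs = retrieve(cleaned, k=4)            # stands in for FAISS top-4 retrieval
        if is_sufficient(cleaned, docs):
            return {"status": "answered", "answer": generate(cleaned, docs)}
        cleaned = expand(cleaned)                # broaden the query and retry
    return {"status": "out_of_knowledge_base", "answer": None}

# Toy components illustrating the control flow (not real embedding retrieval):
corpus = {
    "force majeure": "Force majeure clauses excuse non-performance after unforeseeable events.",
    "non-compete": "Non-compete clauses restrict post-employment activities.",
}

def retrieve(q, k=4):
    return [text for topic, text in corpus.items() if topic in q.lower()][:k]

result = answer_query(
    "What does a force majeure clause do?",
    retrieve,
    is_sufficient=lambda q, docs: bool(docs),
    expand=lambda q: q,
    generate=lambda q, docs: docs[0],
)
```

A query with no matching documents exhausts the expansion budget and is flagged `out_of_knowledge_base`, mirroring how the system admits questions it cannot answer.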

## Knowledge Base Construction

Covers 10 typical legal scenarios (force majeure clauses, non-compete agreements, etc.). Data source strategy: real case precedents (CourtListener), real contracts (SEC EDGAR/LawInsider), synthetic documents (to supplement topics lacking public clean PDFs)—balancing authenticity and comprehensive coverage.
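Once the documents are embedded, retrieval reduces to nearest-neighbor search over the vectors. The pure-Python sketch below shows the top-k cosine-similarity search that FAISS performs at scale; in the real system the vectors would come from BAAI/bge-base-en-v1.5 and the search would run against a FAISS index rather than a Python list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

For example, with a query vector `[1, 0]` and document vectors `[[1, 0], [0, 1], [0.9, 0.1]]`, the two nearest documents are indices 0 and 2.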

## Evaluation Results and Key Findings

**Performance Metrics**: faithfulness 0.63/1.0, task success rate 100%, average response latency 1.84 seconds, total API cost $0.004, hallucination flags 2/10 (the system correctly identifies out-of-knowledge-base questions).
**Key Findings**: Intellectual property ownership and SaaS auto-renewal issues scored low because the knowledge base has no relevant documents. The system can honestly admit its limitations, reflecting responsible design.
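Headline numbers like those above can be computed by aggregating per-question evaluation records. The sketch below assumes a hypothetical record schema (`faithfulness`, `success`, `latency_s`, `out_of_kb`); the project's actual evaluation harness and LLM-based faithfulness scorer are not shown here.

```python
def summarize_eval(results):
    """Aggregate per-question eval records into headline metrics."""
    n = len(results)
    return {
        "faithfulness": round(sum(r["faithfulness"] for r in results) / n, 2),
        "success_rate": sum(1 for r in results if r["success"]) / n,
        "avg_latency_s": round(sum(r["latency_s"] for r in results) / n, 2),
        "hallucination_flags": sum(1 for r in results if r["out_of_kb"]),
    }

# Two toy records; real runs would cover all 10 knowledge-base topics.
records = [
    {"faithfulness": 0.8, "success": True, "latency_s": 1.5, "out_of_kb": False},
    {"faithfulness": 0.4, "success": True, "latency_s": 2.5, "out_of_kb": True},
]
summary = summarize_eval(records)
```

Tracking `hallucination_flags` separately from `success_rate` is what lets an out-of-knowledge-base refusal count as a success rather than a failure.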

## Security and Ethical Considerations

**Technical Security**: API keys stored in Colab Secrets, input cleaning to prevent prompt injection, mandatory addition of lawyer disclaimer.
**Data Compliance**: Only uses public domain/synthetic documents, no real client data, and sources are traceable.
**Usage Boundaries**: For research assistance only; legal decisions require consultation with a licensed lawyer.
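The input-cleaning and mandatory-disclaimer steps might look like the sketch below. This is a minimal illustration under stated assumptions: the injection-phrase list, length cap, and disclaimer wording are all hypothetical, not the project's actual rules.

```python
import re

# Hypothetical deny-list of common prompt-injection phrases.
SUSPICIOUS = re.compile(
    r"(ignore (all )?previous instructions|disregard the system prompt|you are now)",
    re.IGNORECASE,
)

DISCLAIMER = ("Note: this is research assistance, not legal advice; "
              "consult a licensed lawyer before making legal decisions.")

def clean_input(query: str, max_len: int = 2000) -> str:
    """Strip non-printable characters, drop injection phrases, cap the length."""
    text = "".join(ch for ch in query if ch.isprintable() or ch in "\n\t")
    text = SUSPICIOUS.sub("", text)
    return text[:max_len].strip()

def finalize_answer(answer: str) -> str:
    """Append the mandatory lawyer disclaimer to every generated answer."""
    return f"{answer}\n\n{DISCLAIMER}"
```

Deny-list filtering is only a first line of defense; it complements, rather than replaces, keeping API keys in Colab Secrets and constraining the model with a strict system prompt.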

## Technical Highlights and Insights

1. **Value of Agentic RAG**: independently judges information sufficiency and proactively expands queries to improve answer quality.
2. **Cost-effectiveness**: completes complex scenario tasks at minimal API cost.
3. **Evaluation-driven development**: LLM-based faithfulness scoring guides iteration.
4. **Knowledge-boundary management**: honestly flags unanswerable questions to avoid hallucination.

## Applicable Scenarios and Limitations

**Applicable Scenarios**: Legal student case study assistance, lawyer contract clause retrieval, preliminary legal knowledge screening, legal education demonstration.
**Current Limitations**: Small knowledge base size (only 10 topics), only supports English documents, relies on Google Colab environment, insufficient support for complex multi-hop reasoning.
