# ARC AI: A Local-First Legal Assistant Reshaping Maryland Housing Rental Consultation with RAG and Multi-Model Architecture

> ARC AI is a fully locally-run Retrieval-Augmented Generation (RAG) system designed specifically to answer Maryland housing rental legal questions. It integrates 9 NLP analysis techniques, supports multi-model switching, and provides accurate legal information with transparent citations for both tenants and landlords.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-26T00:44:10.000Z
- Last activity: 2026-04-26T00:49:11.558Z
- Popularity: 154.9
- Keywords: RAG, Retrieval-Augmented Generation, Legal AI, Local LLM, Ollama, ChromaDB, NLP Analysis, Housing Rental, Maryland Law, Multi-Model Support
- Page link: https://www.zingnex.cn/en/forum/thread/arc-ai-rag
- Canonical: https://www.zingnex.cn/forum/thread/arc-ai-rag
- Markdown source: floors_fallback

---

## ARC AI Introduction: A Local-First Maryland Housing Rental Legal Assistant

ARC AI is a fully local Retrieval-Augmented Generation (RAG) system built for Maryland housing rental legal questions. It integrates 9 NLP analysis techniques, supports multi-model switching, serves open-source LLMs via Ollama, and processes all data on the user's machine to protect privacy. Answers are grounded in real official legal documents and carry transparent inline citations, giving both tenants and landlords accurate, reliable consultation while addressing two pain points: the high cost of traditional legal services and the tendency of general-purpose AI to hallucinate.

## Project Background and Core Positioning

Maryland's housing rental laws involve multi-layered state and county-level regulations, making it difficult for ordinary users to fully understand their rights and obligations. Traditional legal services are costly, and general AI chatbots tend to generate misinformation.

ARC AI is designed around the concept of "local-first, transparent, and trustworthy": the system runs entirely on the user's machine, serving models via Ollama and processing all data locally to protect privacy. Its RAG architecture grounds every answer in real Maryland official legal documents and attaches inline citations, so each piece of information can be traced back to its source.

## Technical Architecture and Multi-Model Support

### RAG Pipeline
After the user's question is embedded with MiniLM (384-dimensional vectors), the most relevant legal fragments are retrieved by cosine similarity from a persistent ChromaDB vector store. The question and retrieved fragments are then injected into a prompt template and sent to the local LLM to generate an answer. Documents are chunked at 500 tokens per chunk with a 75-token overlap to preserve contextual coherence.
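A minimal sketch of the chunking step described above. The chunk size and overlap come from the post; everything else is illustrative, and token counts are approximated by whitespace-separated words rather than a real tokenizer:

```python
def chunk_document(text: str, chunk_size: int = 500, overlap: int = 75) -> list[str]:
    """Split a document into overlapping chunks, approximating tokens by words."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # 425-word stride keeps 75 words of shared context
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and written to the ChromaDB collection; the overlap ensures a sentence that straddles a chunk boundary still appears whole in at least one chunk.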

### Multi-Model Switching
The default model is Llama 3.1 8B; users can switch to Mistral 7B or Qwen 2.5 7B from a front-end dropdown backed by a FastAPI endpoint. All models run locally via Ollama, so no API keys are needed.
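The switching mechanism can be as simple as a server-side registry mapping the dropdown value to an Ollama model tag. A pure-Python sketch (the model tags match those named in the post; the class and method names are hypothetical — the real system exposes this through a FastAPI endpoint):

```python
class ModelRegistry:
    """Maps front-end model choices to locally served Ollama model tags."""

    MODELS = {
        "llama": "llama3.1:8b",   # default
        "mistral": "mistral:7b",
        "qwen": "qwen2.5:7b",
    }

    def __init__(self) -> None:
        self.active = "llama"

    def switch(self, name: str) -> str:
        """Select a model; returns the tag that would be passed to Ollama."""
        if name not in self.MODELS:
            raise ValueError(f"unknown model: {name}")
        self.active = name
        return self.MODELS[name]

    @property
    def tag(self) -> str:
        return self.MODELS[self.active]
```

Because every tag refers to a locally pulled Ollama model, switching is just a string swap in the generation request, with no credentials involved.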

### 9 NLP Analysis Techniques
It integrates 9 NLP analysis techniques: intent classification (BART-large-MNLI), named entity recognition (spaCy + regex), topic modeling (LDA), extractive question answering (RoBERTa-base-SQuAD2), sentiment analysis (VADER + RoBERTa), text summarization (BART-large-CNN), keyword extraction (KeyBERT), readability scoring (Flesch-Kincaid), and emotion detection (DistilRoBERTa). Users can expand the analysis panel to gain deeper insight into each answer.
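As one concrete example from the list above, the Flesch-Kincaid grade level needs no model at all, only counts of sentences, words, and syllables. A sketch in pure Python (the vowel-group syllable counter is a crude heuristic, so scores are approximate):

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable count: number of vowel groups, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words)) - 15.59)
```

Running this over each generated answer gives the readability score shown in the analysis panel; legal prose with long sentences and polysyllabic terms scores a correspondingly higher grade level.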

## Data Sources: Coverage of State and County-Level Official Regulations

ARC AI's data sources cover Maryland state and major county-level official resources:
- Maryland Attorney General's Office Landlord-Tenant Dispute Guide
- State Department of Housing and Community Development (DHCD) Tenant and Landlord Affairs Information
- Montgomery County DHCA County-Level Manual and Tenant Rights Explanations
- Baltimore County Circuit Court Law Library Resources
- Baltimore City DHCD Renter Resources
- Prince George's County DHCD Tenant Resources and 2024 Rent Stabilization Act
- People's Law Library Community Legal Resources

The crawler uses a two-hop crawling strategy with a browser User-Agent and keyword filtering to capture as much of the official content as possible.
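The two-hop strategy amounts to: fetch a seed page, collect its links, and follow only those whose URL or anchor text matches housing-law keywords. A network-free sketch of that filtering decision (the keyword list and function name are illustrative; the real crawler additionally sends a browser User-Agent header when fetching):

```python
from urllib.parse import urljoin, urlparse

KEYWORDS = ("tenant", "landlord", "rent", "lease", "eviction", "housing")


def should_follow(base_url: str, href: str, anchor_text: str = "") -> bool:
    """Decide whether a link found on a seed page deserves a second-hop fetch."""
    url = urljoin(base_url, href)
    if urlparse(url).netloc != urlparse(base_url).netloc:
        return False  # stay on the official domain
    haystack = (url + " " + anchor_text).lower()
    return any(kw in haystack for kw in KEYWORDS)
```

Restricting the second hop to on-domain, keyword-matching links keeps the corpus focused on official landlord-tenant material instead of drifting into unrelated government pages.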

## User Experience Design: Professional and Approachable Interface

The front-end uses a beige/terracotta color scheme, paired with Fraunces, Inter, and JetBrains Mono fonts, to create a professional and approachable visual experience. Core interactive features:
- Streaming token-by-token answer display, simulating natural conversation
- Answers include inline citation markers like [S1][S2], with clickable source capsules below to jump to the original legal documents
- Conversation history review and "New Conversation" quick reset function
- Expandable NLP analysis panel that does not interfere with the main interface
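The inline [S1][S2] markers can be tied to the clickable source capsules with a single regex pass over the generated answer. A sketch (the marker format is from the post; the function name is hypothetical):

```python
import re

MARKER = re.compile(r"\[S(\d+)\]")


def extract_citations(answer: str) -> list[int]:
    """Return the distinct source indices cited in an answer, in order of first use."""
    seen: list[int] = []
    for m in MARKER.finditer(answer):
        idx = int(m.group(1))
        if idx not in seen:
            seen.append(idx)
    return seen
```

The front end can then render one capsule per returned index, so only sources the model actually cited appear below the answer.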

The design fully considers the needs of legal consultation scenarios, balancing professionalism and ease of use.

## Limitations and Future Outlook

### Current Limitations
1. Coverage: Some state-level websites cannot be fully crawled due to heavy JavaScript or Cloudflare protection
2. Timeliness: The corpus is a point-in-time snapshot; laws change annually, so regular re-crawling is required
3. Model Hallucinations: Despite RAG grounding, misleading rewrites may still occur, which the extractive QA module helps mitigate
4. Hardware Requirements: Llama 3.1 8B needs approximately 5GB of memory, and pure CPU inference takes 10-30 seconds per answer

### Future Directions
- Implement county-specific retrieval routing based on user location
- Fine-tune models on Maryland legal Q&A pairs using QLoRA
- Let users upload PDF lease documents and answer questions that combine the lease with state law
- Provide cloud deployment options (Render/Fly.io + RunPod hosting for Ollama)

## Insights from AI Legal Assistants and Tech for Good Practices

ARC AI offers a reference point for AI legal applications: its RAG architecture and local deployment deliver accurate, traceable information while protecting privacy, and its multi-model support and NLP analysis panel balance technical depth with user experience. The open-source code and documentation also give developers a complete, learnable reference for the full technical stack.

The project embodies the concept of "Tech for Good": it lowers the threshold for legal services, allowing ordinary tenants and landlords to access credible legal information, and sets an example for AI application development that emphasizes privacy, transparency, and social value.
