Zing Forum

ATLAS: A RAG System Evaluation and Testing Framework for Humanities and Social Sciences Research

ATLAS, launched by the AI as Infrastructure project, is an LLM RAG system evaluation and testing framework specifically designed for the Humanities and Social Sciences (HASS) research field. It supports hybrid search, multiple LLM backends, and replaceable corpora.

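Tags: RAG, LLM, humanities and social sciences, historical research, retrieval-augmented generation, vector database, hybrid search, BM25, ChromaDB, FastAPI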
Published 2026-04-13 18:13 | Recent activity 2026-04-13 18:18 | Estimated read 6 min

Section 01

[Introduction] ATLAS: A RAG System Evaluation Framework Built for Humanities and Social Sciences Research

ATLAS is an LLM RAG system evaluation and testing framework launched by the AI as Infrastructure project at the Australian National University, designed specifically for Humanities and Social Sciences (HASS) research. It supports hybrid search, multiple LLM backends, and replaceable corpora. It targets a specific gap: general-purpose RAG evaluation methods struggle to meet the particular needs of HASS research.

Section 02

Project Background and Positioning

ATLAS stands for "Analysis and Testing of Language Models for Archival Systems" and is one of the core deliverables of the AI as Infrastructure (AIINFRA) project. Its goal is to provide an LLM RAG evaluation framework for historical research scenarios. Unlike general RAG tools, it is built around the specific demands of HASS work: large volumes of unstructured text (such as historical documents and parliamentary records) and very high requirements for retrieval accuracy and interpretability.

Section 03

Core Technical Architecture

Backend Tech Stack

The backend runs on Python 3.10 and FastAPI, a high-performance asynchronous web framework, and has been load-tested with 30 concurrent users. Chroma DB serves as the vector database and provides efficient similarity search.

Frontend Tech Stack

The frontend uses Vue 3 with Vite. Node.js is pinned to version 22.14.0 via .nvmrc to keep environments consistent.

Optional Components

ATLAS can optionally integrate OpenTelemetry (an observability framework) and Arize Phoenix (observability and evaluation for LLM applications).

Section 04

Detailed Explanation of Hybrid Search Mechanism

The core feature of ATLAS is hybrid search: it combines BM25 lexical retrieval (exact keyword matching) with dense vector retrieval (semantic matching) and fuses the two result lists with Reciprocal Rank Fusion (RRF). RRF requires no training data. It scores each document by a weighted sum of the reciprocals of its ranks in the individual result lists, so a document ranked highly by either retriever rises in the fused list. This balances precision and semantic depth and offsets the weaknesses of each method used alone: BM25's weak semantic capability and dense retrieval's tendency to miss key terms.
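
The fusion step can be sketched in a few lines of Python. This is a minimal illustration of weighted RRF under stated assumptions, not ATLAS's actual implementation: the function name, the document ids, and the smoothing constant k = 60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, weights=None):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum_i w_i / (k + rank_i), where rank_i is its
    1-based position in list i. Documents missing from a list simply
    receive no contribution from that list.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")

    scores = defaultdict(float)
    for weight, ranking in zip(weights, rankings):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)

    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers (ids are made up).
bm25_ranking = ["doc_tariff", "doc_defence", "doc_immigration"]
dense_ranking = ["doc_immigration", "doc_tariff", "doc_federation"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
# ['doc_tariff', 'doc_immigration', 'doc_defence', 'doc_federation']
```

Documents that appear in both lists (doc_tariff, doc_immigration) outrank those found by only one retriever, which is the behaviour hybrid search relies on. Passing weights such as [2.0, 1.0] would favour the lexical list.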

Section 05

Corpus Replaceability Design

By default, ATLAS ships with a vector store built from the 1901 Australian, British, and American parliamentary debate records (Hansard). The corpus can be replaced with a custom one:

  1. make vs generates the vector store (CPU or GPU mode; GPU mode is optimized for CUDA 12.8 by default);
  2. make r generates a compatible retriever;
  3. template scripts in the create/ directory adapt the pipeline to new corpora (novels, newspapers, etc.).

This design lets the framework extend to other HASS research fields.
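
To give a sense of what adapting a new corpus might involve, the sketch below splits a document into overlapping word-window chunks with ids and metadata, ready to be embedded and loaded into a vector store. It is an illustration only and does not reproduce the template scripts in create/: the Chunk type, the chunk_document function, the word-based windowing, and the default sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str   # stable id, usable as the key in a vector store
    text: str       # the passage that would be embedded
    metadata: dict  # provenance kept alongside the passage


def chunk_document(doc_id, text, metadata, max_words=200, overlap=40):
    """Split one document into overlapping word windows (hypothetical helper)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for n, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(f"{doc_id}-{n:04d}", " ".join(window), dict(metadata)))
        if start + max_words >= len(words):
            break  # this window already reached the end of the document
    return chunks


# Tiny made-up document: ten "words", windows of 4 with 1 word of overlap.
sample_text = " ".join(f"w{i}" for i in range(10))
chunks = chunk_document("novel-ch1", sample_text, {"source": "novel"},
                        max_words=4, overlap=1)
for chunk in chunks:
    print(chunk.chunk_id, "|", chunk.text)
```

The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk; suitable sizes depend on the corpus and the embedding model.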

Section 06

Authentication and Deployment Support

  • Authentication: user authentication via AWS Cognito;
  • Deployment: Makefile commands cover the entire lifecycle (starting the development server, deploying and deleting local staging and production environments, and deploying through a Cloudflare tunnel);
  • Acceleration: an NVIDIA GPU can optionally be used to speed up embedding generation with Sentence Transformers.

Section 07

Practical Application Scenarios and Significance

Traditional historical research relies on manual review, which is slow. General-purpose RAG systems applied to historical documents run into problems such as language change over time, variant forms of proper nouns, and heavy dependence on context. ATLAS addresses these through corpus-specific vector stores and hybrid search, helping researchers locate relevant documents quickly and work more efficiently.

Section 08

Conclusion and Outlook

ATLAS shows how RAG can be specialized for a vertical domain. As an evaluation framework, it helps improve the performance of LLMs in historical research. The project is under active development (with AI-assisted programming) and aims to give digital humanities and history researchers an out-of-the-box evaluation platform and a foundation for customized retrieval systems.