Vector Database: FAISS
FAISS (Facebook AI Similarity Search) is an efficient similarity search library developed by Meta. Given a query vector, it quickly finds the most similar vectors in a very large collection, which makes it a good fit for the retrieval step of a RAG system. This project uses the CPU build of FAISS, so no GPU is required.
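The lookup described above can be sketched roughly as follows. The text does not say which FAISS index type the project uses, so the exact-search `IndexFlatL2` index here is an assumption, and the function names `faiss_search` and `reference_search` are hypothetical. The pure-Python function only illustrates what a flat L2 index computes (exact squared Euclidean distances); it is not how FAISS is implemented.

```python
def faiss_search(vectors, query, k):
    """Nearest-neighbour search with FAISS (needs the faiss-cpu and numpy packages).

    Assumption: an exact IndexFlatL2 index; the project may use a different index type.
    """
    import faiss
    import numpy as np

    xb = np.asarray(vectors, dtype="float32")   # FAISS expects float32 rows
    xq = np.asarray([query], dtype="float32")   # one query row
    index = faiss.IndexFlatL2(xb.shape[1])      # dimension of the vectors
    index.add(xb)
    distances, ids = index.search(xq, k)        # squared L2 distances and row ids
    return list(zip(ids[0].tolist(), distances[0].tolist()))


def reference_search(vectors, query, k):
    """Pure-Python illustration of the same result: exact squared L2 distance, smallest first."""
    scored = [
        (i, sum((a - b) ** 2 for a, b in zip(vector, query)))
        for i, vector in enumerate(vectors)
    ]
    return sorted(scored, key=lambda pair: pair[1])[:k]


if __name__ == "__main__":
    toy_vectors = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
    print(reference_search(toy_vectors, [0.9, 0.0], k=2))
```

In a RAG pipeline the rows would be chunk embeddings and the returned ids would point back to the stored text chunks.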
Embedding Model: Sentence Transformers
The project uses the all-MiniLM-L6-v2 model to generate text embeddings. This lightweight yet effective sentence embedding model maps each text to a 384-dimensional vector, placing semantically similar texts close together in the same vector space. At only about 80MB, the model is well suited to local deployment.
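As a minimal sketch of what "close together" means in practice, the snippet below embeds texts with the `sentence-transformers` package and compares two vectors by cosine similarity. The helper names `embed` and `cosine_similarity` are hypothetical, and cosine similarity is only one common choice of measure; the project may compare vectors differently.

```python
import math


def embed(texts):
    """Embed a list of strings (needs the sentence-transformers package).

    The model is downloaded on first use and cached locally afterwards.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts)  # one vector per input text


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means very similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


if __name__ == "__main__":
    first, second = embed(["How do I reset my password?", "I forgot my login password."])
    print(cosine_similarity(first, second))
```

Semantically related sentences should score noticeably higher than unrelated ones, which is what makes the embeddings usable for retrieval.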
Local LLM: Ollama + Llama3
Ollama is a tool that simplifies running large language models locally. This project uses the Llama3 model, which runs inference entirely on the local machine; once the model has been downloaded, no network connection is needed, so document data never leaves the machine. Carefully designed prompts instruct the model to answer only from the provided context, which reduces the risk of hallucinations.
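A context-restricted prompt of this kind might look like the sketch below. The prompt wording and the helper names `build_prompt` and `ask_ollama` are illustrative assumptions, not the project's actual prompt. The request goes to Ollama's local HTTP endpoint on its default port, assuming the Ollama server is running and the `llama3` model has already been pulled.

```python
import json
import urllib.request


def build_prompt(context_chunks, question):
    """Combine retrieved chunks and the question into one context-restricted prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def ask_ollama(prompt, model="llama3", url="http://localhost:11434/api/generate"):
    """Send the prompt to a locally running Ollama server and return the generated text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    chunks = ["FAISS is a similarity search library.", "Ollama runs models locally."]
    print(ask_ollama(build_prompt(chunks, "What does Ollama do?")))
```

An instruction like this lowers the chance of unsupported answers but cannot guarantee it, so showing the retrieved context alongside the answer remains useful for verification.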
Interactive Interface: Streamlit
Streamlit is a Python library for quickly building data applications. This project uses it to create a clean, modern web interface that supports PDF upload, text preview, chunk statistics, viewing of the retrieved context, and real-time Q&A.