Zing Forum

Mutual Fund FAQ Assistant: An Intelligent Fund Q&A System Based on RAG

A lightweight RAG (Retrieval-Augmented Generation) system designed to provide factual Q&A services for ICICI Prudential mutual fund schemes, combining vector search and LLM to enable accurate and traceable investment information queries.

Tags: RAG, Mutual Fund, FAQ, LLM, ChromaDB, FastAPI, React, Vector Search, Financial AI, ICICI Prudential
Published 2026-06-05 12:14 | Recent activity 2026-06-05 12:50 | Estimated read 7 min

Section 01

[Introduction] Mutual Fund FAQ Assistant: Overview of the RAG-Based Intelligent Fund Q&A System

This article introduces the Mutual Fund FAQ Assistant, a lightweight RAG (Retrieval-Augmented Generation) system that answers factual questions about ICICI Prudential mutual fund schemes. It combines vector search with an LLM so that answers are grounded in retrieved source text and can be traced back to it. The aim is to address two problems: ordinary investors find fund information slow to look up, and general-purpose LLMs tend to hallucinate. The assistant is restricted to factual information and does not offer investment advice. The project is maintained by bhavyaamahajann and was released on GitHub on June 5, 2026.

Section 02

Project Background and Motivation

Mutual funds are an important tool for asset allocation, but complex product details, fee structures, and similar information make it hard for ordinary investors to find accurate answers quickly. Searching traditional FAQ documents is slow, and general-purpose large language models tend to hallucinate and may give inaccurate suggestions. The Mutual Fund FAQ Assistant therefore adopts a RAG architecture: answers are grounded in official sources and limited to factual information, while the user still gets a natural-language Q&A experience.

Section 03

System Architecture and Tech Stack

The project adopts a full-stack architecture, with technology selection for each layer balancing performance and efficiency:

  • Frontend Layer: React + Vite (for rapid development), Vanilla CSS (with a warm café light theme);
  • Backend Layer: FastAPI (high-performance asynchronous web framework);
  • Intelligent Retrieval Layer: Embedding model BAAI/bge-large-en-v1.5 (run locally via Hugging Face), vector database ChromaDB (local persistence);
  • LLM Layer: LLaMA 3.3 70B served by Groq (model llama-3.3-70b-versatile, accessed via API for fast responses);
  • Automation & Operations: GitHub Actions for daily scheduled updates of fund data.

Section 04

Supported Fund Schemes

The system currently covers 15 ICICI Prudential Mutual Fund schemes, categorized as follows:

  • By Market Cap: Large-cap, mid-cap, small-cap, large & mid-cap, flexi-cap, multi-cap, and focused (concentrated holding) funds;
  • Hybrid & Balanced: Equity savings, equity-debt balanced, fixed-term savings, and multi-asset allocation funds;
  • Special Types: Tax-saving, Nifty 50 index-tracking, gold ETF-linked, and silver ETF-linked funds.

Section 05

RAG Workflow Analysis

The RAG pipeline is what keeps answers accurate and traceable:

  1. Data Ingestion: Fetch the 15 official scheme pages → vectorize them with the BAAI embedding model → store the vectors in ChromaDB → refresh daily via GitHub Actions;
  2. Query Processing: The user submits a question → the question is vectorized → ChromaDB returns the most semantically similar fragments;
  3. Answer Generation: LLaMA-3.3-70b generates an answer (≤3 sentences) from the retrieved context → source references are attached automatically;
  4. Safety Constraints: Provide factual information only, give no investment advice, and keep answers grounded in the retrieved sources to limit hallucinations.
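The query and answer steps above can be illustrated with a small offline sketch. It is not the project's code: the bag-of-words `embed` function stands in for the BAAI/bge-large-en-v1.5 model, the in-memory `CHUNKS` list stands in for ChromaDB, and the chunk texts, `example.com` URLs, and function names are all hypothetical. It only shows the shape of the flow: vectorize the question, rank stored fragments by cosine similarity, and build a prompt that carries the constraints and the source references.

```python
import math
import re
from collections import Counter

# Hypothetical corpus standing in for ChromaDB; texts and URLs are placeholders.
CHUNKS = [
    {"text": "The exit load is charged when units are redeemed early.",
     "source": "https://example.com/fund-a"},
    {"text": "The expense ratio covers the annual cost of managing the fund.",
     "source": "https://example.com/fund-b"},
    {"text": "The minimum SIP amount is the smallest monthly instalment accepted.",
     "source": "https://example.com/fund-c"},
]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a real system calls an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: list) -> str:
    """Combine the constraints, the retrieved context, and the question."""
    context = "\n".join(
        f"[{i}] {hit['text']} (source: {hit['source']})"
        for i, hit in enumerate(hits, start=1)
    )
    return (
        "Answer in at most 3 sentences using only the context below. "
        "State facts only, give no investment advice, and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "What is the expense ratio?"
hits = retrieve(question, CHUNKS, k=2)
prompt = build_prompt(question, hits)
```

In a real deployment the prompt would then be sent to the LLM, and the `source` fields of the retrieved chunks would be returned alongside the answer as references.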

Section 06

Deployment and Usage Guide

Local deployment steps:

  1. Clone the repository: git clone https://github.com/bhavyaamahajann/Mutual_Fund_FAQ_Assistant.git;
  2. Environment preparation: Create a venv → activate it → install dependencies from backend/requirements.txt;
  3. Configuration and startup: Copy the environment variable template and set the Groq API key → run the data ingestion script → build the frontend → start the FastAPI server;
  4. Access: Open http://localhost:8000 in a browser, or use the alternative Streamlit interface.
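A missing API key is a common reason for the configuration step above to fail, so a startup check can be useful. The following is a minimal sketch, not taken from the repository: the function name is hypothetical, and `GROQ_API_KEY` is an assumed variable name that should be checked against the project's environment variable template.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Groq API key, or raise with a clear message if it is missing.

    GROQ_API_KEY is an assumed name; confirm it in the project's env template.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; copy the env template and add your key."
        )
    return key
```

Calling `load_groq_api_key()` once at server startup makes a missing key fail fast with a readable error rather than surfacing later as a failed LLM request.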

Section 07

Project Value and Insights

The project's value is reflected in:

  1. Trustworthy Q&A: Answers are restricted to a fixed set of official sources, which reduces LLM hallucinations;
  2. Cost Optimization: Open-source embedding models + lightweight vector databases reduce deployment costs;
  3. Compliance-Friendly: Facts are kept separate from investment opinions, which helps the system stay within financial regulations;
  4. Extensible: Modular design facilitates integration of more data sources or switching LLMs.

Insights for developers: This project provides a clear reference for vertical-domain (e.g., finance, medical) Q&A systems, and the RAG architecture is particularly valuable in scenarios requiring high accuracy.
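One way to picture the "switching LLMs" point is a small interface that any model backend can satisfy. This is a sketch under assumptions, not the project's design: `AnswerModel`, `StubModel`, `limit_sentences`, and `answer` are hypothetical names, and the stub returns a canned reply instead of calling a hosted model. The sentence cap mirrors the ≤3-sentence rule described in the workflow.

```python
import re
from typing import Protocol


class AnswerModel(Protocol):
    """Anything with a complete(prompt) -> str method can act as the LLM backend."""

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Offline stand-in for a hosted LLM client; returns a canned reply."""

    def complete(self, prompt: str) -> str:
        return "This is a stub answer. It uses only the supplied context."


def limit_sentences(text: str, max_sentences: int = 3) -> str:
    """Keep at most max_sentences sentences, splitting after ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(parts[:max_sentences])


def answer(question: str, context: str, model: AnswerModel) -> str:
    """Build a prompt, call whichever backend was passed in, and cap the length."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return limit_sentences(model.complete(prompt))


reply = answer("What is an exit load?", "placeholder context", StubModel())
```

Swapping the backend then means passing a different object to `answer`; the retrieval and prompt code stays unchanged.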