Zing Forum

Ollive: A Full-Stack LLM Chat Interface and Inference Log System Based on React and FastAPI

A full-stack LLM chat and inference-logging application. It pairs a React conversation frontend with a FastAPI backend that tracks inference metrics, desensitizes (masks) sensitive information, and persists the results.

Tags: LLM, React, FastAPI, Full-Stack Development, Inference Logs, Chat Application, Vite, SQLAlchemy, Ollama, AI Monitoring
Published 2026-05-24 16:10 | Recent activity 2026-05-24 16:30 | Estimated read 6 min

Section 01

Introduction: Ollive — Full-Stack LLM Chat and Inference Log System

Ollive is a full-stack LLM chat and inference-logging application built on React and FastAPI. It pairs a React conversation interface with a FastAPI backend that tracks inference metrics, desensitizes (masks) sensitive information, and stores the results. It targets AI application development, model evaluation, and enterprise monitoring, is compatible with Ollama, and supports real-time interaction.

Section 02

Project Background and Source

Project Source

Design Goals

Provide reliable infrastructure for tracking, desensitizing, and storing LLM inference metrics to meet the needs of AI interaction monitoring and recording.

Section 03

Detailed Technical Architecture

Frontend Architecture

  • Technology Selection: React 18+, Vite, Modern CSS (CSS Modules/Tailwind)
  • Functional Features: Real-time chat interface (streaming response), conversation history management, model selection configuration, inference metric visualization

Backend Architecture

  • Technology Selection: FastAPI, SQLAlchemy, PostgreSQL/SQLite, Python 3.10+
  • Core Functions: LLM API proxy, inference log recording, sensitive information desensitization, metric collection and storage, RESTful API design
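
The proxy and metric-collection functions listed above can be illustrated with a minimal, dependency-free sketch. It is not taken from the Ollive codebase: `InferenceMetrics`, `call_with_metrics`, and `fake_backend` are hypothetical names, and the reply fields (`response`, `prompt_eval_count`, `eval_count`) are assumed to follow the shape of an Ollama generate response. A real backend would replace `fake_backend` with an HTTP call to the model server.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class InferenceMetrics:
    """Per-request metadata of the kind the article says is tracked (hypothetical shape)."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


def call_with_metrics(
    backend: Callable[[str, str], dict], model: str, prompt: str
) -> tuple[str, InferenceMetrics]:
    """Forward a prompt to `backend` and measure wall-clock latency around the call."""
    start = time.perf_counter()
    reply = backend(model, prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    metrics = InferenceMetrics(
        model=model,
        # Field names are an assumption modelled on Ollama's generate response.
        prompt_tokens=int(reply.get("prompt_eval_count", 0)),
        completion_tokens=int(reply.get("eval_count", 0)),
        latency_ms=latency_ms,
    )
    return reply.get("response", ""), metrics


def fake_backend(model: str, prompt: str) -> dict:
    """Stand-in for a real HTTP call to the model server (illustration only)."""
    return {"response": prompt.upper(), "prompt_eval_count": 3, "eval_count": 5}
```

Passing the backend in as a callable keeps the timing logic independent of any particular provider, which is one plausible way to structure a proxy that may later support several of them.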

Section 04

Core Function Analysis

1. Inference Log Recording

Ollive records metadata for each LLM interaction (timestamp, model information, token counts, request parameters, latency, and so on) and persists it via SQLAlchemy, with SQLite for development and PostgreSQL for production.
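
As a rough illustration of what one log record might contain, the sketch below stores the fields named above in SQLite. To stay dependency-free it uses the standard-library `sqlite3` module rather than SQLAlchemy, which is what the project itself is described as using. The table name, column names, and helper functions are all assumptions, not Ollive's actual schema.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema covering the metadata fields listed in the article.
SCHEMA = """
CREATE TABLE IF NOT EXISTS inference_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT NOT NULL,
    model TEXT NOT NULL,
    prompt_tokens INTEGER NOT NULL,
    completion_tokens INTEGER NOT NULL,
    latency_ms REAL NOT NULL,
    parameters TEXT NOT NULL
)
"""


def open_log_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_inference(
    conn: sqlite3.Connection,
    model: str,
    prompt_tokens: int,
    completion_tokens: int,
    latency_ms: float,
    parameters: dict,
) -> int:
    """Insert one inference record and return its row id."""
    cur = conn.execute(
        "INSERT INTO inference_logs "
        "(created_at, model, prompt_tokens, completion_tokens, latency_ms, parameters) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),  # UTC timestamp as ISO 8601 text
            model,
            prompt_tokens,
            completion_tokens,
            latency_ms,
            json.dumps(parameters),  # sampling parameters stored as a JSON string
        ),
    )
    conn.commit()
    return cur.lastrowid
```

Storing the request parameters as JSON text keeps the schema stable when different models accept different options; a SQLAlchemy model would map the same columns to class attributes.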

2. Sensitive Information Desensitization

  • PII detection (email, phone number, etc.)
  • Secret protection (API keys, passwords)
  • Custom rule support
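
A rule-based desensitization step of this kind could be sketched as follows. The patterns are deliberately simple and are assumptions for illustration only: the `sk-` key prefix, the phone format, and the names `DEFAULT_RULES` and `redact` are hypothetical, and production PII detection generally needs more careful rules than these.

```python
import re

# (label, pattern) pairs applied in order. Secrets are masked before phone numbers
# so that digits inside a key are never mistaken for part of a phone number.
DEFAULT_RULES = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("API_KEY", re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")),  # assumed key format
    ("PHONE", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def redact(text: str, extra_rules=()) -> str:
    """Replace every match of each rule with a bracketed placeholder such as [EMAIL]."""
    for label, pattern in [*DEFAULT_RULES, *extra_rules]:
        text = pattern.sub(f"[{label}]", text)
    return text
```

Custom rules are passed as additional `(label, pattern)` pairs, for example `redact(msg, [("EMPLOYEE_ID", re.compile(r"\bEMP-\d{4}\b"))])`, and run after the defaults.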

3. Real-time Chat Interface

  • ChatGPT-like conversation UI with Markdown rendering
  • Streaming response typewriter effect
  • Conversation history management
  • Frontend and backend deployed separately (frontend on port 5173, backend on port 8000)
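
The article does not say which wire format the streaming response uses. The sketch below assumes Server-Sent Events, with a `delta` field per chunk and a `[DONE]` sentinel; both are conventions chosen for illustration, and `sse_frames` and `collect_reply` are hypothetical helpers. It shows how a token stream could be framed on the backend and reassembled by a client to drive the typewriter effect.

```python
import json
from typing import Iterable, Iterator


def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each generated token in a Server-Sent Events `data:` frame."""
    for token in tokens:
        payload = json.dumps({"delta": token})
        yield f"data: {payload}\n\n"
    # End-of-stream marker (an assumed convention, not part of the SSE format itself).
    yield "data: [DONE]\n\n"


def collect_reply(frames: Iterable[str]) -> str:
    """Client-side counterpart: rebuild the full reply from the streamed frames."""
    parts = []
    for frame in frames:
        payload = frame.removeprefix("data: ").strip()
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload)["delta"])
    return "".join(parts)
```

In a FastAPI route, a generator like `sse_frames` can be handed to `StreamingResponse` with `media_type="text/event-stream"`; the frontend then appends each `delta` to the message as it arrives.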

Section 05

Application Scenarios and Value

AI Application Development

Rapid prototype verification, user interaction testing, inference cost analysis

Model Evaluation and Comparison

Performance comparison of different models, prompt strategy A/B testing, user feedback collection

Enterprise AI Monitoring

Audit logs, cost tracking, compliance reports

Education and Research

Structured data collection, experimental condition control, data export support

Section 06

Project Highlights and Current Limitations

Highlights

  • TypeScript/Python stack: static typing on the frontend, with direct access to the Python AI ecosystem on the backend
  • Flexible database support (SQLite/PostgreSQL)
  • Container-ready and aligned with the twelve-factor app principles

Limitations

  • No user authentication or authorization
  • Single-user design with no real-time collaboration
  • No built-in analytics dashboard

Section 07

Improvement Directions and Suggestions

Potential Improvements

  • Add multi-user management and role-based access control
  • Develop a built-in analytics dashboard (usage trends, cost analysis)
  • Introduce a plugin system to support multiple LLM providers
  • Add conversation/metric export functions (CSV, JSON)
  • Optimize mobile user experience
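
The export suggestion is straightforward to prototype. The following is a hypothetical sketch, not an existing Ollive feature: it assumes logged metrics are available as a list of flat dictionaries sharing the same keys, and the function names are invented for illustration.

```python
import csv
import io
import json


def export_json(rows: list[dict]) -> str:
    """Serialize metric rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def export_csv(rows: list[dict]) -> str:
    """Serialize metric rows as CSV, using the first row's keys as the header."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Because both functions return strings, a backend could serve them directly as file downloads without writing temporary files.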