# Ollive: A Full-Stack LLM Application Platform Integrating Multi-Turn Dialogue and Inference Observability

> Ollive is an open-source full-stack LLM chat application that combines streaming multi-turn dialogue with a complete inference observability infrastructure. An SDK automatically captures metadata for each model call, the events are written asynchronously to PostgreSQL through Redis Streams, and a real-time dashboard shows developers key metrics such as latency, throughput, error rate, and token consumption.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-24T12:44:07.000Z
- Last activity: 2026-05-24T12:49:35.409Z
- Popularity: 160.9
- Keywords: LLM, observability, chatbot, Redis Streams, PostgreSQL, TypeScript, React, Docker, Gemini, Claude, telemetry, streaming, SSE
- Page link: https://www.zingnex.cn/en/forum/thread/ollive-llm-bec55e4a
- Canonical: https://www.zingnex.cn/forum/thread/ollive-llm-bec55e4a

---

## Introduction

Ollive is an open-source full-stack LLM chat application built around two tightly integrated parts: streaming multi-turn dialogue and a complete inference observability infrastructure. An SDK automatically captures model call metadata, Redis Streams carries it asynchronously to PostgreSQL, and developers get real-time monitoring of key metrics such as latency, throughput, error rate, and token consumption. The system is modular, starts with a single Docker command, and serves both the end-user experience and developers' observability needs.

## Project Background and Overview

- **Original Author/Maintainer**: ankan17
- **Source Platform**: GitHub
- **Release Date**: 2026-05-24
- **Original Link**: https://github.com/ankan17/chatbot-ollive

Ollive is a full-stack AI chat application whose core design principle is to decouple product functionality (streaming dialogue) from platform observability. Users get smooth multi-turn dialogue, while developers receive inference telemetry through the built-in SDK and can see how the application is behaving in operation. The system consists of a PostgreSQL database, Redis event streams, an Express API, an ingestion worker, and a React frontend. Apart from a model API key, everything runs locally via `docker compose up`.

## Architecture Design and Tech Stack

Ollive's architecture separates transactional state from telemetry data:
- **Chat messages**: Written synchronously to PostgreSQL to ensure strong consistency;
- **Inference logs**: Written asynchronously (SDK capture → Redis Streams → ingestion process → PostgreSQL), so telemetry does not affect the main dialogue path.
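
The asynchronous telemetry path can be illustrated with a minimal sketch. It assumes a hypothetical event shape and uses an in-memory class in place of Redis Streams and a plain array in place of the PostgreSQL table; none of these names (`InferenceEvent`, `InMemoryStream`, `withTelemetry`, `drain`) come from the Ollive codebase.

```typescript
// Hypothetical event shape; field names are assumptions, not Ollive's schema.
interface InferenceEvent {
  provider: string;
  model: string;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  status: "ok" | "error";
}

// In-memory stand-in for a Redis Stream: an append-only log with a read cursor.
class InMemoryStream<T> {
  private entries: T[] = [];
  private cursor = 0;

  append(entry: T): void {
    this.entries.push(entry);
  }

  // Return up to `count` unread entries and advance the cursor.
  readBatch(count: number): T[] {
    const batch = this.entries.slice(this.cursor, this.cursor + count);
    this.cursor += batch.length;
    return batch;
  }
}

interface ModelResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

// SDK side: time the model call and emit exactly one event per call.
// Telemetry failures are swallowed so they never reach the chat path.
async function withTelemetry(
  stream: InMemoryStream<InferenceEvent>,
  meta: { provider: string; model: string },
  call: () => Promise<ModelResult>,
): Promise<ModelResult> {
  const started = Date.now();
  const emit = (event: InferenceEvent): void => {
    try {
      stream.append(event);
    } catch {
      // Best-effort: dropping an event is preferable to failing the request.
    }
  };
  try {
    const result = await call();
    emit({
      ...meta,
      latencyMs: Date.now() - started,
      inputTokens: result.inputTokens,
      outputTokens: result.outputTokens,
      status: "ok",
    });
    return result;
  } catch (err) {
    emit({ ...meta, latencyMs: Date.now() - started, inputTokens: 0, outputTokens: 0, status: "error" });
    throw err;
  }
}

// Worker side: drain the stream in batches into a store. The array stands in
// for the inference log table that the real worker would insert into.
function drain(stream: InMemoryStream<InferenceEvent>, store: InferenceEvent[], batchSize = 100): number {
  let written = 0;
  let batch = stream.readBatch(batchSize);
  while (batch.length > 0) {
    store.push(...batch);
    written += batch.length;
    batch = stream.readBatch(batchSize);
  }
  return written;
}
```

With real Redis Streams, the append would typically be an `XADD` and the worker would read through a consumer group (`XREADGROUP` followed by `XACK`), which adds delivery tracking that this sketch omits.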

The code is organized as a pnpm monorepo:
- **Application layer**: web (Vite+React), api (Express), ingestion-worker (Redis consumer);
- **Shared packages**: llm-sdk (multi-model adapter), db (Drizzle ORM), shared (type definitions);
- **Infrastructure**: Docker Compose for one-click environment startup.

## Detailed Explanation of Key Design Decisions

1. **Database Schema**:
   - The message table uses a `sequence` field to ensure ordering and idempotency;
   - The inference log table uses a foreign key with `ON DELETE SET NULL`, so log rows survive as audit data when the row they reference is deleted;
   - Metadata is stored in a mix of strongly typed columns and JSONB, balancing query performance with extensibility.

2. **Multi-provider Abstraction**: Gemini and Claude are supported through a common `LLMProvider` interface; adding a new model only requires implementing an adapter.

3. **Request Cancellation**: A streaming response can be interrupted; the partial response is saved and the cancellation is recorded as an event.

4. **Guest Mode**: Anonymous conversations are stored in the browser and imported to the server after login; trial limits are enforced via Redis counters.
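
As a rough illustration of decisions 2 and 3, the sketch below pairs a provider interface with cancellable streaming. The source names `LLMProvider` but does not show its members, so the `streamChat` signature here is an assumption, and `EchoProvider`, `StreamOutcome`, and `collectStream` are hypothetical names invented for the example.

```typescript
// Assumed shape: the source names `LLMProvider` but does not list its members.
interface LLMProvider {
  readonly name: string;
  streamChat(prompt: string, signal: AbortSignal): AsyncIterable<string>;
}

// Hypothetical adapter for illustration only. A real adapter would call the
// Gemini or Claude API and yield tokens as they arrive.
class EchoProvider implements LLMProvider {
  readonly name = "echo";

  async *streamChat(prompt: string, signal: AbortSignal): AsyncIterable<string> {
    for (const word of prompt.split(" ")) {
      if (signal.aborted) return; // stop producing once the caller cancels
      yield word;
    }
  }
}

interface StreamOutcome {
  text: string; // whatever was received, possibly partial
  status: "completed" | "cancelled";
}

// Consume a provider stream, keeping every token received so far so that a
// cancelled request still yields a partial response that can be saved.
async function collectStream(
  provider: LLMProvider,
  prompt: string,
  signal: AbortSignal,
  onToken?: (token: string) => void,
): Promise<StreamOutcome> {
  const tokens: string[] = [];
  for await (const token of provider.streamChat(prompt, signal)) {
    tokens.push(token);
    onToken?.(token);
    if (signal.aborted) break;
  }
  return {
    text: tokens.join(" "),
    status: signal.aborted ? "cancelled" : "completed",
  };
}
```

Because the calling code depends only on the interface, a new model is added by writing another class that implements it. The returned `status` is where a cancellation event could be recorded alongside the saved partial text.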

## Technical Trade-offs and Production Considerations

**Pragmatic Choices**:
- Use Redis Streams instead of Kafka to reduce operational overhead;
- Run two processes (API and ingestion) rather than a set of microservices, to avoid splitting the system prematurely;
- Choose the Vercel AI SDK over LangChain because it fits the needs of the current use case.

**Production Limitations**:
- Database migrations run automatically on startup (Drizzle locks ensure safety);
- TLS termination is expected upstream; the Compose setup serves plain HTTP;
- Live dialogue requires a valid model API key; E2E tests cover the remaining functionality.

## Future Development Directions

Planned improvements:
1. **Code Quality**: Add manual review passes to improve AI-generated code;
2. **Resumable Streams**: Recover a stream after a connection interruption to improve the user experience;
3. **Distributed Rate Limiting**: Migrate to Redis-backed rate limiters so limits hold under horizontal scaling;
4. **Dashboard Expansion**: Adopt time partitioning and precomputed summaries to handle log growth;
5. **Other Features**: Message editing, session branching, offline caching, RAG integration, etc.
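
To make the precomputed summaries in item 4 concrete, here is a minimal sketch of rolling raw inference logs up into per-minute buckets. The row shape, the one-minute granularity, and the nearest-rank p95 are all assumptions chosen for the example; the source does not describe how Ollive plans to compute its summaries.

```typescript
// Hypothetical log row; field names are assumptions, not Ollive's schema.
interface InferenceLog {
  timestampMs: number;
  latencyMs: number;
  totalTokens: number;
  ok: boolean;
}

interface MinuteSummary {
  bucketStartMs: number;
  requests: number;
  errors: number;
  totalTokens: number;
  avgLatencyMs: number;
  p95LatencyMs: number;
}

const MINUTE_MS = 60_000;

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  return sortedAsc[Math.max(rank, 1) - 1] ?? 0;
}

// Group logs by the minute they fall in and compute one summary per bucket,
// so a dashboard can read a few summary rows instead of scanning raw logs.
function summarizeByMinute(logs: InferenceLog[]): MinuteSummary[] {
  const buckets = new Map<number, InferenceLog[]>();
  for (const log of logs) {
    const start = Math.floor(log.timestampMs / MINUTE_MS) * MINUTE_MS;
    const bucket = buckets.get(start) ?? [];
    bucket.push(log);
    buckets.set(start, bucket);
  }
  return Array.from(buckets.entries())
    .sort(([a], [b]) => a - b)
    .map(([bucketStartMs, rows]) => {
      const latencies = rows.map((r) => r.latencyMs).sort((a, b) => a - b);
      return {
        bucketStartMs,
        requests: rows.length,
        errors: rows.filter((r) => !r.ok).length,
        totalTokens: rows.reduce((sum, r) => sum + r.totalTokens, 0),
        avgLatencyMs: latencies.reduce((sum, v) => sum + v, 0) / rows.length,
        p95LatencyMs: percentile(latencies, 95),
      };
    });
}
```

Throughput and error rate follow directly from each summary row (`requests` per minute and `errors / requests`), which is why storing counts rather than ratios keeps the buckets easy to merge into coarser windows.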

## Summary and Value

Ollive is not just a chat application but a complete observability platform for LLM applications. Its clear separation of product functionality from infrastructure provides a reference architecture for LLM application development. For developers building inference monitoring, its SDK design, event-driven pipeline, and database schema offer practical patterns that can be reused.
