Section 01
Kiln: Introduction to the Single-Model LLM Inference Server Supporting Real-Time Online Learning
Kiln is an LLM inference server that enables parallel training and serving through LoRA hot-swapping: the model is fine-tuned in real time while it continues to serve requests. This removes the traditional split between training and deployment pipelines and supports continuous learning.
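The core idea — serving with a base model while a trainer prepares an updated LoRA adapter, then publishing that adapter with an atomic reference swap so in-flight requests always see a consistent version — can be sketched as follows. This is a minimal conceptual illustration, not Kiln's actual API; the class and method names, and the scalar stand-in for the adapter delta, are hypothetical.

```python
import threading

class HotSwapServer:
    """Conceptual sketch (not Kiln's real interface): serve requests with a
    base model while a background trainer fine-tunes a shadow LoRA adapter,
    then atomically swap in the new adapter without pausing inference."""

    def __init__(self, base_weight):
        self.base_weight = base_weight   # stands in for frozen base model weights
        self._adapter = None             # currently published adapter (read-only)
        self._lock = threading.Lock()    # guards publication, not inference

    def infer(self, x):
        adapter = self._adapter          # snapshot the reference once per request
        y = self.base_weight * x         # base model forward pass
        if adapter is not None:
            y += adapter * x             # LoRA delta applied at serve time
        return y

    def swap_adapter(self, new_adapter):
        with self._lock:                 # trainer publishes the updated adapter
            self._adapter = new_adapter  # atomic reference swap in CPython

srv = HotSwapServer(base_weight=2.0)
print(srv.infer(3.0))    # base only -> 6.0
srv.swap_adapter(0.5)    # trainer publishes a LoRA delta (scalar stand-in)
print(srv.infer(3.0))    # base + adapter delta -> 7.5
```

Because each request reads the adapter reference exactly once, a swap that lands mid-request never mixes old and new adapter weights, which is what lets training and serving proceed in parallel.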