Chapter 01
Ferrum: Pure Rust High-Performance LLM Inference Engine (Main Guide)
Ferrum is a Rust-native LLM inference engine built to eliminate the runtime dependencies, performance bottlenecks, and deployment complexity of Python-based stacks. Key features include zero Python dependency, single-binary deployment, support for text generation, speech recognition/synthesis, and text embeddings, an OpenAI-compatible API, and hardware acceleration (CUDA/Metal). It aims to be a lightweight, efficient alternative for LLM deployment in production and edge environments.
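Because the API is OpenAI-compatible, existing OpenAI client code should work against a Ferrum server without modification. A minimal sketch of building such a request with only the Python standard library — the host/port, route, and model name here are illustrative assumptions, not values confirmed by this guide:

```python
import json
import urllib.request

# Standard OpenAI chat-completions payload; the model name is a placeholder.
payload = {
    "model": "llama-3-8b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

def build_request(base_url="http://localhost:8000"):
    """Build a POST request against the assumed OpenAI-compatible route."""
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request()
# With a Ferrum server running, send it via urllib.request.urlopen(req)
# and json.loads() the response body, exactly as with the OpenAI API.
```

Any existing OpenAI SDK can be pointed at the same server by overriding its base URL, which is the practical payoff of API compatibility.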