[Introduction] atomr-infer: A Unified Heterogeneous LLM Inference Runtime Based on the Actor Model
atomr-infer is a unified abstraction layer for LLM inference, implemented in Rust. Its core uses the Actor model to integrate local GPU runtimes (e.g., vLLM, TensorRT-LLM) and remote APIs (e.g., OpenAI, Anthropic) behind a single interface, eliminating the system fragmentation that heterogeneous inference deployments typically suffer from. Deployments scale flexibly, from pure remote setups with zero GPU dependency up to heterogeneous clusters, while developers keep a single mental model and a stable set of system capabilities throughout.
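The core idea can be sketched as follows. This is a minimal illustration, not atomr-infer's actual API: the trait, struct, and function names (`InferenceBackend`, `LocalRuntime`, `RemoteApi`, `spawn_actor`, `ask`) are hypothetical. Each backend, whether it wraps a local GPU runtime or a remote HTTP API, is owned by an actor that serves requests from a mailbox, so callers interact with one channel type regardless of where inference actually runs:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical unified backend trait; atomr-infer's real interface may differ.
trait InferenceBackend: Send + 'static {
    fn generate(&self, prompt: &str) -> String;
}

// Stand-in for a local GPU runtime (e.g., a vLLM binding).
struct LocalRuntime;
impl InferenceBackend for LocalRuntime {
    fn generate(&self, prompt: &str) -> String {
        format!("[local] {}", prompt)
    }
}

// Stand-in for a remote API client (e.g., an OpenAI-compatible endpoint).
struct RemoteApi;
impl InferenceBackend for RemoteApi {
    fn generate(&self, prompt: &str) -> String {
        format!("[remote] {}", prompt)
    }
}

// Actor message: a prompt plus a one-shot reply channel.
struct Request {
    prompt: String,
    reply: mpsc::Sender<String>,
}

// Spawn an actor thread that owns a backend and drains its mailbox.
fn spawn_actor(backend: Box<dyn InferenceBackend>) -> mpsc::Sender<Request> {
    let (tx, rx) = mpsc::channel::<Request>();
    thread::spawn(move || {
        for req in rx {
            let out = backend.generate(&req.prompt);
            let _ = req.reply.send(out);
        }
    });
    tx
}

// Request/response round trip against any actor mailbox.
fn ask(mailbox: &mpsc::Sender<Request>, prompt: &str) -> String {
    let (reply_tx, reply_rx) = mpsc::channel();
    mailbox
        .send(Request { prompt: prompt.to_string(), reply: reply_tx })
        .unwrap();
    reply_rx.recv().unwrap()
}

fn main() {
    // The caller sees identical mailboxes for local and remote inference.
    let local = spawn_actor(Box::new(LocalRuntime));
    let remote = spawn_actor(Box::new(RemoteApi));
    println!("{}", ask(&local, "hello"));
    println!("{}", ask(&remote, "hello"));
}
```

Because every backend hides behind the same mailbox type, routing, failover, and scaling policies can be written once against the actor interface rather than per runtime.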