Section 01
[Introduction] QRAF: A High-Performance Local LLM Inference Runtime Built for Apple Silicon
QRAF is a local LLM inference runtime written in C++ and optimized for Apple Silicon. It can convert models from HuggingFace, GGUF, and Safetensors formats, and it aims to be lightweight and fast. Because inference runs entirely on the local machine, prompts and outputs stay on the device, which supports privacy.
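
To make the format support more concrete, here is a minimal sketch of how a converter might tell two of these inputs apart by their leading bytes. It is not QRAF's code: `ModelFormat` and `sniff_format` are hypothetical names introduced only for illustration. The sketch relies on two properties of the published formats: a GGUF file starts with the ASCII magic `GGUF`, and a Safetensors file starts with an 8-byte little-endian header length followed by a JSON header that begins with `{`. A HuggingFace checkpoint is typically a directory of files rather than a single file, so it is not covered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical enum for illustration; not part of QRAF's API.
enum class ModelFormat { GGUF, Safetensors, Unknown };

// Classify a model file from its first bytes (assumed to be read by the caller).
// `head` should hold at least the first 9 bytes of the file when available.
inline ModelFormat sniff_format(const std::string& head) {
    // GGUF files begin with the 4-byte ASCII magic "GGUF".
    if (head.size() >= 4 && head.compare(0, 4, "GGUF") == 0) {
        return ModelFormat::GGUF;
    }
    // Safetensors: 8-byte little-endian u64 header length N,
    // followed by N bytes of JSON that must start with '{'.
    if (head.size() >= 9) {
        std::uint64_t header_len = 0;
        for (int i = 7; i >= 0; --i) {
            header_len = (header_len << 8) | static_cast<unsigned char>(head[i]);
        }
        if (header_len > 0 && head[8] == '{') {
            return ModelFormat::Safetensors;
        }
    }
    return ModelFormat::Unknown;
}
```

A caller would read the first few bytes of the file in binary mode and pass them to `sniff_format`. A real converter would go on to validate the full header rather than trust this quick check.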