OlliteRT: Android Phone Becomes Local LLM Inference Server (Introduction)
OlliteRT is an open-source Android app, built on Google's LiteRT runtime, that turns an Android phone into an OpenAI-compatible local LLM inference server. It supports multimodal inference, tool calling, and streaming responses, and runs models such as Gemma and Qwen entirely on-device with no cloud connectivity, which protects user privacy and lowers the hardware barrier for AI applications.
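Because the server speaks the OpenAI API dialect, any standard client should be able to talk to it over the local network. The sketch below shows what such a request could look like; the host address, port, path, and model name are illustrative assumptions, not values confirmed by the project.

```python
import json

# Assumed LAN address of the phone running OlliteRT (illustrative only).
BASE_URL = "http://192.168.1.42:8080/v1"

def build_chat_request(prompt, model="gemma-3-1b-it", stream=True):
    """Build an OpenAI-style /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def parse_sse_chunk(line):
    """Extract the text delta from one streaming SSE line, if present.

    OpenAI-compatible servers stream responses as Server-Sent Events:
    each chunk is a line of the form 'data: {...json...}', terminated
    by a final 'data: [DONE]' sentinel.
    """
    if not line.startswith("data: ") or line == "data: [DONE]":
        return None
    payload = json.loads(line[len("data: "):])
    return payload["choices"][0]["delta"].get("content")

# The body any off-the-shelf OpenAI client would POST to the phone:
body = build_chat_request("Why run LLMs on-device?")
```

A client would POST `body` to `BASE_URL + "/chat/completions"` and feed each response line through `parse_sse_chunk` to reassemble the streamed text.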