With the booming development of the large language model (LLM) ecosystem, developers face an awkward reality: every new model provider they integrate means writing essentially the same HTTP client code again. OpenAI, Anthropic, Google, Cohere, a locally deployed vLLM server: each endpoint differs subtly in authentication scheme, request format, error codes, and streaming response protocol.
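To make the divergence concrete, here is the same one-turn chat request issued against two public HTTP APIs (the model names are just examples). The auth header, endpoint path, required fields, and response shape all differ:

```python
import os
import requests

prompt = "Hello!"

# OpenAI: Bearer-token auth, /v1/chat/completions,
# and the reply sits in choices[0].message.content.
openai_resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"model": "gpt-4o-mini",
          "messages": [{"role": "user", "content": prompt}]},
)
print(openai_resp.json()["choices"][0]["message"]["content"])

# Anthropic: x-api-key auth plus a version header, /v1/messages,
# a mandatory max_tokens, and the reply sits in content[0].text.
anthropic_resp = requests.post(
    "https://api.anthropic.com/v1/messages",
    headers={"x-api-key": os.environ["ANTHROPIC_API_KEY"],
             "anthropic-version": "2023-06-01"},
    json={"model": "claude-3-5-sonnet-latest",
          "max_tokens": 256,
          "messages": [{"role": "user", "content": prompt}]},
)
print(anthropic_resp.json()["content"][0]["text"])
```

Multiply that by retries, timeouts, error handling, and streaming, and every provider becomes its own small client project.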
This repetition wastes time and breeds inconsistency. A project that carries several LLM clients, each written in its own style, pays multiplied maintenance costs and accumulates security risk. And when it is time to switch models or add a provider, developers often have to touch scattered parts of the codebase.
floship-llm was born to solve this pain point. It is a reusable LLM client library that provides a unified, robust, production-ready interface abstraction for OpenAI-compatible inference endpoints.
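To illustrate the idea (this is a hypothetical sketch, not floship-llm's actual API; the class and method names are invented here), a unified client reduces provider differences to configuration, because every OpenAI-compatible endpoint speaks the same wire format:

```python
from dataclasses import dataclass
import requests


@dataclass
class LLMClient:
    """Hypothetical unified client for any OpenAI-compatible endpoint."""
    base_url: str  # e.g. a hosted API or a local vLLM server
    api_key: str
    model: str

    def chat(self, prompt: str) -> str:
        # One request shape works everywhere the OpenAI format is spoken.
        resp = requests.post(
            f"{self.base_url}/chat/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"model": self.model,
                  "messages": [{"role": "user", "content": prompt}]},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]


# Switching providers becomes configuration, not new client code:
hosted = LLMClient("https://api.openai.com/v1", "sk-...", "gpt-4o-mini")
local = LLMClient("http://localhost:8000/v1", "EMPTY", "my-vllm-model")
```

This is the shape of the problem floship-llm targets; the rest of this article walks through what the library adds on top of that baseline.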