Zing Forum

Unhosted: An Open-Source Framework for Distributed AI Inference on Personal Hardware

Unhosted is an open-source project written in Rust that lets users run large language model inference on their own devices without relying on cloud APIs. The project proposes a three-tier trust radius architecture (Local Mode, Trusted Node Mode, and Public Swarm Mode) to deliver data privacy and computational autonomy.

Tags: Distributed Inference · Local AI · Rust · Privacy · Open Source · llama.cpp · Decentralization
Published 2026-05-13 02:41 · Recent activity 2026-05-13 02:50 · Estimated read: 5 min

Section 01

Unhosted: Introduction to the Distributed Inference Framework for Running AI on Personal Hardware

Unhosted is an open-source project written in Rust that allows users to run large language model inference on personal devices without relying on cloud APIs. Its core philosophy is "AI that lives where you do": a three-tier trust radius architecture (Local Mode, Trusted Node Mode, Public Swarm Mode) keeps data private and computation under the user's control.


Section 02

Project Background: Addressing Privacy and Dependency Issues of Cloud AI

Most AI inference today relies on centralized cloud services, which carries privacy risks and creates external dependencies. Unhosted was created to address this. It is developed in Rust (currently pre-alpha) and, unlike most local AI solutions, supports a distributed inference cluster architecture that combines multiple personal devices into a single unified inference endpoint.


Section 03

Three-Tier Trust Radius Architecture: Balancing Privacy and Computing Power Needs

Local Mode

Uses only the user's own devices; no internet connection is required, inference runs entirely locally, and data never leaves the user's physical control. Ideal for sensitive scenarios.

Trusted Node Mode

Extends to devices in the user's trust circle (e.g., a roommate's computer or a home server). Connections are end-to-end encrypted and incur no cost. Suitable for small teams or multi-device households.

Public Swarm Mode

Falls back to a public swarm of strangers' GPUs, billed per token in USDC with a monthly spending cap. Serves as a safety net supplementing the first two modes.
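The escalation logic across the three tiers can be sketched as follows. This is a minimal illustration only; the type and field names (`TrustTier`, `RoutingState`, `pick_tier`) are assumptions for this sketch, not Unhosted's actual API:

```rust
// Illustrative sketch of the three-tier trust radius: prefer the local
// device, then trusted peers, and fall back to the paid public swarm
// only while the monthly USDC budget allows it.
#[derive(Debug, PartialEq)]
enum TrustTier {
    Local,
    TrustedNode,
    PublicSwarm,
}

struct RoutingState {
    local_available: bool,
    trusted_peers: usize,
    usdc_spent_this_month: f64, // hypothetical accounting field
    monthly_cap_usdc: f64,
}

fn pick_tier(s: &RoutingState) -> Option<TrustTier> {
    if s.local_available {
        Some(TrustTier::Local)
    } else if s.trusted_peers > 0 {
        Some(TrustTier::TrustedNode)
    } else if s.usdc_spent_this_month < s.monthly_cap_usdc {
        Some(TrustTier::PublicSwarm)
    } else {
        None // budget exhausted: refuse rather than overspend
    }
}

fn main() {
    let state = RoutingState {
        local_available: false,
        trusted_peers: 2,
        usdc_spent_this_month: 4.5,
        monthly_cap_usdc: 10.0,
    };
    println!("{:?}", pick_tier(&state)); // prints Some(TrustedNode)
}
```

The key design point this sketch captures is that the public swarm is a bounded safety net: once the cap is hit, requests fail rather than silently spend more.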


Section 04

Technical Implementation and Progress: Single Machine and LAN Cluster Support Already Available

Implemented features:

  • Single-machine inference (v0.0.1, wrapping llama.cpp; smoke-tested on M-series Macs)
  • LAN cluster (v0.0.2, request routing plus local/peer round-robin scheduling; verified end to end)
  • mDNS node discovery and pairing (v0.0.3, one-click pairing from the sidebar plus hot-reload routing)
  • Model management (v0.0.3, supports short names and pulling GGUF files by URL)

Under development: VRAM pooling and trusted-node pairing (v0.1.0), Public Swarm (v0.3.0), and verifiable inference (research phase).
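The v0.0.2 local/peer round-robin scheduling can be illustrated with a short sketch. The `RoundRobin` type here is an assumption for illustration, not Unhosted's real scheduler:

```rust
// Illustrative round-robin scheduler over the local node and LAN peers,
// loosely modeled on the v0.0.2 behavior described above.
struct RoundRobin {
    nodes: Vec<String>, // e.g. ["local", "peer-A", "peer-B"]; assumed non-empty
    next: usize,
}

impl RoundRobin {
    fn new(nodes: Vec<String>) -> Self {
        RoundRobin { nodes, next: 0 }
    }

    /// Return the node that should serve the next request,
    /// cycling through the list in order.
    fn pick(&mut self) -> &str {
        let i = self.next % self.nodes.len();
        self.next = (self.next + 1) % self.nodes.len();
        &self.nodes[i]
    }
}

fn main() {
    let mut rr = RoundRobin::new(vec!["local".into(), "peer-A".into()]);
    for _ in 0..4 {
        println!("{}", rr.pick()); // alternates: local, peer-A, local, peer-A
    }
}
```

In a real deployment the node list would presumably be rebuilt when mDNS discovery adds or removes a peer, which is where the hot-reload routing mentioned above would come in.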

Section 05

Use Cases: Covering Privacy-Sensitive and Edge Computing Needs

  • Privacy-sensitive users: run models such as Llama 70B fully offline to protect medical, legal, or business-confidential information.
  • Hardware enthusiasts: combine multiple devices (e.g., a MacBook plus an RTX 4090 desktop) to run larger models.
  • Edge computing: Raspberry Pi clusters running lightweight models for autonomous on-device inference.

Section 06

Open-Source Commitment: Transparent Development and AGPL License

Unhosted is licensed under AGPL-3.0: anyone may read, fork, audit, and deploy it, but hosting a closed-source fork as a paid service, or passing the project off as one's own work, is not permitted. The maintainers commit to being honest about capability boundaries (the current status is marked in the README) and to publishing reproducible benchmark data rather than marketing rhetoric.


Section 07

Summary and Outlook: A New Direction for Distributed AI Computing

Unhosted represents a shift in AI computing from centralized cloud services toward distributed, user-controlled architectures. Although still at an early stage, its technical roadmap is clear and its development posture is honest. For users who care about AI privacy and want to break free of cloud dependencies, it is a promising option, and it could grow into important infrastructure for local AI inference.