Zing Forum

Unhosted: An Open-Source Framework for Distributed AI Inference on Personal Hardware

Unhosted is an open-source project written in Rust that lets users run large language model inference on their own devices without relying on cloud APIs. The project proposes a three-tier trust radius architecture (Local Mode, Trusted Node Mode, and Public Swarm Mode) to deliver data privacy and computational autonomy.

Tags: Distributed Inference · Local AI · Rust · Privacy · Open Source · llama.cpp · Decentralization
Published 2026-05-13 02:41 · Recent activity 2026-05-13 02:50 · Estimated read: 5 min

Section 01

Unhosted: Introduction to the Distributed Inference Framework for Running AI on Personal Hardware

Unhosted is an open-source project written in Rust that allows users to run large language model inference on personal devices without relying on cloud APIs. Its core philosophy is "AI that lives where you do": a three-tier trust radius architecture (Local Mode, Trusted Node Mode, Public Swarm Mode) keeps data private and computation under the user's control.


Section 02

Project Background: Addressing Privacy and Dependency Issues of Cloud AI

Most AI inference today relies on centralized cloud services, which carries privacy risks and creates external dependencies. Unhosted was created to address this. It is developed in Rust (currently pre-alpha) and, unlike most local AI solutions, supports a distributed inference cluster architecture that combines multiple personal devices into a single unified inference endpoint.


Section 03

Three-Tier Trust Radius Architecture: Balancing Privacy and Computing Power Needs

Local Mode

Uses only the user's own devices; no internet connection is required, inference runs entirely locally, and data never leaves the user's physical control. Ideal for sensitive scenarios.

Trusted Node Mode

Extends to devices in the user's trust circle (e.g., a roommate's computer or a home server). Connections are end-to-end encrypted and incur no cost. Suitable for small teams or multi-device households.

Public Swarm Mode

Falls back to a public swarm of strangers' GPUs, billed per token in USDC with a monthly spending cap. Serves as a safety net supplementing the first two modes.
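The escalation logic across the three tiers can be sketched as follows. This is a minimal illustration only; the type and field names (`TrustTier`, `RoutingState`, `pick_tier`) are assumptions for this sketch, not Unhosted's actual API:

```rust
// Illustrative sketch of the three-tier trust radius: prefer the local
// device, then trusted peers, and fall back to the paid public swarm
// only while the monthly USDC budget allows it.
#[derive(Debug, PartialEq)]
enum TrustTier {
    Local,
    TrustedNode,
    PublicSwarm,
}

struct RoutingState {
    local_available: bool,
    trusted_peers: usize,
    usdc_spent_this_month: f64, // hypothetical accounting field
    monthly_cap_usdc: f64,
}

fn pick_tier(s: &RoutingState) -> Option<TrustTier> {
    if s.local_available {
        Some(TrustTier::Local)
    } else if s.trusted_peers > 0 {
        Some(TrustTier::TrustedNode)
    } else if s.usdc_spent_this_month < s.monthly_cap_usdc {
        Some(TrustTier::PublicSwarm)
    } else {
        None // budget exhausted: refuse rather than overspend
    }
}

fn main() {
    let state = RoutingState {
        local_available: false,
        trusted_peers: 2,
        usdc_spent_this_month: 4.5,
        monthly_cap_usdc: 10.0,
    };
    println!("{:?}", pick_tier(&state)); // prints Some(TrustedNode)
}
```

The key design point this sketch captures is that the public swarm is a bounded safety net: once the cap is hit, requests fail rather than silently spend more.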


Section 04

Technical Implementation and Progress: Single Machine and LAN Cluster Support Already Available

Implemented features:

  • Single-machine inference (v0.0.1, wrapping llama.cpp; smoke-tested on M-series Macs)
  • LAN cluster (v0.0.2, request routing plus local/peer round-robin scheduling; verified end to end)
  • mDNS node discovery and pairing (v0.0.3, one-click pairing from the sidebar plus hot-reload routing)
  • Model management (v0.0.3, supports short names and pulling GGUF files by URL)

Under development: VRAM pooling and trusted-node pairing (v0.1.0), Public Swarm (v0.3.0), and verifiable inference (research phase).
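The v0.0.2 local/peer round-robin scheduling can be illustrated with a short sketch. The `RoundRobin` type here is an assumption for illustration, not Unhosted's real scheduler:

```rust
// Illustrative round-robin scheduler over the local node and LAN peers,
// loosely modeled on the v0.0.2 behavior described above.
struct RoundRobin {
    nodes: Vec<String>, // e.g. ["local", "peer-A", "peer-B"]; assumed non-empty
    next: usize,
}

impl RoundRobin {
    fn new(nodes: Vec<String>) -> Self {
        RoundRobin { nodes, next: 0 }
    }

    /// Return the node that should serve the next request,
    /// cycling through the list in order.
    fn pick(&mut self) -> &str {
        let i = self.next % self.nodes.len();
        self.next = (self.next + 1) % self.nodes.len();
        &self.nodes[i]
    }
}

fn main() {
    let mut rr = RoundRobin::new(vec!["local".into(), "peer-A".into()]);
    for _ in 0..4 {
        println!("{}", rr.pick()); // alternates: local, peer-A, local, peer-A
    }
}
```

In a real deployment the node list would presumably be rebuilt when mDNS discovery adds or removes a peer, which is where the hot-reload routing mentioned above would come in.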

Section 05

Use Cases: Covering Privacy-Sensitive and Edge Computing Needs

  • Privacy-sensitive users: run models such as Llama 70B fully offline to protect medical, legal, or business-confidential information.
  • Hardware enthusiasts: combine multiple devices (e.g., a MacBook plus an RTX 4090 desktop) to run larger models.
  • Edge computing: Raspberry Pi clusters running lightweight models for autonomous on-device inference.

Section 06

Open-Source Commitment: Transparent Development and AGPL License

Unhosted is licensed under AGPL-3.0: anyone may read, fork, audit, and deploy it, but hosting a closed-source fork as a paid service, or passing the project off as one's own work, is not permitted. The maintainers commit to being honest about capability boundaries (the current status is marked in the README) and to publishing reproducible benchmark data rather than marketing rhetoric.


Section 07

Summary and Outlook: A New Direction for Distributed AI Computing

Unhosted represents a shift in AI computing from centralized cloud services toward distributed, user-controlled architectures. Although still at an early stage, its technical roadmap is clear and its development posture is honest. For users who care about AI privacy and want to break free of cloud dependencies, it is a promising option, and it could grow into important infrastructure for local AI inference.