Section 01
BloomBee: Decentralized LLM Inference & Fine-tuning System (Introduction)
BloomBee is a decentralized framework for offline LLM serving, built on peer-to-peer (P2P) networks. It combines tensor offloading, speculative decoding, and lossless compression so that ordinary GPUs can collaboratively run very large models.
Keywords: decentralized AI, LLM inference, P2P network, distributed training, GPU offloading, open-source models, BloomBee.
Source: GitHub repository of the ai-decentralized organization; related paper: arXiv:2604.21072 (published April 2026).