Section 01
Introduction: ConjFormer—A New Solution for Privacy-Preserving LLM Inference
To address the risk of privacy leakage during large language model (LLM) inference, the research team proposes the ConjFormer architecture. It combines orthogonal obfuscation with an O(d)-equivariant design, where O(d) denotes the orthogonal group in d dimensions. This reduces the token recovery rate from 35% to 1.3% without injecting noise or requiring re-encryption, which keeps privacy-preserving inference efficient and practical. The approach balances privacy and performance and offers a new path for cloud-based LLM inference.
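To make the idea of orthogonal obfuscation and equivariance concrete, the following is a minimal NumPy sketch, not ConjFormer's actual implementation. It assumes a single linear layer whose weight is conjugated by a secret orthogonal matrix Q. The names `random_orthogonal`, `X`, `W`, and `Q`, and the sizes used, are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Sample a d x d orthogonal matrix via QR of a Gaussian matrix."""
    a = rng.standard_normal((d, d))
    q, r = np.linalg.qr(a)
    # Flip column signs using diag(r); q stays orthogonal after this step.
    return q * np.sign(np.diag(r))

rng = np.random.default_rng(0)
d, n = 8, 4                       # hypothetical hidden size and token count
X = rng.standard_normal((n, d))   # hypothetical token embeddings, one per row
W = rng.standard_normal((d, d))   # hypothetical weight of one linear layer
Q = random_orthogonal(d, rng)     # secret rotation (assumed to stay with the client)

X_obf = X @ Q                     # obfuscated embeddings sent to the server
W_conj = Q.T @ W @ Q              # weight conjugated by Q (assumption of this sketch)

Y_plain = X @ W                   # what the unprotected model would compute
Y_obf = X_obf @ W_conj            # what the server computes on obfuscated data

# Equivariance: the obfuscated output is the plain output rotated by Q,
# because X Q Q^T W Q = X W Q when Q is orthogonal.
equivariant = np.allclose(Y_obf, Y_plain @ Q)

# Dot-product scores are unchanged: (X Q)(X Q)^T = X X^T.
scores_match = np.allclose(X_obf @ X_obf.T, X @ X.T)

print(equivariant, scores_match)
```

In this sketch the server never sees X or Q, and no noise is added: the obfuscated output is exactly the plain output rotated by Q, so the client can recover it by multiplying by the transpose of Q. Whether ConjFormer applies the transformation in exactly this way is not stated in the summary above.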