Section 01
SharedRequest: Core Guide to the Batch-Level Privacy-Preserving Inference Framework
SharedRequest is a batch-level privacy-preserving inference framework. Its core idea is to shift privacy protection from the level of the single prompt to the level of the batch, balancing privacy, utility, and efficiency. It does this with two mechanisms: semantic instruction grouping and batch-level privacy preservation. Together they cut query costs fivefold while protecting the privacy of user prompts. Because the framework does not modify the model architecture, it applies to a range of LLMs, including both closed-source APIs and open-source models.
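To make the grouping idea concrete, here is a minimal sketch of how requests that share an instruction might be collected into batches so that several users' inputs travel in one call. It is an illustration under stated assumptions, not SharedRequest's actual implementation: the names `Request`, `instruction_key`, and `group_into_batches` are hypothetical, the normalized string match is only a crude stand-in for semantic grouping, and the privacy mechanism itself is not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request record; not part of any SharedRequest API."""
    user_id: str
    instruction: str  # the task, e.g. "Summarize the text"
    content: str      # the user's private input


def instruction_key(instruction: str) -> str:
    """Assumed stand-in for semantic grouping: ignore case and extra whitespace."""
    return " ".join(instruction.lower().split())


def group_into_batches(requests, batch_size):
    """Group requests by instruction key, then split each group into batches."""
    groups = defaultdict(list)
    for req in requests:
        groups[instruction_key(req.instruction)].append(req)

    batches = []
    for key, members in groups.items():
        # Each slice becomes one batched call carrying several users' inputs.
        for start in range(0, len(members), batch_size):
            batches.append((key, members[start:start + batch_size]))
    return batches


requests = [
    Request("u1", "Summarize the text", "doc A"),
    Request("u2", "summarize  the text", "doc B"),
    Request("u3", "Translate to French", "hello"),
    Request("u4", "Summarize the text", "doc C"),
]
batches = group_into_batches(requests, batch_size=5)
print(len(requests), "requests ->", len(batches), "batched calls")
```

In this toy run, four individual requests collapse into two batched calls, which is the kind of amortization that can lower per-query cost. A real system would presumably replace the string match with an embedding-based similarity measure and add the batch-level privacy step on top.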