Zing Forum


FuseFSS: Efficient and Secure Large Language Model Inference Based on Function Secret Sharing

FuseFSS replaces operator-by-operator protocol design with a unified compilation pipeline, achieving 1.24-1.50x end-to-end acceleration while maintaining accuracy, and significantly reducing communication overhead and preprocessing costs.

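Tags: Large language models, Secure inference, Function secret sharing, Privacy-preserving computation, Secure multi-party computation, FSS, GPU acceleration, Fixed-point arithmetic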
Published 2026-06-08 22:30 | Recent activity 2026-06-09 10:51 | Estimated read 6 min

Section 01

[Introduction] FuseFSS: Core Innovations in Efficient and Secure LLM Inference Based on Function Secret Sharing

FuseFSS replaces operator-by-operator protocol design with a unified compilation pipeline, addressing the fragmentation of non-linear operations in secure inference systems based on function secret sharing (FSS). It achieves a 1.24-1.50x end-to-end speedup while maintaining accuracy, and it also reduces communication overhead and preprocessing costs. This article covers the background, the method, the measured performance, and the implementation details.


Section 02

Background: Challenges in Secure Inference and Current State of FSS Technology

Background of Privacy Computing

As LLM capabilities improve, two privacy requirements have become prominent at the same time: users' sensitive input data must be protected, and so must the model weights. The two-server secure inference architecture emerged in response, allowing the parties to collaborate on inference while each keeps its data private.

Current State of FSS Technology

FSS is a cryptographic primitive. In FSS-based secure inference, linear layers are handled efficiently, but fixed-point non-linear operations (such as ReLU and GELU) remain a performance bottleneck. Their design is fragmented: each operator has its own dedicated protocol, which leads to code duplication and makes optimization difficult.
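To make the FSS idea concrete, the following is a toy sketch of the contract that any FSS scheme offers: a function is split into two keys, and the two parties' local evaluations add up to the function value. It is not the construction used by FuseFSS or by any real library. The names `toy_gen` and `toy_eval`, the 8-bit input domain, and the 16-bit output ring are assumptions made for illustration, and the keys here are full truth tables, whereas practical FSS schemes use compact keys.

```python
import secrets

MOD = 2 ** 16  # output ring Z_{2^16}; an assumption for this toy


def relu_signed(x: int, bits: int = 8) -> int:
    """ReLU on a `bits`-bit two's-complement input, result reduced mod MOD."""
    signed = x - (1 << bits) if x >= 1 << (bits - 1) else x
    return max(signed, 0) % MOD


def toy_gen(f, domain_size: int):
    """Split f into two keys whose evaluations sum to f(x) mod MOD.

    Toy only: each key is a full table of additive shares, so key size
    grows linearly with the domain instead of staying compact.
    """
    key0 = [secrets.randbelow(MOD) for _ in range(domain_size)]
    key1 = [(f(x) - key0[x]) % MOD for x in range(domain_size)]
    return key0, key1


def toy_eval(key, x: int) -> int:
    """Local evaluation by one party: a single table read."""
    return key[x]


k0, k1 = toy_gen(relu_signed, 256)
x = 37
y = (toy_eval(k0, x) + toy_eval(k1, x)) % MOD  # reconstructs ReLU(37)
```

Each key on its own is uniformly random, so a single party learns nothing about the function from its key. In actual FSS-based protocols the parties also evaluate on a masked input rather than on `x` itself, which this sketch leaves out.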


Section 03

Methodology: Innovation of FuseFSS's Unified Compilation Pipeline

FuseFSS replaces operator-by-operator protocols with a unified compilation pipeline:

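  1. Core Design: Define a general operator description format in which each operator is given by an interval partition of its input range, low-degree arithmetic pieces over those intervals, and predicate bits;
  2. Compiler Output: The compiler lowers each description to two FSS building blocks:
    • Packed Comparison: Merges the comparisons against multiple interval boundaries to reduce communication rounds;
    • Vector Interval Lookup: An FSS-based secure table lookup over the intervals, used to streamline the arithmetic part of the operator.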
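The description format above can be pictured in plaintext as a piecewise function. The sketch below is only an illustration of that idea: `OperatorSpec`, its field names, and the reading of "low-degree arithmetic pieces" as one polynomial per interval are assumptions, not the format defined by FuseFSS. It shows boundary comparisons producing predicate bits, a lookup that selects the coefficient vector of the active interval, and a Horner evaluation of the selected piece.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorSpec:
    """Hypothetical operator description (not the FuseFSS format)."""
    boundaries: tuple  # sorted cut points b_1 < ... < b_k
    coeffs: tuple      # k + 1 coefficient tuples, lowest degree first


def predicate_bits(spec: OperatorSpec, x: float) -> list:
    """One bit per boundary: 1 if x >= b_i. Computing all of them in one
    pass is the plaintext analogue of a packed comparison."""
    return [int(x >= b) for b in spec.boundaries]


def interval_lookup(spec: OperatorSpec, x: float) -> tuple:
    """Return the coefficient vector of the interval that contains x."""
    return spec.coeffs[bisect_right(spec.boundaries, x)]


def evaluate(spec: OperatorSpec, x: float) -> float:
    """Evaluate the selected low-degree piece with Horner's rule."""
    acc = 0.0
    for c in reversed(interval_lookup(spec, x)):
        acc = acc * x + c
    return acc


# ReLU: 0 for x < 0, x for x >= 0.
RELU = OperatorSpec(boundaries=(0.0,), coeffs=((0.0, 0.0), (0.0, 1.0)))
# Hard tanh: -1 below -1, x in [-1, 1), +1 from 1 upward.
HARD_TANH = OperatorSpec(
    boundaries=(-1.0, 1.0),
    coeffs=((-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)),
)
```

With this shape, adding a new operator means writing a new `OperatorSpec` value rather than a new protocol, which is the maintainability argument the unified pipeline makes. The secure version would run the comparison and lookup under FSS instead of in the clear.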

Section 04

Evidence: Quantitative Analysis of FuseFSS's Performance Improvement

Experimental results show:

  • End-to-end Acceleration: 1.24-1.50x (accuracy maintained);
  • Communication Overhead: Online communication volume reduced by 9%-16%;
  • Preprocessing Optimization: Key generation time reduced by 14%-23%, key size shrunk by 20%-24%.
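A speedup factor and a percentage reduction are easy to confuse, so the short calculation below converts the reported speedup range into the equivalent reduction in running time. It assumes that "acceleration" means baseline time divided by FuseFSS time, which the source does not state explicitly; `latency_reduction` is a helper name chosen for this example.

```python
def latency_reduction(speedup: float) -> float:
    """Fraction of the baseline running time saved at a given speedup."""
    return 1.0 - 1.0 / speedup


for s in (1.24, 1.50):
    print(f"{s:.2f}x speedup -> {latency_reduction(s):.1%} less time")
```

Under that assumption, the 1.24-1.50x range corresponds to roughly 19% to 33% less end-to-end time, which puts it on the same scale as the percentage figures for communication and preprocessing.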

Section 05

Technical Implementation Details: Fixed-Point and Batch Processing Optimization

Fixed-Point Operation Handling

Fixed-point values are mapped to integer operations, and the interval partitioning and coefficients are chosen to balance accuracy against overhead.
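As a reminder of what mapping fixed-point values to integer operations involves, here is a minimal plaintext sketch. The 12 fractional bits, the 64-bit ring, and the function names are assumptions for illustration and are not taken from FuseFSS. In a secure protocol the truncation step after multiplication is itself a protocol rather than a plain shift.

```python
FRAC_BITS = 12            # assumed fractional precision
RING_BITS = 64            # assumed ring size
MOD = 1 << RING_BITS


def encode(x: float) -> int:
    """Scale by 2^FRAC_BITS and wrap into the ring Z_{2^64}."""
    return round(x * (1 << FRAC_BITS)) % MOD


def to_signed(v: int) -> int:
    """Interpret a ring element as a two's-complement signed integer."""
    return v - MOD if v >= MOD >> 1 else v


def decode(v: int) -> float:
    """Inverse of encode, up to rounding error."""
    return to_signed(v) / (1 << FRAC_BITS)


def fx_mul(a: int, b: int) -> int:
    """Multiply two encodings, then drop the extra FRAC_BITS of scale."""
    return ((to_signed(a) * to_signed(b)) >> FRAC_BITS) % MOD


product = decode(fx_mul(encode(1.5), encode(-2.0)))
```

More fractional bits reduce rounding error but consume range in the ring, which is one side of the accuracy versus overhead balance described above.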

Batch Processing Strategy

FuseFSS automatically packs multi-element operations so that the fixed cost of an FSS evaluation is amortized across the elements.
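The effect of amortization can be seen with a simple cost model. The model and the numbers used with it are purely illustrative assumptions, not measurements from FuseFSS: an evaluation is assumed to cost a fixed amount per call plus a marginal amount per element.

```python
def per_element_cost(fixed: float, marginal: float, batch: int) -> float:
    """Average cost per element when `batch` elements share one fixed cost."""
    if batch < 1:
        raise ValueError("batch must be at least 1")
    return fixed / batch + marginal


# Illustrative units only: fixed cost 100, marginal cost 2 per element.
unbatched = per_element_cost(100.0, 2.0, 1)
batched = per_element_cost(100.0, 2.0, 50)
```

As the batch grows, the per-element cost approaches the marginal cost, which is why packing pays off most for operators applied elementwise to large tensors.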

Compatibility

The generated FSS evaluations can be integrated into existing FSS libraries without rewriting the underlying cryptographic implementation.


Section 06

Application Scenarios: Privacy Protection and Cross-Organization Collaboration

  1. Privacy-Preserving Inference Services: Suitable for sensitive fields such as healthcare and finance;
  2. Model-as-a-Service (MaaS) Enhancement: Protect intellectual property rights of model weights;
  3. Cross-Organization Collaboration: Support scenarios like joint risk control and cross-institutional medical research.

Section 07

Limitations and Future Work Directions

Current Limitations

  • Limited operator coverage (mainly for common activation functions);
  • Experiments focused on BERT/GPT-style models; ultra-large-scale models need exploration;
  • GPU optimization is not directly applicable to other accelerators;

Future Directions

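Future directions include expanding operator and model-architecture coverage, exploring hybrid solutions that combine FSS with trusted execution environments (TEEs), building tools for tuning the accuracy-performance trade-off, and supporting dynamic model updates.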


Section 08

Conclusion: Significance and Prospects of FuseFSS

FuseFSS addresses the fragmentation problem of FSS-based secure inference through a unified compilation pipeline, delivering a measurable performance improvement along with a more scalable and maintainable architecture. As privacy-preserving computation grows in importance, it offers infrastructure for building trusted AI systems and may help bring more privacy-preserving LLM applications into practice.