As Machine Learning as a Service (MLaaS) has become widespread, a growing number of enterprises and research institutions expose model inference through APIs. This delivery model is convenient, but it also introduces a new security threat: model extraction attacks, also known as model stealing.
An attacker can collect input-output pairs by sending a large number of queries to the target API, then use those pairs to train a substitute model that approximates the target's behavior. The harms of such attacks include:
- Intellectual Property Loss: The model itself may be a core competitive asset of the enterprise
- Privacy Leakage Risk: The model may encode sensitive information from training data
- Adversarial Example Transfer: A stolen model can be used to craft adversarial examples that are then used to attack the original service
- Bypassing Security Restrictions: Attackers can test and refine attack strategies against a local copy instead of the protected service
Therefore, developing effective mechanisms for detecting and defending against model stealing is crucial to the security of machine learning systems.
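
The extraction loop described above (query the API, record the input-output pairs, fit a substitute) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a real attack tool: `target_api` is a hypothetical stand-in for a remote endpoint, implemented here as a hidden linear classifier; the substitute is a plain perceptron; and every function name and parameter is invented for the example.

```python
import random


def target_api(x):
    """Hypothetical stand-in for a remote MLaaS endpoint.

    The attacker only sees the returned label, never the weights below.
    """
    w, b = (2.0, -1.0), 0.5
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def collect_pairs(n_queries, rng):
    """Query the target with random inputs and record (input, label) pairs."""
    pairs = []
    for _ in range(n_queries):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        pairs.append((x, target_api(x)))
    return pairs


def predict(model, x):
    """Label an input with the substitute model (weights, bias)."""
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_substitute(pairs, epochs=50, lr=0.1):
    """Fit a perceptron to the collected pairs (assumed substitute model)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in pairs:
            err = y - predict((w, b), x)  # -1, 0, or +1
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b


def agreement(model, n_samples, rng):
    """Fraction of fresh random inputs on which substitute and target agree."""
    hits = 0
    for _ in range(n_samples):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        hits += int(predict(model, x) == target_api(x))
    return hits / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    stolen = train_substitute(collect_pairs(500, rng))
    print(f"agreement with target: {agreement(stolen, 1000, rng):.3f}")
```

Agreement on fresh inputs is one common way to judge how closely a substitute tracks its target. In this toy setting a few hundred queries are enough because the target is linear; real models generally require far more queries.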