Section 01
[Introduction] Reproduction of the MIN-K% Prob Method: Detecting Large Language Model Pre-training Data via Membership Inference
This article presents a full reproduction and extended analysis of the MIN-K% Prob method from ICLR 2024. It confirms that the method is effective for membership inference and finds that model size and text length both have a significant impact on detection quality. The work addresses privacy and security concerns around large language model pre-training data: using only black-box access to the model's token probabilities, the method decides whether a given text was part of the training set, which makes it a practical tool for model auditing and privacy protection.
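To make the idea concrete, here is a minimal sketch of the scoring step, assuming the per-token log-probabilities of a text have already been obtained from the model. As the method is usually described, the score is the average log-probability of the k% least likely tokens; a text the model saw during training tends to contain fewer very unlikely tokens, so its score is higher. The function name `min_k_prob`, the default `k=0.2`, and the sample values are illustrative assumptions, not taken from the reproduction itself.

```python
def min_k_prob(token_log_probs, k=0.2):
    """Average log-probability of the k fraction of least likely tokens.

    token_log_probs: one log-probability per token, as returned by the model.
    k: fraction of tokens to keep (0.2 is an assumed default, i.e. 20%).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    if not 0 < k <= 1:
        raise ValueError("k must be in (0, 1]")
    # Keep at least one token so that short texts still get a score.
    n = max(1, int(len(token_log_probs) * k))
    lowest = sorted(token_log_probs)[:n]
    return sum(lowest) / n


# Made-up log-probabilities for illustration only.
likely_seen = [-0.1, -0.3, -0.2, -0.5, -0.4, -0.1, -0.2, -0.6, -0.3, -0.2]
likely_unseen = [-0.1, -0.3, -4.2, -0.5, -6.1, -0.1, -0.2, -3.8, -0.3, -0.2]

score_seen = min_k_prob(likely_seen)
score_unseen = min_k_prob(likely_unseen)
```

In practice the score would be compared against a threshold chosen on held-out data to decide membership; how that threshold is set is outside the scope of this sketch.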