Section 01
LLM Jailbreak Research Guide: A Security Exploration of Adversarial Prompting and Jailbreak Attacks
This guide examines adversarial prompting and jailbreak attacks against large language models (LLMs), systematically exploring their security boundaries and defense mechanisms. It covers red teaming, safety alignment evaluation, and iterative defense, with the aim of making LLMs more secure and robust by 'using offense to promote defense'.