Section 01
Subtext Benchmark: Challenges and Evaluation of LLMs in Identifying Misogynistic Content
Subtext is an open-source benchmark built on the Inspect AI framework that evaluates how well large language models (LLMs) detect misogynistic content. The project highlights the difficulty of identifying implicit bias in AI content moderation, offers a reference point for improving moderation systems, and supports the development of responsible AI.
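
As a rough illustration of what an Inspect AI evaluation task for this kind of benchmark looks like, the sketch below defines a minimal classification task. The samples, prompt wording, and task name are placeholders for illustration only and are not the actual Subtext dataset or prompts.

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import match
from inspect_ai.solver import generate, system_message

# Illustrative samples only; the real Subtext benchmark uses its own dataset.
SAMPLES = [
    Sample(
        input=(
            "Classify the following comment as MISOGYNISTIC or NOT_MISOGYNISTIC:\n"
            "'Women just aren't built for leadership.'"
        ),
        target="MISOGYNISTIC",
    ),
    Sample(
        input=(
            "Classify the following comment as MISOGYNISTIC or NOT_MISOGYNISTIC:\n"
            "'The meeting starts at 10am tomorrow.'"
        ),
        target="NOT_MISOGYNISTIC",
    ),
]

@task
def misogyny_detection():
    # A Task bundles the dataset, the solver chain (how the model is prompted),
    # and the scorer (how its answers are graded).
    return Task(
        dataset=SAMPLES,
        solver=[
            system_message("You are a content moderation assistant."),
            generate(),
        ],
        scorer=match(),  # checks whether the target label appears in the output
    )
```

Such a task could then be run against a model from the command line, for example `inspect eval misogyny_detection.py --model openai/gpt-4o`; the actual Subtext task definitions, solvers, and scorers may differ.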