JiraiBench: A Bilingual Large Model Evaluation Benchmark for Self-Harm Behavior Detection in Jirai Subculture Communities

JiraiBench is the first bilingual evaluation benchmark specifically designed for detecting self-harm content in Jirai subculture communities, providing a standardized test set to assess the ability of large language models to identify potential mental health risk content.

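Tags: large language models, self-harm behavior detection, Jirai (地雷系), mental health, content moderation, bilingual evaluation, subculture, AI ethics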

Published 2026-04-13 12:14 | Recent activity 2026-04-13 12:20 | Estimated read 7 min

Section 01

Introduction: JiraiBench, the First Bilingual Evaluation Benchmark for Self-Harm Behavior Detection in Jirai Subculture Communities

JiraiBench is the first bilingual (Chinese and Japanese) evaluation benchmark built specifically for detecting self-harm content in Jirai subculture communities. It provides a standardized test set for assessing how well large language models identify content that signals potential mental health risk. Until now, neither traditional moderation systems nor existing large models have had a systematic evaluation standard in this field, and JiraiBench is meant to fill that gap.

Section 02

Background and Motivation: Content Moderation Challenges Brought by Jirai Subculture

In recent years, the "Jirai" subculture, which originated in Japan, has spread rapidly among young people in East Asia. Its dark, decadent aesthetic is often accompanied by expressions of self-harm and depression. As the related communities have grown, identifying potential self-harm content has become an important issue for both mental health intervention and platform governance. Traditional moderation systems struggle to recognize such implicit, context-dependent expressions, and there has been no systematic standard for evaluating how well large models detect them given the subculture's distinctive language style and cultural background. The JiraiBench project was created to address this.

Section 03

Project Overview: Core Positioning and Goals of JiraiBench

JiraiBench is a bilingual (Chinese-Japanese) evaluation benchmark dataset. Its samples are collected from real social media and professionally annotated, and they cover the various forms of expression found in Jirai culture (implicit hints, direct statements, subcultural terms, etc.). Its core goal is to establish a standardized testing framework that helps researchers and developers understand how large models perform on sensitive content, identify their blind spots, and promote the development of precise, culturally sensitive content detection technologies.

Section 04

Dataset Features: Bilingual, Real-Scene, and Culturally Sensitive Annotation Design

The main features of the JiraiBench dataset are:

Bilingual Coverage: Includes Chinese and Japanese samples, reflecting how the subculture spreads across languages and allowing cross-language transfer to be tested;
Real-Scene Data: Collected from real social platforms, retaining original language styles, internet slang, and subcultural expressions;
Fine-Grained Annotation: Annotates dimensions such as whether the content contains self-harm behavior, its severity, and the directness of expression;
Cultural Context Sensitivity: Distinguishes purely stylistic expressions from real risk signals, avoiding the misjudgments that simple keyword matching produces.
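
To make the annotation dimensions above concrete, the following is a minimal sketch of how one annotated sample could be represented. The article does not give the actual JiraiBench schema, so every name here (`AnnotatedSample`, `Language`, `Directness`, the field names) and the 0 to 3 severity scale are hypothetical and only illustrate the dimensions listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Language(Enum):
    CHINESE = "zh"
    JAPANESE = "ja"


class Directness(Enum):
    DIRECT = "direct"      # explicit statement
    IMPLICIT = "implicit"  # hint, metaphor, or subcultural term


@dataclass(frozen=True)
class AnnotatedSample:
    """One annotated post (hypothetical schema, not the real JiraiBench format)."""
    text: str
    language: Language
    contains_self_harm: bool
    severity: int                      # assumed scale: 0 = none ... 3 = severe
    directness: Optional[Directness]   # None when no self-harm is annotated

    def __post_init__(self) -> None:
        # Reject label combinations that contradict each other.
        if not 0 <= self.severity <= 3:
            raise ValueError("severity must be between 0 and 3")
        if not self.contains_self_harm and self.severity != 0:
            raise ValueError("severity must be 0 when no self-harm is annotated")


# A purely stylistic post and a real risk signal can share vocabulary,
# so the label lives in the annotation rather than in any keyword.
stylistic = AnnotatedSample("<post text>", Language.JAPANESE, False, 0, None)
at_risk = AnnotatedSample("<post text>", Language.CHINESE, True, 2, Directness.IMPLICIT)
```

Keeping the binary label, severity, and directness as separate fields lets an evaluation slice results by any one of them, for example recall on implicit expressions only.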

Section 05

Evaluation Methodology: Multi-Dimensional Assessment of Model Capabilities

JiraiBench adopts a multi-dimensional evaluation framework, focusing on:

Balance Between Recall and Precision: Weigh the consequences of missed detections (false negatives) and false alarms (false positives);
Cross-Language Consistency: Evaluate the consistency of model performance on Chinese and Japanese samples;
Implicit Expression Recognition: Test the model's understanding of metaphorical and symbolic self-harm content;
Cultural Adaptability: Examine the degree of understanding of Jirai-specific terms, symbols, and cultural backgrounds.
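
The first two dimensions above can be sketched with standard binary classification metrics. This is an illustration under assumptions, not the benchmark's published scoring code: the record layout `(language, gold, predicted)`, the function names, and the use of an F1 gap as a cross-language consistency measure are all hypothetical choices made for this example.

```python
from collections import defaultdict


def precision_recall_f1(pairs):
    """Compute precision, recall, and F1 from (gold, predicted) boolean pairs."""
    tp = fp = fn = 0
    for gold, predicted in pairs:
        if gold and predicted:
            tp += 1   # risk content correctly flagged
        elif predicted:
            fp += 1   # false alarm
        elif gold:
            fn += 1   # missed detection
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def scores_by_language(records):
    """Group (language, gold, predicted) records and score each language."""
    grouped = defaultdict(list)
    for language, gold, predicted in records:
        grouped[language].append((gold, predicted))
    return {lang: precision_recall_f1(pairs) for lang, pairs in grouped.items()}


def f1_gap(scores):
    """Assumed consistency measure: largest F1 difference between languages."""
    f1_values = [f1 for _, _, f1 in scores.values()]
    return max(f1_values) - min(f1_values)


# Toy labels for illustration only; these are not JiraiBench results.
toy_records = [
    ("zh", True, True), ("zh", True, False), ("zh", False, False), ("zh", False, True),
    ("ja", True, True), ("ja", True, True), ("ja", False, False), ("ja", False, True),
]
toy_scores = scores_by_language(toy_records)
for lang, (p, r, f1) in toy_scores.items():
    print(f"{lang}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"F1 gap across languages: {f1_gap(toy_scores):.2f}")
```

A small gap suggests the model behaves similarly on Chinese and Japanese samples. Because a missed detection and a false alarm carry different costs in this setting, precision and recall are reported separately rather than only as F1.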

Section 06

Application Value: Significance for Academia, Industry, and Social Welfare

The release of JiraiBench is valuable to several groups:

Academic Research: Provides a standardized tool for interdisciplinary research between mental health and NLP, promoting reproducible research;
Industry: Serves as a test set for content safety systems, helping platforms optimize Jirai content moderation strategies;
Model Developers: Offers a capability diagnosis tool to guide model optimization;
Social Welfare: Improves the accuracy of risk content identification, providing earlier intervention opportunities for young people in psychological distress.

Section 07

Limitations and Future Directions: Paths for Continuous Optimization

JiraiBench has two main limitations. It mainly covers Chinese and Japanese contexts, so its applicability to other languages remains to be verified. Jirai culture also keeps evolving, so the timeliness of the dataset needs ongoing attention. Future directions include expanding language coverage, establishing a dynamic update mechanism, developing fine-grained risk assessment models, and exploring human-machine collaborative moderation.
