Section 01
[Introduction] An Expert Evaluation Study of LLM Open-Ended Legal Reasoning Capabilities from the Perspective of the Japanese Bar Exam
This study constructs the first dataset for evaluating open-ended LLM reasoning in the Japanese legal domain, using the writing tasks of the Japanese Bar Exam as its setting. Manual evaluation by legal experts reveals two kinds of weakness in current large language models: limitations in legal reasoning, such as incomplete identification of the issues and loosely structured arguments, and hallucinations, such as fabricated precedents and incorrectly cited legal provisions. The study fills a gap in the evaluation of AI capabilities across legal traditions and provides a reference for the safe and reliable development of legal AI.