
COLLT: A Clarification-Driven Tool Learning Framework for Legal Large Language Models

COLLT is a clarification-oriented tool learning framework designed specifically for Chinese online legal services. It addresses the issue of reduced answer quality caused by incomplete information in user legal consultations through an intelligent clarification mechanism combined with six professional legal tools.

Tags: Legal AI, large language models, tool learning, clarification mechanism, legal retrieval, COLLT, Lawformer, Chinese law

Published 2026-05-22 00:40 | Recent activity 2026-05-22 00:49 | Estimated read 8 min

Section 01

[Introduction] COLLT Framework: An Intelligent Tool Learning Solution to Address Information Gaps in Legal Consultations

COLLT is a clarification-oriented tool learning framework designed specifically for Chinese online legal services, focusing on solving the problem of reduced answer quality due to incomplete information in user legal consultations. The framework combines an intelligent clarification mechanism (using <CLR> for clarification and <DRT> for direct response) with six professional legal tools, enabling large models to identify information gaps and proactively clarify. It also supports mainstream Chinese large models like ChatGLM3-6B, and all datasets and code are open-sourced.

Section 02

Background: Information Gaps in Legal Consultations and the Birth of COLLT

In Chinese online legal service scenarios, users often ask vague or incomplete questions (e.g., asking only 'What should I do if I'm beaten?' without key information such as injury status or location), which makes it difficult for traditional large models to give accurate, legally grounded answers. The COLLT framework was developed for these Chinese legal scenarios and aims to teach large models to judge when to 'ask clearly before answering'.

Section 03

Core Mechanism: Decision Logic for Intelligent Clarification and Direct Response

COLLT introduces two key action markers:

<CLR> (Clarification): Proactively initiates a clarification dialogue when key information is missing
<DRT> (Direct Response): Directly enters the tool retrieval and response process when information is sufficient

This decision mechanism is implemented through supervised fine-tuning, which requires the model to learn how much information each legal domain (criminal, civil, labor, etc.) needs before a question can be answered.
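The branching on the two markers can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the fine-tuned model emits `<CLR>` or `<DRT>` at the very start of its output, and the names `Decision` and `parse_decision` are hypothetical.

```python
from dataclasses import dataclass

# Marker strings as named in the text.
CLR, DRT = "<CLR>", "<DRT>"


@dataclass
class Decision:
    action: str  # "clarify" or "direct"
    text: str    # text that follows the marker


def parse_decision(model_output: str) -> Decision:
    """Split a model output into its leading action marker and the rest.

    Assumption: the marker is the first token of the output.
    """
    stripped = model_output.strip()
    if stripped.startswith(CLR):
        return Decision("clarify", stripped[len(CLR):].strip())
    if stripped.startswith(DRT):
        return Decision("direct", stripped[len(DRT):].strip())
    raise ValueError("output does not start with <CLR> or <DRT>")
```

A caller would send the text of a "clarify" decision back to the user as a follow-up question, and hand a "direct" decision to the tool retrieval step.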

Section 04

Six Legal Tools Matrix: Covering Full-Spectrum Legal Information Retrieval

COLLT integrates six professional legal tools based on Lawformer:

Legal Article Retrieval (T_LAS): Automatically retrieves applicable legal articles; training data comes from a subset of DISC-Law-SFT (excluding CAIL2018 to prevent data leakage)
Legal Charge Prediction (T_LCP): Predicts criminal charges based on case details; integrates relevant data from DISC-Law-SFT
Similar Case Retrieval (T_SCR): Retrieves similar cases from the case database; uses CAIL2019-SCM data
Legal Element Recognition (T_LER): Extracts key legal elements (e.g., 'emotional breakdown' in divorce cases); based on the CAIL2019 element extraction dataset (62 labels)
Legal Event Detection (T_LED): Identifies key legal events and their sequence; uses the LEVEN dataset
Internet Search (T_NET): Calls the Bing API to obtain real-time legal dynamics

These tools form a complete legal knowledge retrieval and reasoning system.
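For reference, the six tools listed above can be written down as a small registry. This is only a sketch: the tool identifiers, names, and data sources come from the text, while `ToolSpec`, `TOOLS`, and `lookup` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ToolSpec:
    tool_id: str
    name: str
    source: str  # training data or backend named in the text


TOOLS: Dict[str, ToolSpec] = {
    spec.tool_id: spec
    for spec in [
        ToolSpec("T_LAS", "Legal Article Retrieval", "DISC-Law-SFT subset (CAIL2018 excluded)"),
        ToolSpec("T_LCP", "Legal Charge Prediction", "DISC-Law-SFT"),
        ToolSpec("T_SCR", "Similar Case Retrieval", "CAIL2019-SCM"),
        ToolSpec("T_LER", "Legal Element Recognition", "CAIL2019 element extraction (62 labels)"),
        ToolSpec("T_LED", "Legal Event Detection", "LEVEN"),
        ToolSpec("T_NET", "Internet Search", "Bing API"),
    ]
}


def lookup(tool_id: str) -> ToolSpec:
    """Return the spec for a tool id, failing loudly on unknown ids."""
    if tool_id not in TOOLS:
        raise KeyError(f"unknown tool: {tool_id}")
    return TOOLS[tool_id]
```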

Section 05

Budget Control: A Constraint Mechanism to Balance Answer Quality and Response Efficiency

COLLT designs a budget control mechanism: each dialogue round can trigger at most two tool calls (|τ| ≤ 2), based on three considerations:

Latency control: Reduce response time and improve user experience
Prevent excessive retrieval: Avoid the model falling into meaningless retrieval loops
Focus on key information: Force the model to select the optimal tools within a limited budget

Tool call results are incorporated into the final answer via the <ER> marker, completing the retrieval-augmented generation chain.
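A minimal sketch of the budget constraint, under stated assumptions: requests beyond the budget are simply dropped (the source does not say whether extra calls are truncated or rejected), and the exact layout of the `<ER>` block is invented here for illustration. `run_tools` and the stub tools are hypothetical.

```python
from typing import Callable, Dict, List

MAX_TOOL_CALLS = 2  # |τ| ≤ 2 per dialogue round, as stated in the text


def run_tools(
    requested: List[str],
    tools: Dict[str, Callable[[str], str]],
    query: str,
    budget: int = MAX_TOOL_CALLS,
) -> str:
    """Run at most `budget` of the requested tools and wrap the results.

    Assumption: calls past the budget are truncated, and results are
    joined line by line after an <ER> marker.
    """
    selected = requested[:budget]
    results = [f"[{name}] {tools[name](query)}" for name in selected]
    return "<ER> " + "\n".join(results)


# Usage with stub tools standing in for the real retrievers.
stub_tools = {
    "T_LAS": lambda q: "article A",
    "T_SCR": lambda q: "case B",
    "T_NET": lambda q: "news C",
}
evidence = run_tools(["T_LAS", "T_SCR", "T_NET"], stub_tools, "wage dispute")
```

In this example the third request (`T_NET`) is never executed, so a round costs at most two tool latencies.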

Section 06

Multi-Model Adaptation and Training Process

The research team used 4-bit QLoRA (via the unsloth framework) to adapt five mainstream Chinese large models: ChatGLM3-6B → COLLT-GLM, LLaMa-3-8B → COLLT-LLaMa, InternLM3-8B → COLLT-InternLM, Qwen2.5-7B → COLLT-Qwen, and Baichuan2-7B → COLLT-Baichuan. Training data construction is divided into three stages:

Extract 11,533 real legal consultation seed samples from DISC-Law-SFT
Perform ambiguity annotation using the DeepSeek model to generate annot_ambig.jsonl
Annotate tool usage to build the collt_sft.jsonl training corpus
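The three stages above can be sketched as a small pipeline. Everything structural here is an assumption: the record fields (`question`, `action`, `tools`), the rule that ambiguous questions carry no tool calls, and the function names are hypothetical, and the two annotators are passed in as plain callables in place of the DeepSeek-based labeling.

```python
import json
from typing import Callable, Dict, Iterable, List


def build_sft_records(
    seeds: Iterable[str],
    label_ambiguity: Callable[[str], bool],
    label_tools: Callable[[str], List[str]],
) -> List[Dict]:
    """Turn seed questions into SFT records (hypothetical schema)."""
    records = []
    for question in seeds:                      # stage 1: seed questions
        ambiguous = label_ambiguity(question)   # stage 2: ambiguity annotation
        tools = label_tools(question)[:2]       # stage 3: tool annotation, |τ| ≤ 2
        records.append({
            "question": question,
            "action": "<CLR>" if ambiguous else "<DRT>",
            # Assumption: tools are only attached to direct responses.
            "tools": [] if ambiguous else tools,
        })
    return records


def to_jsonl(records: List[Dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Setting `ensure_ascii=False` keeps Chinese text readable in the output file instead of escaping it.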

Section 07

Evaluation System and Data Open-Sourcing: Verifying the Framework's Effectiveness

COLLT builds a complete evaluation system:

AmbigLegalQA Evaluation Set: 5,181 test samples covering 0-4 rounds of clarification dialogues, evaluating trigger accuracy (trigger-F1), coverage, and ROUGE-L metrics
LawBench Zero-Shot Evaluation: Tests the model's comprehensive capabilities across 9 legal tasks (legal article prediction, charge prediction, sentence prediction, etc.)

The project open-sources all resources: training corpus collt_sft.jsonl (11,528 entries), evaluation benchmark ambiglegalqa.jsonl (5,181 entries), tool training data, and end-to-end code.
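The source does not define trigger-F1 precisely. One plausible reading, sketched here as an assumption, is a binary F1 score over per-turn decisions with `<CLR>` as the positive class; `trigger_f1` is a hypothetical helper, not the project's evaluation script.

```python
from typing import Sequence


def trigger_f1(gold: Sequence[str], pred: Sequence[str], positive: str = "<CLR>") -> float:
    """Binary F1 over clarification triggers (assumed definition)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct triggers: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Under this reading, a model that always clarifies gets perfect recall but low precision, so the score penalizes both over-asking and under-asking.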

Section 08

Practical Significance and Future Outlook: The Implementation Paradigm of Legal AI

COLLT provides an important paradigm for the implementation of legal AI:

Clarification Priority: Balances clarification against direct response, so the model avoids giving confident-sounding but unfounded answers
Tool Collaboration: The six tools form a complementary knowledge retrieval network, covering legal information from legal articles to cases
Budget Constraint: Balances answer quality and response latency by capping the number of tool calls per round

The framework is applicable to scenarios such as online legal consultation, contract review, and compliance checks. An open challenge for future work is the trade-off between 'answering as much as possible' and 'ensuring answer accuracy'.
