Consensus Mechanism-Based Agent Framework: Using Large Language Models to Solve Customs Code Classification Challenges

The research team proposes a multi-agent LLM framework that tackles Harmonized Tariff Schedule (HTS) code classification through semantic retrieval, evidence-based reasoning, and consensus validation. An evaluation on 3,300 real-world data entries supports the case for human-machine collaboration.

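Tags: HTS codes, customs classification, agent framework, large language models, consensus mechanism, semantic retrieval, human-machine collaboration, smart ports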
Published 2026-06-16 01:24 | Recent activity 2026-06-16 12:53 | Estimated read 6 min

Section 01

[Introduction] Consensus Mechanism-Based Agent Framework: Solving Customs Code Classification Challenges

The research team proposes a multi-agent LLM framework that tackles Harmonized Tariff Schedule (HTS) code classification through semantic retrieval, evidence-based reasoning, and consensus validation. An evaluation on 3,300 real-world data entries supports the case for human-machine collaboration. The source is an arXiv preprint (published June 15, 2026) by the Analytics-Everywhere-Lab team.

Section 02

Problem Background: Why is HTS Code Classification So Challenging?

A Harmonized Tariff Schedule (HTS) is a commodity classification system used to set tariffs and support trade regulation. National schedules build on the international Harmonized System (HS), whose first six digits are shared across countries, and add country-specific digits for finer distinctions. Classification faces several obstacles: product descriptions often lack key details, the code hierarchy is deep and complex (e.g., Canada's 10-digit structure), and rules vary across jurisdictions. Traditional machine learning struggles with the deep domain knowledge the task requires, and black-box predictions from an LLM alone do not meet the business need for precise, interpretable decisions.
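To make the hierarchy concrete, here is a minimal Python sketch that splits a 10-digit code into nested prefixes. The 2/4/6/8/10 split and the level names (`subheading`, `tariff_item`, `statistical_suffix`) are assumptions chosen for illustration, not details taken from the paper, and the sample code string is only an example input.

```python
from typing import Dict

# Assumed prefix length for each level of a 10-digit code.
# This layout is an illustration, not the paper's specification.
LEVEL_PREFIX_LENGTHS: Dict[str, int] = {
    "chapter": 2,
    "heading": 4,
    "subheading": 6,
    "tariff_item": 8,
    "statistical_suffix": 10,
}


def split_code(code: str) -> Dict[str, str]:
    """Return the cumulative prefix of a 10-digit code at every level."""
    # Drop separators such as dots or spaces before validating.
    digits = "".join(ch for ch in code if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {code!r}")
    return {level: digits[:n] for level, n in LEVEL_PREFIX_LENGTHS.items()}
```

For example, `split_code("8482.10.00.10")` yields `"84"` for the chapter, `"8482"` for the heading, and the full `"8482100010"` at the last level. Each deeper level narrows the previous one, which is why an error near the top of the hierarchy invalidates everything below it.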

Section 03

Framework Design: Core Components of Multi-Agent Collaboration

The framework decomposes the classification task into subtasks completed collaboratively by agents, with core components including:

  1. Multi-agent information retrieval: Agents collect information from multiple sources, such as official documents and historical cases;
  2. Semantic retrieval: Retrieval matches on the meaning of a query rather than its keywords, e.g., identifying special regulations for medical device bearings;
  3. Evidence-supported reasoning: Each decision must cite its basis, such as legal clauses and annotations;
  4. Consensus validation and hierarchical voting: Agents vote separately at each HTS level (chapter, heading, etc.), and only decisions that reach consensus are accepted;
  5. Confidence estimation: Manual review is triggered when confidence falls below a threshold, balancing automation against human oversight.
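Components 4 and 5 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the level layout, the `min_agreement` threshold, the use of vote share as a stand-in for confidence, and the function names `hierarchical_vote` and `route` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Assumed (level name, prefix length) pairs for a 10-digit code.
LEVELS: List[Tuple[str, int]] = [
    ("chapter", 2), ("heading", 4), ("subheading", 6),
    ("tariff_item", 8), ("suffix", 10),
]


def hierarchical_vote(
    proposals: List[str], min_agreement: float = 0.6
) -> Tuple[str, Dict[str, float], Optional[str]]:
    """Vote level by level over the agents' proposed 10-digit codes.

    Returns the longest prefix that reached consensus, the winning vote
    share at each level examined, and the first level where consensus
    failed (None if every level passed).
    """
    if not proposals:
        raise ValueError("at least one proposal is required")
    agreed = ""
    support: Dict[str, float] = {}
    candidates = list(proposals)
    for level, n in LEVELS:
        counts = Counter(code[:n] for code in candidates)
        prefix, votes = counts.most_common(1)[0]
        # Share is measured against all agents, not only the survivors.
        share = votes / len(proposals)
        support[level] = share
        if share < min_agreement:
            return agreed, support, level
        agreed = prefix
        # Only agents that backed the winning prefix vote at the next level.
        candidates = [code for code in candidates if code[:n] == prefix]
    return agreed, support, None


def route(proposals: List[str], min_agreement: float = 0.6) -> str:
    """Send a case to manual review when any level misses consensus."""
    _, _, failed_level = hierarchical_vote(proposals, min_agreement)
    return "auto_accept" if failed_level is None else "manual_review"
```

With the proposals `["8482100010", "8482100090", "8482100010", "8483200000"]`, three of four agents agree down to the 8-digit prefix `84821000`, but only two of four agree on the final two digits. The vote therefore stops at the suffix level and `route` returns `"manual_review"`, so a reviewer only has to settle the last two digits rather than redo the whole classification.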

Section 04

Experimental Validation: Performance on 3300 Real-World Data Entries

The research team evaluated the framework on 3,300 real logistics data entries annotated by domain experts. Results show that LLM performance drops significantly as the classification level becomes more granular: accuracy is acceptable at the coarse chapter level but low for the fine-grained suffix assignment. Fully automated end-to-end classification therefore remains challenging, and relying solely on model predictions carries risk.
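One way to measure this kind of per-level behaviour is prefix-match accuracy, sketched below in Python. The metric definition, the level layout, and the function name `accuracy_by_level` are assumptions for illustration; the paper's exact evaluation protocol is not described here.

```python
from typing import Dict, List, Tuple

# Assumed prefix length for each level of a 10-digit code.
LEVEL_PREFIX_LENGTHS: Dict[str, int] = {
    "chapter": 2, "heading": 4, "subheading": 6,
    "tariff_item": 8, "suffix": 10,
}


def accuracy_by_level(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of (predicted, expert) pairs that match at each level.

    A prediction counts as correct at a level only if its whole prefix
    up to that level equals the expert label's prefix.
    """
    if not pairs:
        raise ValueError("at least one (predicted, expert) pair is required")
    return {
        level: sum(pred[:n] == gold[:n] for pred, gold in pairs) / len(pairs)
        for level, n in LEVEL_PREFIX_LENGTHS.items()
    }
```

On four invented pairs, all with the expert label `"8482100010"` and predictions `"8482100010"`, `"8482100090"`, `"8482200000"`, and `"8501100000"`, this gives 0.75 at the chapter and heading levels, 0.5 at the subheading and tariff-item levels, and 0.25 at the suffix level. These numbers are toy values, not the paper's results. Under this definition accuracy can never rise with depth, because a match at a deeper level implies a match at every shallower one.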

Section 05

Key Insight: Why Does HTS Classification Require Human-Machine Collaboration?

The experiments support the core hypothesis: in high-risk, high-precision tasks, evidence-based and uncertainty-aware human-machine collaboration outperforms single-step prediction. There are three reasons. Classification errors can have legal and economic consequences, such as tariff underpayment and compliance violations. Rules are updated frequently, so the underlying knowledge must be maintained. Boundary cases need expert judgment. The framework's confidence mechanism balances efficiency and reliability by sending low-confidence cases to human reviewers.

Section 06

Application Value: Practical Significance for Smart Ports and Maritime Logistics

The framework applies to smart port and maritime logistics scenarios, where it can process large volumes of cargo classification declarations while balancing efficiency and accuracy. Its interpretable design meets industry transparency requirements: each decision comes with explicit supporting evidence, which facilitates audits and dispute resolution, helps accelerate customs clearance, and builds trust in trade.

Section 07

Technical Implications and Future Directions

This research offers a reference for LLM applications in vertical domains: for tasks that require professional knowledge and precise reasoning, an LLM should serve as the core of each agent and be combined with knowledge retrieval, reasoning validation, and human collaboration. Future directions include improving fine-grained classification accuracy (e.g., through domain fine-tuning or enriched data) and extending the approach to other jurisdictions and to similar precision-critical classification fields.