# BeDiscovER: A Discourse Understanding Benchmark for the Era of Reasoning Language Models

> A comprehensive discourse understanding evaluation benchmark accepted at EACL 2026, covering five tasks: conversational discourse parsing, discourse marker understanding, discourse relation recognition, sentence ordering, and temporal reasoning. It is designed specifically to evaluate the discourse understanding capabilities of reasoning language models.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-17T05:15:24.000Z
- Last activity: 2026-04-17T05:21:42.991Z
- Popularity: 150.9
- Keywords: discourse understanding, benchmark, EACL 2026, reasoning language models, conversational discourse parsing, discourse relation recognition, temporal reasoning, sentence ordering
- Page link: https://www.zingnex.cn/en/forum/thread/bediscover
- Canonical: https://www.zingnex.cn/forum/thread/bediscover
- Markdown source: floors_fallback

---

## BeDiscovER: Introduction to the Discourse Understanding Benchmark for the Era of Reasoning Language Models

BeDiscovER is a comprehensive discourse understanding benchmark, accepted at EACL 2026, designed specifically to assess the discourse understanding capabilities of reasoning language models. It covers five core discourse tasks: conversational discourse parsing, discourse marker understanding, discourse relation recognition, sentence ordering, and temporal reasoning. Its aim is to evaluate models' discourse-level capabilities systematically and to advance discourse understanding research in NLP.

## Discourse Understanding: Challenges and Needs in the NLP Field

In recent years, natural language processing has made great progress on word-level and sentence-level tasks, but discourse-level understanding remains an open challenge. Discourse understanding involves analyzing relationships between text units, identifying logical structure, and integrating information across sentences, all of which are essential to understanding language beyond the single sentence. With the rise of reasoning language models, systematically evaluating their discourse capabilities has become a pressing problem for the research community.

## Design of BeDiscovER's Five Core Tasks

BeDiscovER covers five core discourse tasks:
1. **Conversational Discourse Parsing**: Identify discourse structures in conversations (unit segmentation, relation recognition), integrating authoritative datasets such as STAC and Molweni;
2. **Discourse Marker Understanding**: Test understanding of the semantic functions of markers like "however" and "therefore", based on the Just and Otherwise datasets;
3. **Discourse Relation Recognition**: Determine logical relationships (causal, contrastive, etc.) between discourse units, integrating data from the DISRPT 2025 shared task;
4. **Sentence Ordering**: Restore the correct order of shuffled sentences, reflecting grasp of coherence, with data from multiple domains such as academic abstracts and stories;
5. **Temporal Reasoning**: Understand temporal relationships between events (sequence, simultaneity, etc.), based on time-annotated datasets like TimeBank-Dense.
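
To make the sentence ordering task concrete, the sketch below scores a predicted ordering against a gold ordering. It is illustrative only: the source does not say which metrics BeDiscovER reports, so exact match and Kendall's tau (both commonly used for sentence ordering) are assumptions here, and the function names `kendall_tau` and `exact_match` are hypothetical.

```python
from itertools import combinations


def kendall_tau(predicted, gold):
    """Kendall's tau between two orderings of the same unique sentence ids.

    Returns 1.0 for identical orderings and -1.0 for a fully reversed one.
    Assumed metric for illustration; not confirmed as BeDiscovER's own.
    """
    if sorted(predicted) != sorted(gold):
        raise ValueError("orderings must contain the same sentence ids")
    n = len(gold)
    if n < 2:
        return 1.0
    # Position of each sentence id in the predicted ordering.
    position = {sid: i for i, sid in enumerate(predicted)}
    # combinations(gold, 2) yields pairs (a, b) with a before b in gold;
    # the pair is discordant if the prediction places a after b.
    discordant = sum(
        1 for a, b in combinations(gold, 2) if position[a] > position[b]
    )
    total_pairs = n * (n - 1) // 2
    return 1.0 - 2.0 * discordant / total_pairs


def exact_match(predicted, gold):
    """True only if the predicted ordering equals the gold ordering."""
    return list(predicted) == list(gold)
```

Exact match rewards only a perfect reconstruction, while Kendall's tau gives partial credit for orderings that get most pairwise precedences right, which is why the two are often reported together.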

## Dataset Organization and Usage of BeDiscovER

BeDiscovER organizes its data by task: each task has its own directory and documentation. The project provides a unified data loading script that supports selecting datasets and configuring sampling ratios. Data formats are adapted to each task: conversational discourse parsing uses JSON, sentence ordering uses JSONL, and discourse relation recognition supports automatic expansion of the DISRPT test set. This layout makes cross-task comparison experiments easier for researchers.
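
As a rough illustration of loading JSON or JSONL task files and applying a sampling ratio, here is a minimal sketch. It is not the project's loading script: the function names `load_records` and `subsample`, the file-extension convention, and the seeding behavior are all assumptions made for this example.

```python
import json
import random
from pathlib import Path


def load_records(path):
    """Load records from a .jsonl file (one JSON object per line) or a .json file.

    Hypothetical helper; the real BeDiscovER loader may behave differently.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    data = json.loads(text)
    # Normalize a single top-level object to a one-element list.
    return data if isinstance(data, list) else [data]


def subsample(records, ratio, seed=0):
    """Return a reproducible random subset containing about `ratio` of the records."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    if not records:
        return []
    k = max(1, round(len(records) * ratio))
    # A dedicated Random instance keeps sampling independent of global state.
    return random.Random(seed).sample(records, k)
```

Fixing the seed keeps the sampled subset stable across runs, which matters when comparing several models on the same subsample.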

## Why Does BeDiscovER Focus on Reasoning Language Models?

The name BeDiscovER points to its setting: the era of reasoning language models. Traditional models focus on surface pattern matching, whereas reasoning models exhibit stronger logical reasoning through chain-of-thought. Discourse understanding, however, also requires modeling long-distance dependencies, identifying implicit relations, and grasping global structure. BeDiscovER is designed to test how reasoning models perform on these higher-level discourse capabilities.

## Academic Value and Application Prospects of BeDiscovER

As a paper accepted at EACL 2026, BeDiscovER offers clear academic value: it provides a standardized evaluation platform, its multi-task design reveals connections between different dimensions of discourse understanding, and it helps researchers analyze models' strengths and weaknesses. For industry, it is relevant to applications such as dialogue systems, document understanding, and knowledge extraction, where developers can use evaluation results to select suitable models and training strategies.

## Summary and Outlook of BeDiscovER

BeDiscovER is an important step toward more comprehensive, multi-dimensional evaluation of discourse understanding. It is a reminder that understanding language requires grasping the macro structure and logical relations of a text, not just lexical and syntactic knowledge. As reasoning language models continue to evolve, BeDiscovER is well positioned to help drive progress in discourse understanding.