# OpenSeeker-v2: A Cutting-Edge Search Agent Trained Only with SFT

> OpenSeeker-v2 achieves SOTA on multiple search benchmarks using only 10.6k samples and SFT training through a high-quality data synthesis strategy, challenging the complex CPT+SFT+RL training paradigm commonly used in the industry.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-05T17:55:25.000Z
- Last activity: 2026-05-07T01:37:55.682Z
- Heat score: 117.3
- Keywords: search agent, large language model, supervised fine-tuning, data synthesis, Agent, SFT, BrowseComp
- Page link: https://www.zingnex.cn/en/forum/thread/openseeker-v2-sft
- Canonical: https://www.zingnex.cn/forum/thread/openseeker-v2-sft
- Markdown source: floors_fallback

---

## OpenSeeker-v2: Introduction to the Cutting-Edge Search Agent Trained Only with SFT

OpenSeeker-v2 achieves state-of-the-art (SOTA) results on multiple search benchmarks using only 10.6k samples and plain supervised fine-tuning (SFT), enabled by a high-quality data synthesis strategy. This challenges the complex CPT+SFT+RL training pipeline commonly used in industry. This article covers the background, methods, and experimental results.

## Background and Challenges: Thoughts on Breaking the Complex Training Paradigm in Industry

Deep search capability is a core strength of cutting-edge large language model agents, but the field has long been dominated by industrial labs. Their training pipelines involve multiple stages, such as pre-training, continued pre-training (CPT), SFT, and reinforcement learning (RL), which are costly and raise barriers for academic teams. The research team asked a simple question: is such a complex pipeline actually necessary to build a cutting-edge search agent? They argue that, given high-quality trajectory data, plain SFT can also achieve excellent results.

## Core Methods: Three Key Improvements in Data Synthesis

The success of OpenSeeker-v2 comes from an optimized data synthesis strategy built on three elements:
1. **Expand knowledge graph scale**: broaden the coverage of the knowledge graph to provide a richer exploration space and improve generalization;
2. **Expand toolset scale**: add more callable tools (including specialized retrieval interfaces) to handle complex queries;
3. **Strict low-step filtering**: keep only trajectories that complete complex tasks in fewer steps, ensuring the efficiency of the training data.
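The low-step filter (item 3) can be sketched as a simple predicate over synthesized trajectories. This is a minimal illustration, not the paper's actual pipeline: the `Trajectory` fields, the verification flag, and the step threshold below are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    """One synthesized search trajectory (schema is illustrative)."""
    question: str
    steps: list = field(default_factory=list)  # tool-call / observation pairs
    answer: str = ""
    correct: bool = False  # answer assumed verified against the knowledge graph

def filter_low_step(trajectories, max_steps):
    """Strict low-step filtering (sketch): among verified trajectories,
    keep only those that reach the answer within a small step budget."""
    return [t for t in trajectories
            if t.correct and len(t.steps) <= max_steps]

# Toy pool: one efficient correct trajectory, one too long, one wrong.
pool = [
    Trajectory("q1", ["search", "read", "answer"], "a1", True),
    Trajectory("q2", ["search"] * 12, "a2", True),
    Trajectory("q3", ["search", "answer"], "a3", False),
]
kept = filter_low_step(pool, max_steps=6)
print([t.question for t in kept])  # → ['q1']
```

The design choice here is that efficiency is used as a proxy for trajectory quality: a model imitating short, successful runs learns to search purposefully rather than wander.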

## Experimental Results: Challenging Industry SOTA with Only SFT Training

Trained on only 10.6k samples, OpenSeeker-v2 outperformed Tongyi DeepResearch (which uses the full CPT+SFT+RL pipeline) on four mainstream benchmarks:

| Benchmark | OpenSeeker-v2 | Tongyi DeepResearch |
|---------|--------------|---------------------|
| BrowseComp | **46.0%** | 43.4% |
| BrowseComp-ZH | **58.1%** | 46.7% |
| Humanity's Last Exam | **34.6%** | 32.9% |
| xbench | **78.0%** | 75.0% |

This indicates that data quality can outweigh training-pipeline complexity.

## Technical Significance: Breaking Monopolies and Re-examining Training Processes

OpenSeeker-v2 is significant in three ways:
1. **Breaking the industrial monopoly**: it is the first SOTA search agent at this scale developed by an academic team;
2. **Re-examining training philosophy**: a simple training method (SFT) combined with high-quality data can surpass complex pipelines, offering a path for resource-constrained teams;
3. **Highlighting data engineering**: data quality is the key driver of model performance, and the three data strategies provide a reusable methodology.

## Limitations and Future Directions: Room for Further Exploration

**Limitations**: the work is based on a 30B model and the ReAct paradigm, without exploring larger models or other architectures; the optimal configuration of the data synthesis strategy must be tuned per task.
**Future directions**: explore combining SFT with lightweight RL; apply the data strategies to other agent tasks; reduce the dependence on large-scale knowledge graphs.
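For readers unfamiliar with the ReAct paradigm mentioned above: it interleaves model-emitted actions (tool calls) with environment observations until the model commits to a final answer. A minimal sketch follows; the prompt format, tool registry, and step budget are assumptions for illustration, not OpenSeeker-v2's actual interface.

```python
def react_loop(llm, tools, question, max_steps=8):
    """Minimal ReAct-style agent loop (illustrative sketch).

    llm(transcript) -> str is assumed to emit either
      'Action: <tool> <arg>'  or  'Final: <answer>'.
    tools maps tool names to callables taking the argument string.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        out = llm(transcript).strip()
        if out.startswith("Final:"):
            return out[len("Final:"):].strip()
        if out.startswith("Action:"):
            _, name, arg = out.split(maxsplit=2)
            obs = tools[name](arg) if name in tools else "unknown tool"
            # Feed the observation back so the next step can condition on it.
            transcript += f"{out}\nObservation: {obs}\n"
    return None  # step budget exhausted without a final answer

# Toy demo with a scripted "LLM" (purely illustrative).
script = iter(["Action: search capital of France", "Final: Paris"])
answer = react_loop(lambda _: next(script),
                    {"search": lambda q: "Paris is the capital of France."},
                    "What is the capital of France?")
print(answer)  # → Paris
```

Note how the low-step filtering described earlier maps directly onto this loop: a trajectory is simply the sequence of Action/Observation pairs accumulated in the transcript, and its step count is bounded by `max_steps`.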

## Conclusion: Victory of Simple Methods + High-Quality Data

OpenSeeker-v2 demonstrates that simple methods combined with high-quality data can outperform complex engineering stacks, offering the industry a more cost-effective technical path. With the model weights open-sourced, we look forward to further follow-up research.
