# Step-level Optimization: A New Method to Boost the Learning Efficiency of Computer-Use Agents

> This article introduces Step-level Optimization (SO), a framework that recasts agent training as a step-level optimization problem, enabling finer-grained credit assignment and more efficient learning. The method reaches competitive performance on the OSWorld benchmark while significantly reducing the number of training steps and the compute required.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-29T19:59:36.000Z
- Last activity: 2026-05-01T23:18:57.377Z
- Popularity: 0.0
- Keywords: computer-use agent, step-level optimization, direct preference optimization, GUI automation, reinforcement learning, credit assignment, OSWorld benchmark, AI efficiency
- Page link: https://www.zingnex.cn/en/forum/thread/llm-arxiv-2604-27151v1
- Canonical: https://www.zingnex.cn/forum/thread/llm-arxiv-2604-27151v1
- Markdown source: floors_fallback

---

## Introduction / Main Post

This article introduces Step-level Optimization (SO), a framework that recasts agent training as a step-level optimization problem, enabling finer-grained credit assignment and more efficient learning. The method reaches competitive performance on the OSWorld benchmark while significantly reducing the number of training steps and the compute required.
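
The post does not spell out the SO objective. Since the thread's keywords pair step-level optimization with direct preference optimization, one plausible reading is a DPO-style loss computed per agent step rather than per trajectory. The sketch below illustrates that idea only; the `StepPreference` schema, the function names, and the default `beta` are assumptions made for illustration, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class StepPreference:
    """One preference pair at a single agent step (hypothetical schema)."""
    policy_logp_chosen: float    # log pi_theta(preferred action | state)
    policy_logp_rejected: float  # log pi_theta(rejected action | state)
    ref_logp_chosen: float       # log pi_ref(preferred action | state)
    ref_logp_rejected: float     # log pi_ref(rejected action | state)


def step_dpo_loss(pair: StepPreference, beta: float = 0.1) -> float:
    """Standard DPO loss, applied to one step instead of a whole trajectory.

    Assumption: the post only says training is recast at the step level;
    using the DPO loss here is an illustrative choice.
    """
    chosen_margin = pair.policy_logp_chosen - pair.ref_logp_chosen
    rejected_margin = pair.policy_logp_rejected - pair.ref_logp_rejected
    logit = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable form.
    return max(-logit, 0.0) + math.log1p(math.exp(-abs(logit)))


def batch_step_dpo_loss(pairs: list[StepPreference], beta: float = 0.1) -> float:
    """Mean loss over step-level pairs; each step gets its own learning signal."""
    return sum(step_dpo_loss(p, beta) for p in pairs) / len(pairs)


if __name__ == "__main__":
    pairs = [
        StepPreference(-0.5, -3.0, -1.0, -2.0),  # policy already favors the chosen action
        StepPreference(-1.0, -2.0, -1.0, -2.0),  # policy identical to the reference
    ]
    print(batch_step_dpo_loss(pairs))
```

Because every step contributes its own preference pair, the loss can reward or penalize an individual action directly instead of spreading a single end-of-episode outcome across the whole trajectory, which is the sense in which credit assignment becomes finer-grained.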
