Problem Background
Qwen3 is a new-generation large language model from Alibaba's Tongyi Qianwen team that supports a chain-of-thought reasoning mode. In this mode, the model emits its reasoning wrapped in <think>...</think> tags, followed by the final answer.
To deliver reasoning content and the final answer separately, vLLM's streaming output must parse these tags correctly. On Windows, the official vLLM parser runs into character-encoding and line-break handling issues.
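To make the output format concrete, here is a minimal sketch of separating reasoning from the final answer once a complete response is available. The function name `split_reasoning` is hypothetical and is not part of vLLM's API; it only illustrates the <think>...</think> layout described above.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split a complete model output into (reasoning, answer).

    Illustrative helper only; not vLLM's parser interface.
    """
    open_tag, close_tag = "<think>", "</think>"
    before, sep, after = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", text.strip()
    # Drop the opening tag (if present) from the reasoning part.
    reasoning = before.replace(open_tag, "", 1).strip()
    return reasoning, after.strip()


reasoning, answer = split_reasoning("<think>2 + 2 is 4</think>\n\nThe answer is 4.")
print(reasoning)
print(answer)
```

The streaming case is harder than this, because a tag or a line ending can be cut in half at a chunk boundary.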
Solution
The project patches the Qwen3 reasoning parser to account for how Windows handles text:
- Encoding Compatibility: Correctly handles Windows' CRLF line breaks
- Buffer Handling: Optimizes the buffering strategy for streaming output
- Tag Parsing: Ensures <think> tags are correctly identified in Windows text mode
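The three points above can be sketched together as a small streaming splitter. This is not the project's actual patch and not vLLM's reasoning-parser interface; the class `ThinkStreamSplitter` and its methods are hypothetical names used only to illustrate one plausible approach: normalize CRLF to LF, and hold back any trailing text that might be an incomplete tag or a lone carriage return until the next chunk arrives.

```python
class ThinkStreamSplitter:
    """Illustrative streaming splitter for <think>...</think> output.

    Hypothetical sketch; assumes the stream starts outside a reasoning block.
    """

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self) -> None:
        self.buffer = ""          # text received but not yet emitted
        self.in_reasoning = False

    @staticmethod
    def _holdback(text: str, tag: str) -> int:
        """Number of trailing characters that must wait for the next chunk."""
        if text.endswith("\r"):
            return 1              # may be the first half of a CRLF pair
        for n in range(min(len(tag) - 1, len(text)), 0, -1):
            if text.endswith(tag[:n]):
                return n          # may be the start of a split tag
        return 0

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        """Consume one chunk and return (kind, text) events."""
        self.buffer += chunk
        events = []
        while True:
            # Normalize complete CRLF pairs to LF.
            text = self.buffer.replace("\r\n", "\n")
            tag = self.CLOSE if self.in_reasoning else self.OPEN
            kind = "reasoning" if self.in_reasoning else "answer"
            idx = text.find(tag)
            if idx != -1:
                if idx > 0:
                    events.append((kind, text[:idx]))
                self.buffer = text[idx + len(tag):]
                self.in_reasoning = not self.in_reasoning
                continue
            # No complete tag: emit what is safe, keep the rest buffered.
            emit_to = len(text) - self._holdback(text, tag)
            if emit_to > 0:
                events.append((kind, text[:emit_to]))
            self.buffer = text[emit_to:]
            return events

    def flush(self) -> list[tuple[str, str]]:
        """Emit whatever is still buffered at end of stream."""
        kind = "reasoning" if self.in_reasoning else "answer"
        rest, self.buffer = self.buffer, ""
        return [(kind, rest)] if rest else []


splitter = ThinkStreamSplitter()
events = []
# Chunks deliberately split both tags and one CRLF pair.
for piece in ["<thi", "nk>step 1\r", "\nstep 2</th", "ink>\r\nDone"]:
    events.extend(splitter.feed(piece))
events.extend(splitter.flush())
print(events)
```

Holding back only the ambiguous tail keeps latency low: everything that cannot be part of a tag or a CRLF pair is emitted as soon as it arrives.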
This allows Windows users to fully experience Qwen3's reasoning capabilities, including observing the model's thinking process.