Section 01
[Introduction] Hybrid Reinforcement Learning and LLM-Based Agent Decision-Making Framework: Dual-Track Exploration in Wumpus World
This article introduces a framework for solving Wumpus World that combines a pure reinforcement learning approach with a language-model-enhanced approach. It covers the implementation principles of the two technical routes and weighs them against each other: a recurrent neural network policy trained with PPO, and an LLM trained with SFT+GRPO for reasoning and decision-making. Because both tracks are developed in parallel on the same environment, the project offers a side-by-side sample for understanding the advantages and disadvantages of each AI paradigm.