Section 01
Noema Project Introduction: Exploring Latent Space Reasoning on Consumer GPUs
The Noema project explores the reasoning capabilities of small language models (≤300 million parameters) in continuous latent spaces, aiming to replace discrete Chain-of-Thought (CoT) tokens with continuous latent representations to improve sample efficiency, reasoning depth, and speed. Its core goal is to verify whether small models can reason effectively in a continuous latent space, with an emphasis on hardware accessibility: every experiment can be reproduced on a single RTX 3060 (8GB VRAM), helping to democratize AI research.
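The section does not specify how latent-space reasoning replaces discrete CoT tokens, but the contrast can be sketched with a toy model. The assumption here, common in latent-reasoning work, is that instead of projecting each hidden state to the vocabulary, sampling a token, and re-embedding it, the continuous hidden state is fed straight back into the next step. All names (`step`, `discrete_cot`, `latent_reasoning`) and the toy weights are illustrative, not from the project itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 16, 32
W = rng.standard_normal((d_model, d_model)) * 0.1   # toy one-step "model"
E = rng.standard_normal((vocab, d_model))           # token embedding table

def step(h):
    # one toy model step: hidden state -> next hidden state
    return np.tanh(h @ W)

def discrete_cot(h, n_steps):
    # discrete CoT: project to the vocabulary, pick a token, re-embed.
    # The argmax is an information bottleneck: continuous detail is lost.
    for _ in range(n_steps):
        h = step(h)
        token = int(np.argmax(E @ h))
        h = E[token]
    return h

def latent_reasoning(h, n_steps):
    # latent-space reasoning: feed the continuous state straight back,
    # so no information is discarded between reasoning steps
    for _ in range(n_steps):
        h = step(h)
    return h

h0 = rng.standard_normal(d_model)
print(latent_reasoning(h0, 4).shape)  # (16,)
```

The sketch also hints at the claimed speed benefit: the latent loop skips the vocabulary projection and sampling at every intermediate step, which on a small model is a nontrivial fraction of per-token cost.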