Section 01
Core Introduction to VLA Data Forge
VLA Data Forge is a research-grade Python framework for curating and preprocessing reasoning-aware embodied datasets used to train Vision-Language-Action (VLA) models. It supports the Embodied-CoT and Bridge v2 datasets and generates VLM reasoning trajectories through multiple backends (Gemini, GPT-4o, Qwen-VL). In doing so, it bridges the gap between raw robot demonstration data and VLA models that require explicit reasoning abilities.
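To make the multi-backend idea concrete, here is a minimal sketch of how a pluggable reasoning backend could attach reasoning text to raw demonstration steps. Every name in it (`Step`, `AnnotatedStep`, `register_backend`, `annotate`, the `"stub"` backend) is hypothetical and chosen for illustration; none of it is taken from the VLA Data Forge API. The stub backend returns a placeholder string where a real backend would call Gemini, GPT-4o, or Qwen-VL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical data types for illustration; not the VLA Data Forge API.
@dataclass
class Step:
    instruction: str     # language instruction for the episode
    frame_id: int        # index of the image frame within the episode
    action: List[float]  # raw robot action recorded in the demonstration


@dataclass
class AnnotatedStep:
    step: Step
    reasoning: str       # reasoning text produced for this step
    backend: str         # name of the backend that produced it


# A backend is any callable that maps a step to a reasoning string.
Backend = Callable[[Step], str]

BACKENDS: Dict[str, Backend] = {}


def register_backend(name: str) -> Callable[[Backend], Backend]:
    """Register a backend under a name so it can be selected at run time."""
    def decorator(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return decorator


@register_backend("stub")
def stub_backend(step: Step) -> str:
    # Stand-in for a real VLM call; returns a deterministic placeholder.
    return (
        f"To '{step.instruction}', inspect frame {step.frame_id} "
        "and plan the next move."
    )


def annotate(episode: List[Step], backend: str) -> List[AnnotatedStep]:
    """Attach reasoning from the chosen backend to every step of an episode."""
    if backend not in BACKENDS:
        raise KeyError(f"unknown backend: {backend!r}")
    generate = BACKENDS[backend]
    return [AnnotatedStep(s, generate(s), backend) for s in episode]


episode = [
    Step("pick up the spoon", 0, [0.0, 0.0, 0.1]),
    Step("pick up the spoon", 1, [0.0, 0.1, 0.0]),
]
annotated = annotate(episode, backend="stub")
```

Keeping backends behind a single callable signature means the annotation loop does not change when a different model is selected; only the registered function does.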