Section 01
Introduction: VDCores—A Resource-Decoupled Programming Model for Asynchronous GPUs
This article introduces VDCores, a decoupled programming model designed for the asynchronous hardware features of modern GPUs. VDCores represents a workload as a set of micro-operations connected by dependencies, and it automatically schedules them so that memory operations overlap with computation. This addresses the mismatch between the traditional monolithic kernel programming model and the heterogeneous hardware units of modern GPUs, improving LLM inference throughput while reducing the complexity of kernel programming.
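To make the idea of dependency-connected micro-operations concrete, the following is a minimal toy sketch in Python, not VDCores code. The `MicroOp` type, the `schedule` function, the two unit names, and the abstract costs are all hypothetical, chosen only for illustration. It models one memory unit and one compute unit, starts each micro-operation as soon as its dependencies have finished and its unit is free, and compares the resulting finish time with fully serial execution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    """Hypothetical micro-operation; not a VDCores type."""
    name: str
    unit: str          # which hardware unit runs it: "mem" or "compute"
    cost: int          # abstract time units (illustrative values only)
    deps: tuple = ()   # names of micro-ops that must finish first


def schedule(ops):
    """Greedy in-order scheduling over two independent units.

    Assumes `ops` is listed in a valid topological order. Each unit runs
    one micro-op at a time; different units may run concurrently.
    Returns a dict mapping each micro-op name to its finish time.
    """
    finish = {}
    unit_free = {"mem": 0, "compute": 0}
    for op in ops:
        ready = max((finish[d] for d in op.deps), default=0)
        start = max(ready, unit_free[op.unit])
        finish[op.name] = start + op.cost
        unit_free[op.unit] = finish[op.name]
    return finish


# Two load -> compute chains followed by a store that needs both results.
ops = [
    MicroOp("load0", "mem", 2),
    MicroOp("load1", "mem", 2),
    MicroOp("mma0", "compute", 3, ("load0",)),
    MicroOp("mma1", "compute", 3, ("load1",)),
    MicroOp("store", "mem", 1, ("mma0", "mma1")),
]

finish = schedule(ops)
makespan = max(finish.values())
serial = sum(op.cost for op in ops)
print(f"overlapped: {makespan} units, serial: {serial} units")
```

In this toy schedule the second load runs on the memory unit while the first compute step is still executing, which is the kind of overlap the model is meant to find automatically. A real scheduler would also have to account for on-chip buffer capacity and hardware queue limits, which this sketch ignores.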