Section 01
GPUSCALE Project Introduction: A Benchmarking Platform for LLM Inference in Large-Scale GPU Selection and Rental
GPUSCALE is a GPU benchmarking project for large-scale AI workloads, designed to inform GPU procurement and rental decisions with measured data. The project supports both local GPUs and cloud GPU services (Vast.ai, RunPod), and uses a standardized, containerized test process to collect key LLM inference metrics: tokens per second, first-token latency, VRAM usage, and power consumption. It helps AI service providers and researchers make informed hardware choices and serves as a reference benchmark for the design of new accelerators.
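To illustrate two of the metrics mentioned above, the sketch below times a streaming token generator to derive first-token latency and tokens per second. This is a minimal, hypothetical example: `fake_stream` stands in for a real inference client's streaming API, and the function names are illustrative, not part of GPUSCALE.

```python
import math
import time
from typing import Dict, Iterable


def measure_stream(token_stream: Iterable[str]) -> Dict[str, float]:
    """Time a token stream and report first-token latency and throughput.

    `token_stream` is any iterable yielding generated tokens; in a real
    benchmark it would wrap an LLM inference endpoint's streaming output.
    """
    start = time.perf_counter()
    first_token_latency = math.nan
    count = 0
    for _ in token_stream:
        now = time.perf_counter()
        if count == 0:
            # Time from request start to the first emitted token.
            first_token_latency = now - start
        count += 1
    total = time.perf_counter() - start
    return {
        "first_token_latency_s": first_token_latency,
        "tokens_per_second": count / total if total > 0 else 0.0,
        "total_tokens": float(count),
    }


def fake_stream(n: int = 50, delay: float = 0.001):
    # Simulated generator standing in for a real streaming inference API.
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"


metrics = measure_stream(fake_stream())
```

In a full benchmark run, metrics like these would be gathered per GPU and per model configuration inside the test container, alongside VRAM and power readings sampled from tools such as `nvidia-smi`.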