Section 01
【Main Floor】Aura: Introduction to the Intelligent Cloud Resource Auto-scaling System for AI Workloads
Aura is a cloud infrastructure automation project built on AWS EKS, focused on providing intelligent elastic scaling capabilities for large language model (LLM) deployments. Its core value lies in significantly reducing GPU resource idle costs through predictive scheduling, addressing the shortcomings of traditional cloud resource management models in handling AI workloads (such as delayed scaling or waste from over-reservation).