Section 01
Introduction: LoKA Framework—A Systematic Solution for Enabling FP8 Low-Precision Computing in Large-Scale Recommendation Models
LoKA uses system-model co-design to address two obstacles that Large-Scale Recommendation Models (LRMs) face in FP8 low-precision computing: numerical sensitivity and communication bottlenecks. In doing so, it balances training efficiency against model quality. The framework rests on three core principles: precise profiling based on real distributions, co-design of model components and hardware, and intelligent orchestration across kernel libraries. Together, these provide a systematic methodology for implementing FP8 in recommendation systems.