Section 01
GenAI-Bench: A Guide to the Fine-Grained Performance Evaluation Tool for LLM Inference Services
GenAI-Bench is a fine-grained performance evaluation tool for LLM inference services. It supports token-level performance analysis, helping developers evaluate and optimize model serving accurately. Traditional coarse-grained measurements, such as overall latency and throughput, make it hard to locate system bottlenecks. GenAI-Bench addresses this by focusing on key metrics such as Time to First Token (TTFT) and Time Per Output Token (TPOT), which give deeper insight into where an LLM inference service can be optimized.
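To make the two token-level metrics concrete, here is a minimal sketch of how TTFT and TPOT can be derived from per-token arrival timestamps of a single streamed response. It is not GenAI-Bench's API: the `TokenTimings` class and the `ttft` and `tpot` functions are hypothetical names introduced for illustration. The sketch also assumes the common convention that TPOT is averaged over the tokens after the first one; a given tool may define it slightly differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TokenTimings:
    """Hypothetical record of one streamed request (all times in seconds)."""
    request_sent: float          # when the request left the client
    token_arrivals: List[float]  # arrival time of each output token, in order


def ttft(t: TokenTimings) -> float:
    """Time to First Token: delay between sending the request and the first token."""
    if not t.token_arrivals:
        raise ValueError("no output tokens were received")
    return t.token_arrivals[0] - t.request_sent


def tpot(t: TokenTimings) -> float:
    """Time Per Output Token: mean gap between tokens after the first.

    Assumed convention: the first token is excluded, because its latency
    is already captured by TTFT.
    """
    n = len(t.token_arrivals)
    if n < 2:
        raise ValueError("TPOT needs at least two output tokens")
    return (t.token_arrivals[-1] - t.token_arrivals[0]) / (n - 1)


if __name__ == "__main__":
    # Illustrative timestamps only, not measured data.
    sample = TokenTimings(
        request_sent=0.0,
        token_arrivals=[0.5, 0.75, 1.0, 1.25, 1.5],
    )
    print(f"TTFT: {ttft(sample):.2f} s")  # 0.50 s
    print(f"TPOT: {tpot(sample):.2f} s")  # 0.25 s
```

Separating the two values shows why an overall latency figure hides bottlenecks: in the sample above the total response time is 1.5 s, yet a third of it is spent waiting for the first token, while the remaining tokens stream at a steady 0.25 s each.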