Section 01
llm-eval: Core Guide to the Self-Hosted Evaluation Framework for Local Large Language Models
llm-eval is a self-hosted evaluation framework designed specifically for local large language models. Built on llama.cpp's OpenAI-compatible endpoint, it supports multi-dimensional capability testing (reasoning, programming, code quality, and more), offers two difficulty levels (basic and difficult), and includes a comparison feature for running tests with the thinking mode enabled or disabled, helping developers evaluate model capabilities quickly and reliably in a local environment.
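To give a concrete sense of how an evaluation request reaches the model, the sketch below sends a single prompt to a local llama.cpp server through its OpenAI-compatible chat completions endpoint. This is a minimal illustration, not llm-eval's actual client code: the base URL, port, model name, and the chat_template_kwargs toggle used to switch the thinking mode are assumptions for demonstration only.

```python
# Minimal sketch: querying a local llama.cpp server via its OpenAI-compatible
# endpoint, the same interface llm-eval builds on. All concrete values below
# (port, model name, thinking-mode toggle) are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # llama.cpp server's OpenAI-compatible endpoint (assumed port)
    api_key="not-needed",                 # the local server does not require a real API key
)

response = client.chat.completions.create(
    model="local-model",                  # placeholder; llama.cpp serves whatever model it was started with
    messages=[
        {"role": "user", "content": "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"},
    ],
    temperature=0.0,                      # deterministic output makes scoring more repeatable
    extra_body={
        # Hypothetical toggle: some chat templates expose a thinking-mode switch;
        # how llm-eval actually enables or disables it is an assumption here.
        "chat_template_kwargs": {"enable_thinking": False},
    },
)

print(response.choices[0].message.content)
```

Running the same prompt twice, once with the thinking mode enabled and once with it disabled, is the basic pattern behind the framework's comparison feature.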