Section 01
Model Evaluator: Introducing a Local LLM Reasoning Evaluation Framework for the Security Domain
Model Evaluator is a local LLM evaluation tool built for security scenarios. It tests local models served by Ollama across seven dimensions of reasoning ability, scores their responses automatically in LLM-as-Judge mode, and generates visual reports. Its goal is to provide data that supports model selection for security agents and penetration testing tools, meeting the need for systematic evaluation of LLM reasoning ability in security-critical scenarios.
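To make the LLM-as-Judge idea concrete, the following is a minimal sketch of the general pattern, not Model Evaluator's actual code. It assumes an Ollama server on its default local endpoint and uses Ollama's `/api/generate` route. The names `judge_answer`, `parse_score`, `JUDGE_TEMPLATE`, and the 0 to 10 scale are hypothetical and chosen only for illustration. The tool's real prompts, rubric, and seven dimensions are not reproduced here.

```python
import json
import re
import urllib.request
from typing import Callable, Optional

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical judge prompt; the real tool's rubric is not known here.
JUDGE_TEMPLATE = (
    "You are grading an answer to a security reasoning question.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Reply with a single integer score from 0 to 10."
)


def ollama_generate(model: str, prompt: str) -> str:
    """Send one non-streaming prompt to a local Ollama server and return its text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def parse_score(judge_reply: str, lo: int = 0, hi: int = 10) -> Optional[int]:
    """Take the first integer in the judge's reply and clamp it to [lo, hi]."""
    match = re.search(r"\d+", judge_reply)
    if match is None:
        return None  # the judge did not produce a usable number
    return max(lo, min(hi, int(match.group())))


def judge_answer(
    question: str,
    candidate_model: str,
    judge_model: str,
    generate: Callable[[str, str], str] = ollama_generate,
) -> Optional[int]:
    """Ask the candidate model a question, then have the judge model score the answer."""
    answer = generate(candidate_model, question)
    judge_prompt = JUDGE_TEMPLATE.format(question=question, answer=answer)
    return parse_score(generate(judge_model, judge_prompt))
```

The `generate` parameter is injectable so the scoring logic can be exercised with a stub before pointing it at a running Ollama instance. Any model names passed in are placeholders for whatever models are installed locally.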