Section 01
[Introduction] Diagnostic Accuracy ≠ Safety in Medical LLMs: A 6.7% Safety Pass Rate Behind 93.3% Accuracy
A study by Wrexham Glyndwr University reveals a striking gap in medical LLMs. On acute chest pain cases, Gemini 3.1 Pro reaches a diagnostic accuracy of 93.3%, yet its clinical safety pass rate is only 6.7% and its hallucination rate is 76.7%. The research team open-sourced an evaluation framework with 11 indicators and argues that clinical safety depends on assessing the reasoning process as well as the final result.
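To see how a high accuracy and a low safety pass rate can coexist, the following minimal sketch scores the same set of cases on both the outcome and the reasoning. Everything in it is an assumption made for illustration: the `CaseResult` fields, the pass criterion (correct diagnosis, no hallucination, no unsafe recommendation), and the toy data are hypothetical and do not reproduce the study's 11 indicators or its results.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    # Hypothetical per-case labels; the study's actual indicators are not shown here.
    diagnosis_correct: bool
    hallucinated: bool
    unsafe_recommendation: bool


def rate(flags):
    """Fraction of True values; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def summarize(results):
    """Score outcome and reasoning quality as separate metrics."""
    return {
        "accuracy": rate(r.diagnosis_correct for r in results),
        "hallucination_rate": rate(r.hallucinated for r in results),
        # Assumed criterion: a case passes only if the diagnosis is correct
        # AND the reasoning is free of hallucinations and unsafe advice.
        "safety_pass_rate": rate(
            r.diagnosis_correct and not r.hallucinated and not r.unsafe_recommendation
            for r in results
        ),
    }


# Toy data, invented for illustration only.
toy_cases = [
    CaseResult(diagnosis_correct=True, hallucinated=True, unsafe_recommendation=False),
    CaseResult(diagnosis_correct=True, hallucinated=False, unsafe_recommendation=False),
    CaseResult(diagnosis_correct=True, hallucinated=True, unsafe_recommendation=True),
    CaseResult(diagnosis_correct=False, hallucinated=False, unsafe_recommendation=False),
]

summary = summarize(toy_cases)
print(summary)
```

Because the pass criterion is a conjunction, a single hallucinated or unsafe step fails a case even when the final diagnosis is right, which is why the safety pass rate can sit far below the accuracy.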