Section 01
KSAA-2026 Automatic Diacritization of Arabic Speech: A Guide to the Winning Solution
This article introduces the first-place system for the second task of the KSAA-2026 shared task. The system performs automatic diacritization of Arabic speech by fine-tuning the multimodal CATT-Whisper model with regularization. Under the task's constraints of only 2,327 training samples and no external data, it achieves a word error rate (WER) of 23.26%, the best result in the task.