COMPARATIVE ANALYSIS OF THE RESULTS OF FINE-TUNING AUTOMATIC SPEECH RECOGNITION MODELS FOR THE UZBEK LANGUAGE
Keywords:
automatic speech recognition, Uzbek language, fine-tuning of models, low-resource languages, Whisper, Wav2Vec 2.0, Turkic languages, WER, transfer learning.
Abstract
The article presents a comparative analysis of ten automatic speech recognition (ASR) models applied to Uzbek, a low-resource language. The architectures examined include Whisper, Wav2Vec 2.0 XLSR-53, XLS-R, HuBERT, Conformer, MMS, DeepSpeech2, NeMo Conformer, and w2v-BERT 2.0. Pretrained models were fine-tuned in a series of experiments on a 120-hour Uzbek speech corpus, and recognition quality was measured by word error rate (WER). After fine-tuning, w2v-BERT 2.0 reaches a WER of 13.8% and Whisper large-v3 reaches 12.4%, the lower of the two scores. The analysis also identifies difficulties specific to Uzbek speech: agglutinative morphology, variability of phonetic realization, and the limited availability of annotated data.
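To make the WER metric concrete, the following minimal Python sketch computes it as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the sample sentences are assumptions made for illustration; this is not the evaluation code used in the study, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (S + D + I) / N using word-level Levenshtein distance.

    Assumption: words are separated by whitespace and compared exactly;
    no case folding or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong suffix turns the whole word into an error.
reference = "men kitob o'qiyapman"
hypothesis = "men kitob o'qiyman"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Because WER counts whole words, a single incorrect suffix in an agglutinative language such as Uzbek is scored as a full word error, which is one reason morphology affects the metric.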