Research project:
A Hybrid Deep Learning Technique for Arabic Voice Recognition

Contributors
Funders
Identifier
24471
Researcher
Fathiyah Antaat Habeeb
Supervisors
Publications
Organizational units
Description
Abstract: Speech recognition is a valuable tool in various industries; however, achieving high accuracy remains a major challenge despite the rapid growth of the speech recognition market. Arabic in particular lags behind other languages in this field and requires further attention and development. To address this issue, this research uses deep neural networks to develop an automatic Arabic speech recognition model based on isolated-word recognition. A hybrid technique, originally developed by Radfar et al. [1] for English speech recognition, is adopted for Arabic speech recognition. This technique combines the strengths of recurrent neural networks (RNNs), which are critical in speech recognition tasks, with convolutional neural networks (CNNs) to form a hybrid model known as ConvRNN. The model is trained on a publicly available Arabic speech dataset of isolated words, along with a custom dataset prepared specifically for this research. Its performance is evaluated using standard metrics, including word error rate (WER), accuracy, precision, recall, and F-measure (also referred to as F-score). In addition, K-fold cross-validation is employed to assess robustness and generalizability. The results show that the ConvRNN model achieved an accuracy of 95.7% on unseen data, with a WER of 4.3%. These findings highlight the model's effectiveness in recognizing Arabic speech with few errors. Comparisons with similar models from previous studies further validated the superiority of the ConvRNN model. Overall, the ConvRNN model shows great promise for applications requiring accurate and efficient Arabic speech recognition. This research contributes to narrowing the gap in Arabic speech recognition technology, offering a robust solution for accurately converting Arabic speech into text.
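The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes each utterance contains exactly one word, so a recognition error can only be a substitution and WER reduces to 1 minus accuracy. That assumption is consistent with the reported 95.7% accuracy and 4.3% WER, but the record does not state how WER was actually computed. The function name `isolated_word_metrics`, the macro averaging of precision, recall, and F-measure, and the transliterated word labels are all hypothetical choices made for illustration, not details taken from the research.

```python
from collections import defaultdict


def isolated_word_metrics(references, predictions):
    """Accuracy, WER, and macro-averaged precision/recall/F1.

    Assumption (not from the source): every utterance is a single word,
    so the only possible error is a substitution and WER = 1 - accuracy.
    """
    if not references or len(references) != len(predictions):
        raise ValueError("references and predictions must be non-empty and equal in length")

    total = len(references)
    correct = sum(ref == hyp for ref, hyp in zip(references, predictions))
    accuracy = correct / total
    wer = (total - correct) / total  # substitutions / number of reference words

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for ref, hyp in zip(references, predictions):
        if ref == hyp:
            tp[ref] += 1
        else:
            fp[hyp] += 1  # the predicted word was wrongly emitted
            fn[ref] += 1  # the reference word was missed

    labels = sorted(set(references) | set(predictions))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "wer": wer,
        "precision": sum(precisions) / n,  # macro average over word classes
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy data: transliterated Arabic words, one per utterance.
refs = ["naam", "laa", "naam", "sifr"]
hyps = ["naam", "laa", "laa", "sifr"]
metrics = isolated_word_metrics(refs, hyps)
print(metrics)
```

Under K-fold cross-validation, a function like this would typically be applied to the held-out fold in each round and the per-fold results averaged. The number of folds used in the research is not given in this record.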
Keywords
This technique combines the strengths of recurrent neural networks (RNNs