Title
Automatic recognition of Arabic spoken language
Author
Awadalla, Mohamed Mohamed Ahmed.
Preparation Committee
Researcher / Mohamed Mohamed Ahmed Awadalla
Supervisor / Fatma El-Zahra Mohamed Rashad Abu Shadi
Supervisor / Hassan Hussein Soliman
Supervisor / Abd-ElFattah Sayed Ahmed Mohamed
Subject
Speech recognition. Cepstrum.
Publication Date
2006.
Number of Pages
187 p.
Language
English
Degree
Doctorate
Specialization
Electrical and Electronic Engineering
Date of Award
1/1/2006
Place of Award
Mansoura University - Faculty of Engineering - Department of Electronics and Communications Engineering

Abstract

Automatic Speech Recognition (ASR) has proven to be a useful tool in many applications in our daily life, such as interactive man-machine interfaces, aids for the disabled, automatic assistance by telephone, and Multimedia Information Retrieval (MMIR) systems. Current state-of-the-art ASR systems have achieved considerable success for English and other languages such as French, Dutch, and Italian; for Arabic, however, little research work exists in this field. The present work aims to develop an automatic system for recognizing spoken Arabic that can be used in a multimedia information retrieval environment, as a first step toward converting spoken Arabic words to text.

As a first step, an Arabic speech database was developed, following a number of quality-control rules and precautions. The database was obtained by recording speech material from Arabic broadcast and TV news spoken by different announcers from different Arab countries. The recorded speech material was segmented manually into frames of 24 ms length, each containing one Arabic phoneme. The application of appropriate signal analysis and pattern recognition techniques enabled important features of the recordings to be extracted and clarified. Two different approaches were used for feature extraction: analyzing each speech segment as a single frame, and analyzing it as three sub-frames with 50% overlap, extracting features from each sub-frame. Three feature extraction techniques were implemented: linear predictive coding coefficients, cepstrum coefficients calculated using the discrete Fourier transform, and coefficients derived from the wavelet transform.

For the recognizer, left-to-right Hidden Markov Models (HMMs) with 3, 5, and 7 states and neural network classifiers (feed-forward and recurrent) were utilized. In an attempt to improve the recognition accuracy, a data fusion approach was used, which greatly improved the accuracy. Moreover, a multi-stage feed-forward neural network was utilized as a recognizer. The results of the multi-stage feed-forward neural network are promising: it gives the highest recognition accuracy, reaching 83.3%.
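As a rough illustration of the sub-frame analysis described in the abstract, the sketch below splits one 24 ms phoneme frame into three sub-frames with 50% overlap. The 16 kHz sampling rate, and hence the 384-sample frame length, is an assumption made for the example only; the abstract does not state the recording rate.

```python
import numpy as np

def split_into_subframes(frame, n_sub=3, overlap=0.5):
    """Split one phoneme frame into n_sub sub-frames with the given overlap.

    With 3 sub-frames and 50% overlap, each sub-frame is half the frame
    length (e.g. 12 ms out of a 24 ms frame).
    """
    n = len(frame)
    # Sub-frame length L chosen so that the sub-frames exactly tile the frame:
    # n = L + (n_sub - 1) * L * (1 - overlap)
    sub_len = int(n / (1 + (n_sub - 1) * (1 - overlap)))
    hop = int(sub_len * (1 - overlap))
    return [frame[i * hop : i * hop + sub_len] for i in range(n_sub)]

# Assumed example: a 24 ms frame sampled at 16 kHz (384 samples).
fs = 16000
frame = np.random.randn(int(0.024 * fs))
subframes = split_into_subframes(frame)
print([len(s) for s in subframes])   # three sub-frames of 192 samples (12 ms) each
```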
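Linear predictive coding coefficients are commonly obtained with the autocorrelation method and the Levinson-Durbin recursion; the sketch below shows one such computation as a generic example. The prediction order of 10 and the Hamming window are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """LPC coefficients via the autocorrelation method and Levinson-Durbin.

    Assumes a non-silent frame so that the recursion does not divide by zero.
    """
    x = frame * np.hamming(len(frame))
    # Autocorrelation lags 0..order
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]  # update previous coefficients
        a[i] = k
        err *= (1.0 - k * k)                 # updated prediction error
    return a[1:]                             # prediction coefficients a_1 .. a_p

# Example on a random 24 ms frame at an assumed 16 kHz sampling rate
frame = np.random.randn(384)
print(lpc_coefficients(frame).shape)         # (10,)
```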
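Cepstrum coefficients can be computed from the discrete Fourier transform as the inverse DFT of the log magnitude spectrum. The sketch below is a minimal version of that computation; the choice of 12 coefficients is an assumed example value.

```python
import numpy as np

def real_cepstrum(x, n_coeffs=12):
    """Real cepstrum of a (windowed) speech frame via the DFT.

    c[n] = IDFT( log |DFT(x)| ); the low-quefrency coefficients summarize
    the spectral envelope and are kept as features.
    """
    spectrum = np.fft.fft(x * np.hamming(len(x)))
    log_magnitude = np.log(np.abs(spectrum) + 1e-10)   # small offset avoids log(0)
    cepstrum = np.fft.ifft(log_magnitude).real
    return cepstrum[:n_coeffs]

# Example on a synthetic 24 ms frame at an assumed 16 kHz sampling rate
fs = 16000
t = np.arange(int(0.024 * fs)) / fs
frame = np.sin(2 * np.pi * 220 * t) + 0.1 * np.random.randn(len(t))
print(real_cepstrum(frame).shape)   # (12,)
```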
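The left-to-right HMM topologies mentioned in the abstract (3, 5, and 7 states) constrain each state to either repeat or advance to the next state. A minimal sketch of the corresponding transition matrices follows; the self-loop probability of 0.6 is only a placeholder, since in practice these probabilities are re-estimated during training (e.g., with Baum-Welch).

```python
import numpy as np

def left_to_right_transitions(n_states, p_stay=0.6):
    """Transition matrix of a left-to-right HMM: each state either stays
    in place or advances to the next state (the last state absorbs)."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay
        A[i, i + 1] = 1.0 - p_stay
    A[-1, -1] = 1.0
    return A

for n in (3, 5, 7):                     # the three topologies considered
    A = left_to_right_transitions(n)
    assert np.allclose(A.sum(axis=1), 1.0)   # rows are valid distributions
print(left_to_right_transitions(3))
```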
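The abstract does not specify the exact data fusion rule used to combine the recognizers. One common score-level scheme is a weighted average of the classifiers' per-phoneme scores followed by an arg-max decision, sketched below with hypothetical HMM and neural-network scores.

```python
import numpy as np

def fuse_scores(score_vectors, weights=None):
    """Combine per-phoneme score vectors from several classifiers by a
    weighted average (one simple score-level fusion rule)."""
    scores = np.stack(score_vectors)             # (n_classifiers, n_classes)
    if weights is None:
        weights = np.ones(len(score_vectors)) / len(score_vectors)
    fused = np.tensordot(weights, scores, axes=1)
    return int(np.argmax(fused))                 # index of the chosen phoneme class

# Hypothetical posteriors from an HMM-based and a neural-network recognizer
hmm_scores = np.array([0.2, 0.5, 0.3])
nn_scores = np.array([0.1, 0.3, 0.6])
print(fuse_scores([hmm_scores, nn_scores]))      # fused decision: class 2
```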