Author: Balaha, Hossam Magdy Hassan./ Title: Proposing a deep learning framework model for Arabic handwritten text recognition /

Title

Proposing a deep learning framework model for Arabic handwritten text recognition /

Author

Balaha, Hossam Magdy Hassan.

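Preparation committee

Researcher / Hossam Magdy Hassan Hussein Balaha (حسام مجدي حسن حسين بلحه)

Supervisor / Hesham Arafat Ali (هشام عرفات علي)

Supervisor / Mahmoud Badawy (محمود بدوي)

Examiner / Ahmed Ibrahim Mohamed Saleh (احمد ابراهيم محمد صالح)

Examiner / Ahmed Aboul-Fotouh Saleh (أحمد أبوالفتوح صالح)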

Subject

Machine learning. Neural networks (Computer science). Computer program language. Pattern recognition systems.

Publication date

2020.

Number of pages

Online resource (140 pages)

Language

English

Degree

Master's

Specialization

Systems and Control Engineering

Date of approval

1/1/2020

Place of approval

Mansoura University - Faculty of Engineering - Department of Computers and Systems Engineering

Abstract

Arabic is an important language: it is the official language of almost twenty-six countries, is one of the six official languages of the United Nations, and is spoken by more than half a billion people. Many people still take notes traditionally with pen and paper, an approach that has several disadvantages. Handwritten text is difficult to store physically and to access efficiently, and searching through it or sharing it with others is tedious. When that text is not available in digital form, significant information and important knowledge may be lost or go unused. Deep learning is one of the model-based classification techniques. Deep neural networks have brought effective improvements to text classification, detection, and learning. Deep learning has been one of the most researched areas of the last few years, and its models and techniques have reached state-of-the-art performance. There is therefore a need for solutions that use deep learning techniques and segmentation algorithms to convert physically stored handwritten text into a suitable digital format efficiently and accurately.
The objectives of the current study are: conducting a comprehensive survey of Arabic handwritten text recognition systems; proposing an overall Arabic Handwritten Text Segmentation and Recognition System (AHTSRS); suggesting an algorithm for text-to-lines segmentation named HMB-P2LS; suggesting an algorithm for lines-to-words segmentation named HMB-L2WS; suggesting an algorithm for words-to-characters segmentation named HMB-W2CS; proposing an Arabic Handwritten Character Recognition Deep Learning System (AHCR-DLS); suggesting an improved version of the AHCR-DLS named AHCR-IDLS; proposing two convolutional neural network architectures named HMB1 and HMB2; preparing and designing a large and complex dataset named HMBD; compiling and training deep learning architectures for Arabic handwritten text recognition; applying the available optimization and regularization methods; evaluating the architectures' results through different experiments on different datasets, including the newly built one; gauging the effects of changing the weight initializers, optimizers, data augmentation, and regularization on overfitting, time complexity, and performance metrics (such as accuracy, recall, precision, and F1); and performing a cross-over comparison of the current system with other similar systems to validate its generalization.

Two experiments were conducted using the HMB-P2LS algorithm, and they reported higher accuracies than other reported studies: 98.88% and 85.16% for the IAM Handwriting and KHATT datasets respectively. Sixteen experiments were conducted using the AHCR-DLS, and the conclusions were: (i) HMB1 reported the highest testing accuracy, 98.4%, with 865,840 records using augmentation on HMBD; (ii) HMB1 reported relatively higher accuracies than HMB2; and (iii) the CMATER and AIA9k datasets were used to validate generalization with data augmentation applied, and the best testing accuracies were 100% and 99.0% respectively.
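For readers unfamiliar with the performance metrics named above (accuracy, recall, precision and F1), the following is a minimal Python sketch of how they are commonly computed for a multi-class character classifier. It is not taken from the thesis: the function name `classification_metrics`, the choice of macro averaging, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Hypothetical helper for illustration; macro averaging is an assumption,
    the thesis abstract does not say which averaging scheme it used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gets a false positive
            fn[truth] += 1          # true class gets a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Illustrative labels only; these are not results from the thesis.
metrics = classification_metrics(
    ["alif", "alif", "ba", "ba"],
    ["alif", "ba", "ba", "ba"],
)
```

Macro averaging weights every class equally, which matters when some letter forms are much rarer than others; a weighted or micro average would give different numbers on an imbalanced dataset.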
The HMBD dataset is compared with other published datasets and is reported to be the first dataset to include the different positions of Arabic letters alongside the digits. The cross-over validation between the described architectures and a previous state-of-the-art architecture and dataset was performed in two phases. First, the previous control architecture could not generalize to the dataset presented in the current study. Second, the architectures described in the study generalized to the control dataset with higher accuracies (97.3% and 96.8% for HMB1 and HMB2 respectively) than the accuracy reported in the selected control study. Five experiments were conducted using the AHCR-IDLS, and the conclusions were: (i) HMB1 reported the highest testing accuracy, 92.28%, using data augmentation on HMBD; (ii) increasing the image size decreased the testing accuracy; and (iii) removing the extra boundary whitespace and omitting centering led to a relatively large drop in the testing accuracy.
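To make the two preprocessing operations mentioned in finding (iii) concrete, here is a small Python sketch of trimming boundary whitespace from a character image and of centering (centralization) on a fixed-size canvas. It is an illustration under stated assumptions, not the preprocessing used in the thesis: images are assumed to be grayscale lists of rows with a white background value of 255, and the function names `crop_to_ink` and `center_on_canvas` are hypothetical.

```python
def crop_to_ink(img, background=255):
    """Trim rows and columns that contain only background pixels.

    `img` is assumed to be a non-empty list of equal-length rows of integers.
    """
    rows = [r for r, row in enumerate(img) if any(px != background for px in row)]
    if not rows:
        # Blank image: nothing to crop, return a copy unchanged.
        return [row[:] for row in img]
    cols = [c for c in range(len(img[0])) if any(row[c] != background for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def center_on_canvas(img, size, background=255):
    """Paste `img` in the middle of a size x size background canvas."""
    height, width = len(img), len(img[0])
    if height > size or width > size:
        raise ValueError("image is larger than the canvas")
    top = (size - height) // 2
    left = (size - width) // 2
    canvas = [[background] * size for _ in range(size)]
    for r in range(height):
        canvas[top + r][left:left + width] = img[r]
    return canvas


# Illustrative 6x6 image with a 2x3 block of ink in the top-left region.
sample = [[255] * 6 for _ in range(6)]
sample[0][1:4] = [0, 0, 0]
sample[1][1:4] = [0, 0, 0]

cropped = crop_to_ink(sample)            # 2 rows x 3 columns of ink
centered = center_on_canvas(cropped, 6)  # ink moved to the middle rows
```

Cropping alone changes where the glyph sits relative to the frame, while centering puts it back in a consistent position. That is one plausible reading of why the two steps are discussed together, though the abstract itself does not explain the mechanism.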