Author: Al-Rahmawy, Mohamed Fathi Hamed. / Title: Intelligent expert system for articulate Arabic text machine reader /

Title

Intelligent expert system for articulate Arabic text machine reader /

Author

Al-Rahmawy, Mohamed Fathi Hamed.

Committee

Researcher / Mohamed Fathi Hamed Alrahmawy

Supervisor / Ali Ebrahim AlDesouki

Supervisor / Hesham Arafat Ali

Examiner / Ali Ebrahim AlDesouki

Subject

Object-oriented programming (Computer science).

Publication date

2001.

Number of pages

172 p.

Language

English

Degree

Master's

Specialization

Computer Science

Date of approval

1/1/2001

Place of approval

Mansoura University - Faculty of Engineering - Computer and Systems Engineering Department

Abstract

This thesis aims to build a fast and efficient Arabic OCR system, using object-oriented programming, as the basis of an articulating machine reader for printed Arabic text. It first outlines the basic difficulties and distinguishing characteristics of the Arabic text recognition problem. It then reviews the stages of OCR systems, the basic approaches used in each stage, and previous work on Arabic OCR. The basic concepts of neural networks and their most general models are also reviewed, followed by a summary of the back-propagation algorithm, the most widely used learning algorithm, together with its virtues and limitations.

Next, the algorithms used to implement the proposed system are described in detail. Novel algorithms are presented for document preprocessing and for extracting the chain-coded inner and outer contours of the subwords of the document and representing them as objects. The locations of these objects are then analyzed in order to segment them into separate lines. A new Arabic text segmentation algorithm is presented for segmenting the chain-coded contours of the Arabic subwords into chain-coded objects of the primitives (sub-characters) of the characters constituting these subwords. A novel fast and efficient algorithm then extracts central-moment features from the chain-coded upper and lower outer contours of the segmented primitives, in order to improve the feature extraction rate.

The segmented primitives are classified by a two-stage hybrid recognition system. The first stage uses two neural networks, embedded within the system as objects and linked with its other objects, to cluster each primitive to be recognized into one of the predefined clusters. The second stage uses a set of classifiers (one per cluster) based on statistical, structural, and heuristic rules of Arabic writing for the final classification of the primitives and for building the characters. A novel method is also presented for recombining the recognized subwords into words using the spaces between them and language rules. Finally, an Arabic word-based articulation subsystem is presented for articulating either the recognized text or text read from a file.

Keywords: OCR, Pattern Recognition, Neural Networks, Image Processing, Over-Segmentation, Preprocessing, Chain Coding, Moments, Object-Oriented Programming, Contour Processing, Hybrid Recognition System.
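
The abstract mentions chain-coded contours and central-moment features without giving details. As a rough illustration only, the sketch below decodes a Freeman 8-direction chain code into contour points and computes central moments over those points. It is not the thesis's algorithm: the direction numbering, the function names `decode_chain` and `central_moment`, and the choice to treat each contour point as a unit mass are all assumptions made for this example.

```python
# Illustrative sketch only; not the algorithm from the thesis.
# Assumed convention: Freeman 8-direction codes, 0 = east, numbered
# counter-clockwise, with the y axis pointing up.
FREEMAN_8 = [(1, 0), (1, 1), (0, 1), (-1, 1),
             (-1, 0), (-1, -1), (0, -1), (1, -1)]


def decode_chain(start, codes):
    """Turn a start point and a list of chain codes into contour points."""
    x, y = start
    points = [(x, y)]
    for code in codes:
        dx, dy = FREEMAN_8[code]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def central_moment(points, p, q):
    """Central moment mu_pq of the contour points, each weighted as 1.

    If the contour is closed (last point equals the first), the repeated
    end point is dropped so it is not counted twice.
    """
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    n = len(points)
    cx = sum(x for x, _ in points) / n  # centroid x
    cy = sum(y for _, y in points) / n  # centroid y
    return sum((x - cx) ** p * (y - cy) ** q for x, y in points)


# A 2x2 square traced counter-clockwise from the origin.
square = decode_chain((0, 0), [0, 0, 2, 2, 4, 4, 6, 6])
mu20 = central_moment(square, 2, 0)
mu02 = central_moment(square, 0, 2)
mu11 = central_moment(square, 1, 1)
```

Because central moments are taken about the centroid, they do not change when the contour is translated, which is one reason moment features are commonly used to describe character shapes.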