Title
Developing a system for documents’ maintenance and retrieval in educational institutions based on AI techniques /
Author
Habib, Islam Gamal Ahmed Badawi Mohamed.
Preparation Committee
Researcher / Islam Gamal Ahmed Badawi Mohamed Habib
Supervisor / عطا إبراهيم إمام الألفى
Examiner / حاتم مختار مختار البكري
Examiner / أحمد السيد أحمد أمين
Subject
Artificial Intelligence (AI).
Publication Date
2023.
Number of Pages
127 p.
Language
English
Degree
Master's
Specialization
Artificial Intelligence
Date of Approval
1/1/2023
Place of Approval
Mansoura University - Faculty of Specific Education - Computer Teacher Preparation Department

Abstract

Documents are of great importance to governmental and educational institutions because of their scientific, cultural, and legal value. Degraded documents suffer from background and foreground defects such as non-uniform density, dirt, spots, liquid stains, and missing words, so methods for processing degraded documents must be developed. The general aim of this thesis was to develop an intelligent system for the maintenance and retrieval of documents in educational institutions. The proposed system enhances degraded documents by removing noise and abnormal spots from the background and foreground using the adaptive thresholding technique. The Hough transform is used to deskew documents. The Maximally Stable Extremal Regions (MSER) technique is used to extract text features for identifying the text. The extracted text is spellchecked using the Levenshtein distance technique to reduce the error rate. Missing words are recovered from the extracted text by two methods: the first is an N-gram model, and the second is a set of steps that searches for other occurrences of the missing word, matching the words before and after it, within the same degraded document or in other documents belonging to the same domain. The proposed system was tested and evaluated on degraded printed documents written in English. The following performance measures were used to evaluate the system: Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), F-measure, accuracy, Negative Rate Metric (NRM), Misclassification Penalty Metric (MPM), and Distance Reciprocal Distortion Metric (DRD). High performance ratios of 99% were achieved on the resulting readable text.
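
The text post-processing steps named in the abstract (Levenshtein-based spellchecking and context-based recovery of missing words) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the edit-distance cutoff of 2, and the use of a single word of context on each side are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct_word(word, vocabulary, max_distance=2):
    """Return the closest vocabulary word within max_distance edits,
    or the word unchanged if nothing is close enough.
    max_distance=2 is an assumed cutoff, not a value from the thesis."""
    if not vocabulary or word in vocabulary:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda v: (levenshtein(word, v), v))
    return best if levenshtein(word, best) <= max_distance else word


def build_context_counts(sentences):
    """Count which word occurs between each (previous, next) word pair.
    One word of context on each side is an assumption of this sketch."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, mid, nxt in zip(tokens, tokens[1:], tokens[2:]):
            counts[(prev, nxt)][mid] += 1
    return counts


def fill_missing(prev, nxt, counts):
    """Suggest the word seen most often between prev and nxt,
    or None if that context never occurred in the reference text."""
    candidates = counts.get((prev.lower(), nxt.lower()))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]
```

In use, `correct_word` would be applied to each OCR token against a domain vocabulary, and `fill_missing` would be given the words on either side of a gap, with `counts` built from the same document or from related documents. A fuller system would presumably rank candidates by longer contexts and by word frequency as well.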