Title
A new classification strategy for diseases diagnosis
Author
Mohamed, Alaa Mostafa.
Committee
Researcher / Alaa Mostafa Mohamed
Supervisor / محي الدين أبو السعود
Supervisor / أحمد ابراهيم محمد صالح
Supervisor / دعاء عادل الطنطاوي
Examiner / رشدي أبو العزايم عبدالرسول
Subject
Covid-19 - Diagnosis. Data mining. Improved KNN.
Publication date
2022.
Number of pages
online resource (114 pages) :
Language
English
Degree
Master's
Specialization
Electrical and Electronic Engineering
Approval date
1/1/2022
Place of approval
Mansoura University - Faculty of Engineering - Electronics and Communications Engineering
Index
Only 14 of 114 pages are available for public view

Abstract

Although three years have passed since Covid-19 first appeared in Wuhan, China, in late 2019, new infections and deaths are still recorded every day, despite the measures taken by countries worldwide to limit the spread of the epidemic after the heavy losses it caused in all walks of life. Some infected people show no symptoms, or show symptoms similar to those of the common cold, so misdiagnosing these patients poses a great danger to the people with whom they are in contact. A quick and accurate diagnosis is therefore the best weapon against the spread of the epidemic, because it allows the medical authorities to provide appropriate health care for these patients and to prevent the emergence of more cases.

This thesis presents a new strategy for the effective and rapid diagnosis of Covid-19 patients. The proposed strategy consists of two main stages: the data preprocessing stage and the Covid-19 diagnosis stage. The data preprocessing stage comprises two phases, outlier rejection and feature selection. It is a critical stage in the diagnostic process because it filters inaccurate or missing data out of the dataset, allowing the diagnosis stage to provide reliable results. Outliers often have a negative effect on model training, which may eventually lead to overfitting. The Interquartile Range (IQR) technique is therefore used to detect invalid values in the datasets, and each detected value is replaced with the average value of its feature. The feature selection phase then identifies the features most suitable for the diagnosis stage using the chi-squared feature selection method. Finally, a quick and accurate diagnosis is provided by an ensemble classification model.
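The two preprocessing phases described above can be sketched in plain Python. This is a minimal illustration, not the thesis implementation: the function names, the conventional 1.5 × IQR fences, taking the replacement mean over the non-outlying values only, and treating each feature as discrete when computing the chi-squared statistic are all assumptions that the abstract does not specify.

```python
from collections import Counter
from statistics import mean, quantiles


def replace_outliers_iqr(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the inlier mean.

    Assumptions: k=1.5 (Tukey's fences) and a mean taken over inliers only.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    fill = mean([v for v in values if low <= v <= high])
    return [v if low <= v <= high else fill for v in values]


def chi2_score(feature, labels):
    """Pearson chi-squared statistic between a discrete feature and the class.

    Assumption: the feature is categorical (continuous features would need
    to be binned first).
    """
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def select_top_features(columns, labels, top):
    """Return the names of the `top` features with the highest chi-squared score."""
    ranked = sorted(columns, key=lambda name: chi2_score(columns[name], labels),
                    reverse=True)
    return ranked[:top]
```

For example, in the feature column `[10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0]` the value `95.0` falls outside the fences and is replaced by `11.5`, the mean of the remaining six values.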
The ensemble model combines three classification methods: the Naive Bayes (NB) classifier, the Decision Tree (DT) classifier, and the Improved k-Nearest Neighbor (Improved KNN) classifier. The main idea of the proposed Improved KNN classifier is to create a circle whose radius equals the average distance between the testing item and its K nearest items, and then to select the nearest M items within this circle to classify the testing item as “Covid” or “non-Covid” by majority vote. In the ensemble classification model, the three classifiers are trained simultaneously on the same training dataset, and their outputs are combined to give a final diagnosis based on a majority vote rather than on a single classifier. The model was trained on three datasets: “Data 1” and “Data 2” are blood-sample datasets, and “Data 3” is a CT-image dataset. The experimental results showed that the proposed strategy outperformed the existing strategies it was compared with, providing higher accuracy and lower error.
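The Improved KNN rule and the ensemble vote can be illustrated with a short Python sketch. It follows the description in the abstract only; the function names, the use of Euclidean distance, the default values of K and M, and the fallback to the single nearest item when rounding leaves the circle empty are assumptions. The NB and DT classifiers are not implemented here, so the ensemble function simply accepts any callables that return a label.

```python
from collections import Counter
from math import dist


def improved_knn_predict(train_X, train_y, x, k=5, m=3):
    """Classify x by majority vote of the nearest m items inside a circle
    whose radius is the mean distance from x to its k nearest items.

    Assumption: Euclidean distance; vote ties go to the nearest item's label.
    """
    ranked = sorted(
        ((dist(p, x), label) for p, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    nearest_k = ranked[:k]
    radius = sum(d for d, _ in nearest_k) / len(nearest_k)
    # Items inside the circle, nearest first; fall back to the single
    # nearest item if floating-point rounding leaves the circle empty.
    in_circle = [pair for pair in nearest_k if pair[0] <= radius] or nearest_k[:1]
    voters = in_circle[:m]
    return Counter(label for _, label in voters).most_common(1)[0][0]


def ensemble_predict(classifiers, x):
    """Majority vote over the labels returned by each classifier for x."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

With three classifiers and two classes, the ensemble vote can never tie, which is one practical reason to combine an odd number of models.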