Title
Parallel Approaches for Detecting Complex Diseases Using Deep Learning
Author
Ghanem, Sahar Ibrahim.
Committee
Researcher / Sahar Ibrahim Abdel Latif Ghanem
Supervisor / Mohamed Abdel Hamid Ismail Ahmed
drmaismail@gmail.com
Supervisor / Mohamed Salah El-Din Ibrahim Ahmed
Examiner / Magdy Hussein Nagi Mohamed
magdy.nagi@ieee.org
Examiner / Mohamed Abdel Hamid Ismail Ahmed
drmaismail@gmail.com
Examiner / Amani Anwar Saad
Subject
Computer Engineering.
Publication date
2020.
Number of pages
90 p.
Language
English
Degree
Doctorate
Specialization
Systems and Control Engineering
Approval date
14/1/2020
Place of approval
Alexandria University - Faculty of Engineering - Computer and Systems Engineering

Abstract

In bioinformatics, genome-wide association studies (GWAS) of complex diseases have recently attracted considerable research attention. Epistasis analysis examines interactions among single nucleotide polymorphisms (SNPs) and their effects on complex diseases. However, testing the enormous number of SNP interactions against a disease is computationally expensive. In this thesis, a deep learning (DL) technique is proposed for the reliable detection of two-locus SNP interactions when exposed to different environmental factors. The performance of the proposed technique is then compared with that of classical approaches such as logistic regression (LR), multifactor dimensionality reduction (MDR), and associative classification based multifactor dimensionality reduction (MDRAC). The comparison is performed with the models exposed to different types of noise: missing data, genotyping errors, genetic heterogeneity, phenocopy, or a combination of these. It is carried out on six different simulated data models in the absence of main effects. The empirical results show that the proposed DL approach gives robust and accurate results compared with the LR, MDR, and MDRAC approaches. Moreover, a real candidate type 2 diabetes mellitus dataset is used to verify the proposed model. The results show a remarkable pairwise epistasis effect for certain SNPs, even though their association effects are nonsignificant when tested individually. Furthermore, the proposed DL technique is extended to a parallel scenario that reduces the long running time the DL approach needs to detect pairwise SNP interactions in a large dataset. Hence, the GWAS problem becomes more feasible on supercomputing systems. The parallel extension is run on a supercomputer platform at the Bibliotheca Alexandrina, and both simulated and real datasets are used in testing.
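To give a sense of why exhaustive two-locus testing is expensive and why a parallel extension helps, the following minimal sketch counts the unordered SNP pairs for a given number of SNPs and deals the pairs out to workers. It is only an illustration: the function names `count_pairs` and `split_pairs` and the round-robin partitioning are assumptions made here, not the scheme used in the thesis.

```python
from itertools import combinations
from math import comb


def count_pairs(n_snps: int) -> int:
    """Return the number of unordered two-locus SNP pairs among n_snps SNPs."""
    return comb(n_snps, 2)


def split_pairs(n_snps: int, n_workers: int) -> list[list[tuple[int, int]]]:
    """Deal SNP index pairs round-robin into n_workers chunks.

    Hypothetical partitioning for illustration only; the thesis does not
    state how work is distributed on the supercomputer.
    """
    chunks: list[list[tuple[int, int]]] = [[] for _ in range(n_workers)]
    for k, pair in enumerate(combinations(range(n_snps), 2)):
        chunks[k % n_workers].append(pair)
    return chunks


if __name__ == "__main__":
    print(count_pairs(1_000))     # 499500
    print(count_pairs(500_000))   # 124999750000
    # Small example: 6 SNPs give 15 pairs, spread over 4 workers.
    print([len(chunk) for chunk in split_pairs(6, 4)])  # [4, 4, 4, 3]
```

The pair count grows quadratically, so a dataset on the order of 500K SNPs already yields roughly 1.25 x 10^11 candidate pairs. Because each pair can be scored independently, the work divides naturally among workers, whatever the actual partitioning strategy.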
Simulated datasets from 12 different models with main effects are investigated. The real Wellcome Trust Case Control Consortium (WTCCC) rheumatoid arthritis (RA) dataset of 500K SNPs is then tested. The empirical results show that the proposed parallel DL technique achieves high accuracy (Acc), specificity (Spc), and true positive rate (TPR) values. In addition, the new technique is shown to have a low false discovery rate and robust power for all simulated models. Furthermore, for the real RA dataset, the proposed technique detects two-way SNP interactions, together with promising related genes, with high accuracy owing to the parallel deep learning (PDL) architecture.
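The evaluation measures named above have standard definitions in terms of confusion-matrix counts. The sketch below computes them under those standard definitions; the function name `detection_metrics` and the example counts are made up for illustration and are not results from the thesis.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute standard metrics from confusion-matrix counts.

    tp/fp: interacting SNP pairs reported correctly / incorrectly.
    tn/fn: non-interacting pairs rejected correctly / true pairs missed.
    """
    return {
        "Acc": _ratio(tp + tn, tp + fp + tn + fn),  # accuracy
        "Spc": _ratio(tn, tn + fp),                 # specificity
        "TPR": _ratio(tp, tp + fn),                 # true positive rate
        "FDR": _ratio(fp, fp + tp),                 # false discovery rate
    }


if __name__ == "__main__":
    # Illustrative counts only, not taken from the thesis.
    metrics = detection_metrics(tp=90, fp=10, tn=880, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With these example counts, accuracy is 970/1000, specificity is 880/890, TPR is 90/110, and FDR is 10/100. A low FDR alongside a high TPR is the combination the abstract reports for the simulated models.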