Author: El-Hasnony, Ibrahim Mohamed El-Sayed./ Title: Big data analysis in IOT environment /


Title

Big data analysis in IOT environment /

Author

El-Hasnony, Ibrahim Mohamed El-Sayed.

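Committee

Researcher / Ibrahim Mohamed El-Sayed El-Hasnony (إبراهيم محمد السيد الحسنونى)

Supervisor / Sherif Ibrahim Barakat (شريف ابراهيم بركات)

Supervisor / Reham Reda Mostafa (ريهام رضا مصطفى)

Examiner / Ahmed Abou El-Fotouh Saleh (أحمد أبوالفتوح صالح)

Examiner / Mervat Abou El-Kheir (ميرفت أبوالخير)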

Subject

Information Systems. Computer communication systems.

Publication date

2020.

Number of pages

Online resource (153 pages)

Language

English

Degree

Doctorate

Specialization

Information Systems

Approval date

1/1/2020

Place of approval

Mansoura University - Faculty of Computers and Information - Department of Information Systems

Index

Only 14 of 153 pages are available for public view

Abstract

The Internet of Things (IoT) has emerged as one of the leading technological advances of our time. IoT devices generate enormous quantities of valuable data that must be processed in a timely manner to support reliable and accurate decisions. The generated data are often of poor quality: incomplete, uncertain, and produced by multiple sources. Although cloud servers can store and analyze enormous volumes of data, sending full-size data to the cloud for storage and analysis takes considerable time and incurs high overhead, which is unsatisfactory in many applications. Feature selection is one of the most important data-preprocessing tasks. Despite many attempts to build an optimal feature selection model for big data applications, the complex nature of such data means that it remains a major challenge. The high dimensionality and complexity of huge datasets can obstruct the data mining process. Feature selection is therefore a mandatory preprocessing phase that reduces dataset dimensionality, retains the most informative features, and improves classification accuracy. An exhaustive search for the relevant features, however, is time-consuming. It is also important to develop effective prediction models, especially for healthcare applications. In recent years, telemonitoring and telediagnostic devices for evaluating and tracking Parkinson’s disease (PD) have become increasingly important. Early detection of PD improves the consistency of patient treatment and ultimately makes it possible for an experienced clinician to reach a rapid diagnostic decision. Although individual non-linear machine learning techniques perform better, such models suffer from overfitting and parameter optimization problems. Hybrid models have therefore been introduced to increase predictive accuracy and overcome the weaknesses of single models.
The adaptive neuro-fuzzy inference system (ANFIS) is a soft computing approach that combines the strengths of fuzzy inference mechanisms and artificial neural networks (ANN). ANFIS offers strong generalization capacity with a quick and accurate learning process. However, training the ANFIS parameters is a critical problem in real-world implementations. The main concern of researchers designing an ANFIS model is to update its parameters so that improved precision is achieved efficiently. Several methods have been developed for training these parameters; they are generally classified as deterministic or probabilistic. Deterministic techniques, including gradient descent (GD) and the least squares estimator (LSE), are slow and fail to converge in some cases. Moreover, the standard ANFIS training approaches use gradient descent, which computes the gradient at each step through the chain rule and is therefore prone to becoming trapped in local optima. In contrast, metaheuristic algorithms are population-based and capable of global search. Each individual in the population represents a potential solution. These algorithms have produced significant progress in many areas related to optimization and can sometimes mitigate the limitations of exhaustive, time-consuming search. On the other hand, many metaheuristic algorithms suffer from entrapment in local optima, a lack of search diversity, and an imbalance between exploration and exploitation. In this thesis, we address several issues related to IoT big data challenges. First, we present a systematic review of the IoT environment with respect to big data analytics, together with its limitations and challenges. Moreover, a cloud-fog-mist combination for handling IoT data with respect to centralized and distributed data mining is explained.
A hybrid real-time remote patient monitoring framework is proposed that integrates the mist, fog, and cloud layers for healthcare, monitoring patients remotely and continuously. Second, a novel hybrid metaheuristic algorithm is proposed. This algorithm combines the exploitation and exploration capabilities of the particle swarm optimization and grey wolf optimization algorithms, respectively. Moreover, a new binary variant of the hybrid grey wolf optimization and particle swarm optimization is proposed for wrapper feature selection. The K-nearest neighbor classifier with the Euclidean distance metric is used to find the optimal solutions. A tent chaotic map helps keep the algorithm from becoming trapped in local optima. The sigmoid function is employed to convert the search space from continuous to binary so that it suits the feature selection problem. K-fold cross-validation is used to overcome overfitting. The proposed algorithm is compared with well-known algorithms such as particle swarm optimization and grey wolf optimization. Twenty datasets are used for the experiments, and statistical analyses are conducted to verify the performance and effectiveness of the proposed model using measures such as the selected features ratio, classification accuracy, and computation time. Finally, a fog-based ANFIS+PSOGWO model is proposed for Parkinson’s disease prediction. The proposed model exploits the advantages of grey wolf optimization (GWO) and particle swarm optimization (PSO) to adjust the adaptive neuro-fuzzy inference system (ANFIS) parameters, using a chaotic tent map for initialization. Fog processing is used to gather and analyze the data at the edge gateways and to notify the local community instantly.
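The tent chaotic map and the sigmoid conversion from a continuous to a binary search space, both mentioned in the abstract, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the skewed tent map with breakpoint 0.7, the stochastic sigmoid threshold, and the names `tent_map_sequence`, `sigmoid`, and `binarize` are assumptions chosen for illustration.

```python
import math
import random


def tent_map_sequence(x0, n, breakpoint=0.7):
    """Return n successive values of a skewed tent map started at x0 in (0, 1).

    The breakpoint of 0.7 is an assumption; it is a common choice in
    chaotic metaheuristics but is not stated in the abstract.
    """
    values = []
    x = x0
    for _ in range(n):
        if x < breakpoint:
            x = x / breakpoint                  # rising branch
        else:
            x = (1.0 - x) / (1.0 - breakpoint)  # falling branch
        values.append(x)
    return values


def sigmoid(x):
    """Numerically stable logistic function, mapping any real x into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binarize(position, rng):
    """Convert a continuous position vector into a 0/1 feature mask.

    Each coordinate is squashed by the sigmoid and compared with a uniform
    random draw (assumed rule): 1 keeps the feature, 0 drops it.
    """
    return [1 if sigmoid(x) >= rng.random() else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(42)
    # Chaotic values in [0, 1], rescaled to [-4, 4] as hypothetical start positions.
    chaos = tent_map_sequence(0.3, 8)
    position = [8.0 * c - 4.0 for c in chaos]
    print(binarize(position, rng))
```

In a wrapper setting, each resulting mask would then be scored by training a classifier such as K-nearest neighbor on the selected features under K-fold cross-validation.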
The proposed model is compared with other optimization methods using several evaluation metrics, namely the root mean square error (RMSE), the mean square error (MSE), the standard deviation (SD), and the accuracy, on five standard datasets from the UCI Machine Learning Repository. The results demonstrate the superiority of the proposed model over grey wolf optimization (GWO), particle swarm optimization (PSO), differential evolution (DE), the genetic algorithm (GA), ant colony optimization (ACO), and the standard ANFIS model. Moreover, the proposed ANFIS+PSOGWO was applied to Parkinson’s disease prediction and achieved an accuracy of 89.3%. It produced better outcomes than PSO, GWO, GA, ACO, DE, and some recent methods from the literature for Parkinson’s disease prediction.
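For readers unfamiliar with the evaluation metrics named above, the following Python sketch shows their standard definitions. The function names are hypothetical, and computing the SD over the RMSE values of repeated runs is an assumption, since the abstract does not say what the SD is taken over. All inputs are assumed to be non-empty sequences of equal length.

```python
import math


def mse(y_true, y_pred):
    """Mean square error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def std_dev(values):
    """Population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def accuracy(labels_true, labels_pred):
    """Fraction of predicted class labels that match the true labels."""
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the functions.
    print(mse([1, 2, 3], [1, 2, 5]))
    print(rmse([1, 2, 3], [1, 2, 5]))
    print(std_dev([0.21, 0.19, 0.20]))  # e.g. RMSE values from repeated runs
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```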