Author: Muhammad Gamal Hemeida Ibrahim Ghallab / Title: Handling missing values in logistic regression models


Title

Handling missing values in logistic regression models

Publisher

Muhammad Gamal Hemeida Ibrahim Ghallab

Author

Muhammad Gamal Hemeida Ibrahim Ghallab

Preparation committee

Researcher / Muhammad Gamal Hemeida Ibrahim Ghallab

Supervisor / Salah Mahdy Mohammed

Supervisor / Mohammed Reda Abonazal

Examiner / Ahmed Amin El-Sheikh

Examiner / Mervat Mahdy Ramadan

Publication date

2021

Number of pages

100 leaves

Language

English

Degree

Master's

Specialization

Statistics and Probability

Approval date

24/10/2021

Place of approval

Egyptian Universities Libraries Consortium - Applied Statistics and Econometrics

Abstract

Logistic regression (LR), also known as logit regression or the logit model, is a statistical model used to estimate the probability that an event occurs, given observed data. LR works with binary data, where the event either happens (1) or does not happen (0). Given some feature x, it estimates whether an event y happens, so y can be either 0 or 1, with y = 1 when the event happens. The name LR is used when the dependent variable has only two values, such as 0 and 1 or Yes and No.

This thesis presents a brief review of eight imputation methods for missing data in LR. A Monte Carlo simulation study is conducted to examine the efficiency of these eight methods when the missingness mechanism is missing at random (MAR), under different simulation factors such as sample size, missingness percentage, and number of independent variables. Moreover, real data on social network advertising are used in an empirical study to apply and examine these methods using the following criteria: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Area Under the Curve (AUC), Root Mean Square Error (RMSE), and R-squared (R^2).

The results of the simulation and empirical studies agree: the Expectation-Maximization (EM) method is very appropriate for estimating the missing data in binary logistic regression (BLR), where it has the smallest AIC and BIC values, whether the missing data are in the independent variables only, the dependent variable only, or both.
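To make the setup in the abstract concrete, the following is a minimal sketch and not the thesis's own code. It simulates one covariate and a binary outcome, deletes covariate values under a missing-at-random mechanism, imputes them, refits the logistic model, and reports AIC and BIC. Everything specific here is an assumption for illustration: the sample size, the true coefficients, the missingness probabilities, and the use of simple mean imputation as a stand-in (the abstract does not list the eight methods, and EM is not implemented here). The helper names `sigmoid`, `fit_logistic`, and `aic_bic` are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(x, y, iters=25):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson.

    Returns (b0, b1, log-likelihood).
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: beta <- beta + (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    eps = 1e-12
    loglik = 0.0
    for xi, yi in zip(x, y):
        p = min(max(sigmoid(b0 + b1 * xi), eps), 1.0 - eps)
        loglik += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return b0, b1, loglik


def aic_bic(loglik, k, n):
    """AIC = 2k - 2*loglik; BIC = k*ln(n) - 2*loglik."""
    return 2 * k - 2 * loglik, k * math.log(n) - 2 * loglik


# Assumed simulation settings (illustrative only).
n = 500
x_full = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < sigmoid(-0.5 + 1.5 * xi) else 0 for xi in x_full]

# MAR: the chance that x is missing depends only on the fully observed y.
x_obs = [
    None if random.random() < (0.4 if yi == 1 else 0.1) else xi
    for xi, yi in zip(x_full, y)
]

# Mean imputation: replace each missing x with the mean of the observed x.
observed = [xi for xi in x_obs if xi is not None]
x_mean = sum(observed) / len(observed)
x_imp = [x_mean if xi is None else xi for xi in x_obs]

results = {}
for label, xs in (("full data", x_full), ("mean-imputed", x_imp)):
    b0, b1, ll = fit_logistic(xs, y)
    aic, bic = aic_bic(ll, k=2, n=n)
    results[label] = (b0, b1, aic, bic)
    print(f"{label}: b0={b0:.3f} b1={b1:.3f} AIC={aic:.1f} BIC={bic:.1f}")
```

Both fits use the same n, so their AIC and BIC values are on a comparable scale; a comparison against listwise deletion would not be, because the sample size changes. Repeating this over many seeds, sample sizes, and missingness rates would give a small Monte Carlo study in the spirit of the one the abstract describes.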