Title
Handling missing values in logistic regression models
Publisher
Muhammad Gamal Hemeida Ibrahim Ghallab
Author
Muhammad Gamal Hemeida Ibrahim Ghallab
Preparation committee
Researcher / Muhammad Gamal Hemeida Ibrahim Ghallab
Supervisor / Salah Mahdy Mohammed
Supervisor / Mohammed Reda Abonazal
Examiner / Ahmed Amin El-Sheikh
Examiner / Mervat Mahdy Ramadan
Publication date
2021
Number of pages
100 leaves
Language
English
Degree
Master's
Specialization
Statistics and Probability
Approval date
24/10/2021
Place of approval
Egyptian Universities Libraries Consortium - Applied Statistics and Econometrics

Abstract

Logistic regression (LR), also known as logit regression or the logit model, is a statistical model used to estimate the probability that an event occurs given previously observed data. LR works with binary data, where the event either happens (1) or does not happen (0). Given some feature x, it models whether an event y occurs, so y can be either 0 or 1, with y = 1 when the event happens. The name LR is used when the dependent variable has only two values, such as 0 and 1 or Yes and No. This thesis presents a brief review of eight imputation methods for missing data in LR. A Monte Carlo simulation study was conducted to examine the efficiency of these eight methods in LR when the missingness mechanism is missing at random (MAR), under different simulation factors such as sample size, missingness percentage, and the number of independent variables. Moreover, real data on social network advertising are used in an empirical study to apply and examine these methods using the following criteria: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Area Under the Curve (AUC), Root Mean Square Error (RMSE), and R-squared (R^2). The results of the simulation and empirical studies indicate that the Expectation-Maximization (EM) method is very appropriate for estimating the missing data in binary logistic regression (BLR). The two studies agree: EM has the smallest AIC and BIC values, whether the missing data are in the independent variables only, the dependent variable only, or both.
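
The kind of simulation described in the abstract can be illustrated with a minimal sketch of a single replication. It is not the thesis's design: the two standard-normal predictors, the coefficient values, the missingness model, and the function names are all hypothetical choices made for illustration, and mean imputation is used only as a simple stand-in (it is not the EM method, and the abstract does not list which eight methods were compared). The sketch generates binary outcomes from a logistic model, removes values of one predictor with a probability that depends only on the other, always-observed predictor (a missing-at-random mechanism), imputes, refits the model, and reports AIC and BIC.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Fit logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = p * (1.0 - p)                        # Bernoulli variances
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * w[:, None])            # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def aic_bic(X, y, beta):
    """Return (AIC, BIC) computed from the Bernoulli log-likelihood."""
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    n, k = X.shape
    return 2 * k - 2 * loglik, k * np.log(n) - 2 * loglik


def simulate(n=500, miss_rate=0.2, seed=0):
    """One replication: generate data, impose MAR missingness on x2, impute, refit."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    true_beta = np.array([-0.5, 1.0, -1.0])      # hypothetical coefficients
    X_full = np.column_stack([np.ones(n), x1, x2])
    prob_y = 1.0 / (1.0 + np.exp(-(X_full @ true_beta)))
    y = rng.binomial(1, prob_y).astype(float)

    # MAR: the chance that x2 is missing depends only on the observed x1.
    # The factor 2 * miss_rate makes the average missing share roughly miss_rate
    # (assumes miss_rate <= 0.5 so the probabilities stay below 1).
    p_miss = 2.0 * miss_rate / (1.0 + np.exp(-x1))
    missing = rng.random(n) < p_miss
    x2_obs = np.where(missing, np.nan, x2)

    # Stand-in imputation: replace missing x2 values with the observed mean.
    x2_imp = np.where(missing, np.nanmean(x2_obs), x2_obs)
    X_imp = np.column_stack([np.ones(n), x1, x2_imp])

    beta_hat = fit_logistic(X_imp, y)
    return beta_hat, aic_bic(X_imp, y, beta_hat), missing.mean()


if __name__ == "__main__":
    beta_hat, (aic, bic), frac = simulate()
    print("estimated coefficients:", beta_hat)
    print("AIC:", aic, "BIC:", bic, "share missing:", frac)
```

A full Monte Carlo study would repeat `simulate` over many seeds and over a grid of sample sizes, missingness percentages, and numbers of predictors, then compare the averaged criteria across imputation methods.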