Abstract

Logistic Regression (LR), also known as logit regression or the logit model, is a statistical model used to estimate the probability that an event occurs, given previously observed data. LR works with binary outcomes: either the event happens (1) or it does not happen (0). Given some feature x, the model tries to determine whether an event y happens, so y can only be 0 or 1, with y = 1 when the event happens. The name LR is used when the dependent variable has only two values, such as 0 and 1 or Yes and No.

This thesis presents a brief review of eight imputation methods for missing data in LR. A Monte Carlo simulation study examines the efficiency of these eight methods when the missingness mechanism is missing at random, under different simulation factors: sample size, percentage of missingness, and number of independent variables. In addition, real data on social network advertising are used as an empirical study to apply and examine these methods using the following criteria: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Area Under the Curve (AUC), Root Mean Square Error (RMSE), and R-squared (R^2).

The results of the simulation study and the empirical application agree: the Expectation-Maximization (EM) method is very appropriate for estimating missing data in binary LR (BLR), yielding the smallest AIC and BIC values, whether the missing data are in the independent variables only, the dependent variable only, or both.
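To make the model and the two likelihood-based criteria named above concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a single feature x, the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, and toy data invented for illustration; the function names (`sigmoid`, `log_likelihood`, `aic`, `bic`) and the candidate coefficients are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(beta0, beta1, xs, ys):
    """Bernoulli log-likelihood of P(y = 1 | x) = sigmoid(beta0 + beta1 * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta0 + beta1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def aic(ll, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * ll


def bic(ll, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * ll


if __name__ == "__main__":
    # Toy data (assumption, not taken from the thesis): y tends to be 1 for larger x.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0, 0, 1, 1]
    # Two hypothetical candidate fits; k = 2 counts the intercept and the slope.
    for b0, b1 in [(0.0, 0.0), (-3.0, 2.0)]:
        ll = log_likelihood(b0, b1, xs, ys)
        print(b0, b1, aic(ll, 2), bic(ll, 2, len(xs)))
```

Under these definitions, smaller AIC and BIC values indicate a better trade-off between fit and model size, which is how the abstract ranks the imputation methods: each method produces a completed data set, the model is fitted to it, and the criteria are compared.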