Abstract It is well known that the cost of fixing errors escalates exponentially as a project moves through its life cycle. Assessing software by identifying buggy classes as soon as they are committed to the version control system (VCS) would significantly reduce this cost, especially for small to medium enterprises that face limited resources and strict deadlines. Mining software repositories is a growing research area in which innovative techniques and models are designed to analyze software data and uncover useful information that supports software assessment through bug prediction. Previous studies show that Deep Learning has achieved remarkable results in many fields, and it keeps evolving. This thesis proposes an extension to the work presented in “Software bug prediction using weighted majority voting techniques” by Sammar Moustafa Ibrahim Sayed et al. The proposed extension uses a larger number of instances in the datasets, studies the performance measures when feature selection is applied, and evaluates Deep Learning techniques. The results show that applying feature selection with a simple Filter approach, namely selecting the 9 or the 5 highest-ranked features out of the 17 features, slightly degraded the performance measures in most cases. In addition, the Deep Learning model achieved higher performance measures than the selected set of base classifiers on small, balanced datasets. Moreover, the performance measures for Deep Learning improved slightly on the large balanced dataset relative to its small balanced subset, both when no feature selection was applied and when the 9 highest-ranked features were selected. Nevertheless, more investigation is required to study the performance of Deep Learning on large balanced datasets as well as on small and large imbalanced datasets.
Further experiments are also needed to examine whether additional hyperparameter tuning or, in the case of imbalanced datasets, oversampling, under-sampling, and/or changing the loss function to be more sensitive to the minority class would improve the performance measures.
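As an illustration of what a loss function that is more sensitive to the minority class could look like, the following minimal sketch weights binary cross-entropy by inverse class frequency. It is not taken from the thesis: the weighting scheme, the function names `inverse_frequency_weights` and `weighted_bce`, and the toy labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: inverse-frequency weighting is one common choice;
    the thesis does not specify a weighting scheme.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_bce(labels, probabilities, weights, eps=1e-12):
    """Weighted mean binary cross-entropy.

    `probabilities` holds the predicted P(label == 1) for each sample.
    """
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        sample_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * sample_loss
        weight_sum += weights[y]
    return total / weight_sum


# Hypothetical toy data: 1 = buggy (minority), 0 = clean (majority).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
class_weights = inverse_frequency_weights(labels)
uniform_weights = {0: 1.0, 1: 1.0}

# A model that predicts "clean" for every instance.
always_clean = [0.1] * len(labels)

weighted_loss = weighted_bce(labels, always_clean, class_weights)
uniform_loss = weighted_bce(labels, always_clean, uniform_weights)
print(class_weights, weighted_loss, uniform_loss)
```

In this toy setting, a model that labels every instance as clean is penalized more heavily under the class-weighted loss than under the unweighted one, which is the intended effect of making the loss more sensitive to the minority (buggy) class.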