Title
Efficient Spam Email Filtering Based on Artificial Intelligence Methods /
Author
Sedky, Safaa Magdy Abdel-Hamid.
Committee
Researcher / Safaa Magdy Abdel-Hamid Sedky
Supervisor / Yasmine Abou El-Seoud Saleh Metwally
Supervisor / Mervat Mikhail Ragheb
mrvatmekhaeil@yahoo.com
Examiner / Mohamed Abdel-Hamid Ismail
drmaismail@gmail.com
Examiner / Salwa Kamal Abdel-Hafeez
Subject
Mathematics.
Publication date
2023.
Number of pages
156 p. :
Language
English
Degree
Doctorate
Specialization
Mathematics (Miscellaneous)
Approval date
19/3/2023
Place of approval
Alexandria University - Faculty of Engineering - Mathematics and Engineering Physics
Index
Only 14 of 190 pages are available for public view

Abstract

Spam emails pose a security threat and waste both transmission time and the time users spend reading them. They consume considerable bandwidth and storage, leading to financial losses for institutions and annoying individual users. Another type of malicious email is the phishing email, which aims to obtain sensitive information from users, leading to credential theft. This forms a challenging threat in the cybersecurity domain. Many machine learning (ML) based filters are used to classify emails as ham or spam. However, machine learning classifiers are vulnerable to adversarial attacks, in which an attacker aims to deceive an ML model. Hence, there is a compelling need to protect machine learning models against such attacks under different attack scenarios.

The aim of this thesis is to construct a real-time and accurate ML-based spam detector, in both clean and adversarial environments, capable of competing with state-of-the-art techniques. Towards this end, we investigate several machine learning classifiers as spam filters and end up proposing an artificial neural network (ANN) model that shows improved performance compared to recent related studies. Four benchmark datasets (SpamBase, the Phishing corpus, CSDMC2010, and Enron) are used in the experiments. Several feature selection methods are studied, and their effect on classifier performance is demonstrated.

Different performance measures are used for model validation and testing. Additionally, the time consumed in both the offline training and online detection stages is reported. The proposed ANN-based classifier considers the validation accuracy along with the training accuracy, achieving fast and competitive performance that promotes its use in practical scenarios. Based on the comparative studies conducted, it becomes apparent that the proposed ANN-based spam filter outperforms other state-of-the-art ML-based filters.
Next, the resilience of several traditional ML-based spam classifiers to adversarial attacks is investigated. Using the defense technique of re-training with adversarial samples, the performance of the ML-based spam filters is significantly improved, achieving an accuracy comparable to the original one in a clean environment.

Extending the experiments to include the proposed ANN model, different attack scenarios are examined, including white-box attacks (which assume that the attacker knows the ML model) and black-box attacks (which assume that the ML model is not known to the attacker). Both attacks during training time (poisoning attacks) and those occurring at testing time (evasion attacks) are considered in the experiments. The effect of varying the attack strength on the performance of the proposed ANN-based spam filter is monitored with the aid of security evaluation curves. Moreover, the validity of the transferability property of adversarial examples across different models is demonstrated: the impact of the adversarial examples on the original model (surrogate model) is almost the same for other models (target models). The experimental results show that the proposed ANN-based spam filter is not only simple and efficient, but also robust against many evasion attacks and a selected poisoning attack.
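
As a rough illustration of what a security evaluation curve for a white-box evasion attack looks like, the sketch below may help. It is not the thesis's ANN or its attack: it assumes a small logistic-regression stand-in trained on synthetic two-class data, and a simple sign-of-weight perturbation whose strength is `eps`. All names (`train_logistic`, `evade`, `security_curve`) and all data are hypothetical.

```python
import math
import random


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit a logistic-regression spam scorer with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (spam) if the linear score is positive, else 0 (ham)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def evade(w, x, eps):
    """White-box evasion: shift each feature by eps against the weight sign."""
    return [xj - eps * ((wj > 0) - (wj < 0)) for wj, xj in zip(w, x)]


def security_curve(w, b, spam, strengths):
    """Spam detection rate as a function of attack strength eps."""
    curve = []
    for eps in strengths:
        caught = sum(predict(w, b, evade(w, x, eps)) for x in spam)
        curve.append((eps, caught / len(spam)))
    return curve


# Synthetic stand-in data (assumption): spam centred at +1, ham at -1.
random.seed(0)
DIM = 5
spam = [[random.gauss(1.0, 1.0) for _ in range(DIM)] for _ in range(100)]
ham = [[random.gauss(-1.0, 1.0) for _ in range(DIM)] for _ in range(100)]

w, b = train_logistic(spam + ham, [1] * 100 + [0] * 100)
curve = security_curve(w, b, spam, [0.0, 0.5, 1.0, 1.5, 2.0])
for eps, rate in curve:
    print(f"eps={eps:.1f}  spam detection rate={rate:.2f}")
```

Because the attacker here knows the weights, every step in `eps` lowers the score of each spam sample, so the detection rate can only stay flat or fall as the attack grows stronger. A black-box variant would compute the perturbation from a separately trained surrogate model instead of from `w`.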