Abstract Because of its potential applications, Facial Expression Recognition (FER) is one of the most attractive problems in computer science. Many studies have addressed FER, but most rely on traditional machine learning techniques, which do not generalize well enough to classify expressions from unseen images or images captured in the wild. Research has therefore shifted toward deep learning techniques, which learn and capture features automatically, are robust to natural variations in the data, and generalize well. This study compares popular convolutional neural network (CNN) models used to solve the FER problem, including models based on modern CNN architectures such as ResNet, DenseNet, MobileNet, NASNetMobile, Inception, and Xception. In addition, we propose a new approach to recognizing facial expressions in an imbalanced dataset such as AffectNet: expressions are divided into groups according to the number of samples in each class, each group is recognized by an InceptionV3 model, and the ensemble aggregates all models' outputs to recognize eight facial expressions from static RGB images. We also propose weighting the categorical cross-entropy (CCE) loss function with balanced class weights; the resulting ensemble InceptionV3 model achieves an accuracy of 58%. Furthermore, we propose a new model, CNNCraft-net, which combines the advantages of CNN and traditional models: it concatenates feature outputs from a CNN, an autoencoder, and handcrafted features such as SIFT, SURF, and ORB computed through a bag of visual words (BOVW) to recognize eight facial expressions from static RGB images. Multiple metrics, including categorical accuracy, CCE loss, precision, recall, F1-score, and the confusion matrix, are used for the comparative analysis.
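The balanced class weighting described above can be illustrated with a minimal NumPy sketch. The class counts below are illustrative only (not AffectNet's actual distribution), and in a Keras pipeline the same effect is usually obtained by passing the computed weights to the `class_weight` argument of `model.fit`:

```python
import numpy as np

# Illustrative per-class sample counts for an imbalanced 8-expression
# dataset (NOT AffectNet's real counts; for demonstration only).
counts = np.array([9000, 14000, 2500, 1400, 640, 380, 2500, 375])

# Balanced class weights: n_samples / (n_classes * count_per_class),
# so rarer classes receive proportionally larger weights.
class_weights = counts.sum() / (len(counts) * counts)

def weighted_cce(y_true, y_pred, weights, eps=1e-7):
    """Categorical cross-entropy where each sample's loss is scaled
    by the weight of its true class.

    y_true: one-hot labels, shape (batch, n_classes)
    y_pred: predicted probabilities, shape (batch, n_classes)
    """
    y_pred = np.clip(y_pred, eps, 1.0)
    per_sample = -(y_true * np.log(y_pred)).sum(axis=1)
    sample_w = (y_true * weights).sum(axis=1)
    return float((sample_w * per_sample).mean())
```

With these weights, misclassifying a rare class costs more than misclassifying a frequent one, which counteracts the imbalance during training.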
The proposed models are evaluated on the AffectNet and FER2013 datasets, where the proposed CNNCraft-net model achieves an accuracy of 61.9% for eight expressions and 65% for seven expressions on AffectNet, and 69% on FER2013.
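The BOVW step used by CNNCraft-net can be sketched as follows. In a real pipeline the visual vocabulary is learned by clustering SIFT/SURF/ORB descriptors (e.g. with k-means), and the descriptors come from a library such as OpenCV; here random vectors stand in for both, so only the quantize-and-histogram logic is shown:

```python
import numpy as np

rng = np.random.default_rng(0)

n_words = 32  # vocabulary size (a free parameter of the pipeline)
# Stand-in vocabulary: in practice these centroids come from running
# k-means over descriptors extracted from the training images.
vocab = rng.normal(size=(n_words, 128))

def bovw_histogram(descriptors, vocab):
    """Assign each local descriptor to its nearest visual word and
    return an L1-normalized word-frequency histogram for the image."""
    d2 = ((descriptors[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / hist.sum()

# Stand-in for one image's SIFT descriptors (128-D per keypoint).
descriptors = rng.normal(size=(40, 128))
feature_vec = bovw_histogram(descriptors, vocab)
# feature_vec is a fixed-length vector that can be concatenated with
# the CNN and autoencoder features before the final classifier.
```

The point of the histogram is that images with different numbers of keypoints all map to the same fixed-length representation, which makes concatenation with the CNN and autoencoder feature vectors straightforward.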