Title
Acceleration of Deep Neural Networks for Image Classification Applications
Author
Alabassy, Bassma Said Helmy
Thesis Committee
Researcher / Bassma Said Helmy Alabassy
Supervisor / Mohamed Watheq Ali Kamel El-Kharashi
Examiner / Hossam Ali Hassan Fahmy
Examiner / Mahmoud Ibrahim Khalil
Publication Date
2020
Number of Pages
116 p.
Language
English
Degree
Master's
Specialization
Electrical and Electronic Engineering
Approval Date
1/1/2020
Approval Venue
Ain Shams University - Faculty of Engineering - Computer Engineering
Table of Contents
Only 14 pages (from 129) are available for public view.
Abstract

Deep Neural Networks are a recent trend currently explored as an innovative solution to complex problems in both industrial markets and research. Image classification is among the core domains of deep learning that achieved significant improvements in classification accuracy. However, these improvements result from going deeper in network design by incorporating more layers, which adds to the computational load of the network. Thus, the increasing sizes of today's networks are causing a huge bottleneck in training time and inference time.
Hardware acceleration is used to reduce the overhead of computation by moving the required tasks during training and inference from CPUs to a hardware platform. These hardware platforms include domain-specific architectures specially designed for such network loads. The softmax layer is a well-known type of non-linear activation layer. It is considered a key layer not only in most image classification networks, but also in other classification domains more generally. Because the softmax layer is composed of complicated exponentials and includes multiple division operations, accelerating it efficiently is a challenging task.
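To make the challenge concrete, the following minimal Python sketch (not from the thesis) shows the arithmetic a softmax layer performs: one exponential per input and one division per output, the two operations that are costly to realize in hardware.

```python
import math

def softmax(logits):
    """Reference (floating-point) softmax over a list of scores."""
    # One transcendental exponential per input element.
    exps = [math.exp(x) for x in logits]
    # A summation, then one division per element for normalization.
    total = sum(exps)
    return [e / total for e in exps]

# Example: the largest logit receives the largest probability,
# and the outputs sum to 1.
probs = softmax([2.0, 1.0, 0.1])
```

In a hardware accelerator, each `math.exp` and each division would map to dedicated (or approximated) arithmetic units, which is why optimized softmax architectures focus on simplifying exactly these two operations.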
The purpose of this thesis is to propose an optimized architecture for the softmax layer to be used in hardware acceleration for any multi-category image classification task. The target of the hardware model is to preserve the classification accuracy while achieving a balance in the trade-off between design performance and its resulting cost. Multiple schemes are used in the design to optimize the computational load by observing the input patterns to the softmax layer. These patterns form the basis for selecting a suitable input-downscaling method. The area overhead is also optimized by reducing the precision of some arithmetic operations according to their contribution to the classification accuracy in the mathematical representation.
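The thesis does not spell out its input-downscaling method in this abstract; a common scheme in softmax hardware designs, sketched below as an assumption, is to subtract the maximum input from every element. The shifted inputs are all non-positive, so every exponential falls in (0, 1] and can be represented with a small fixed-point range, while the result is mathematically unchanged because the common factor cancels in the normalization.

```python
import math

def softmax_downscaled(logits):
    """Softmax with max-subtraction input downscaling (a standard
    numerical/hardware technique, assumed here for illustration)."""
    m = max(logits)
    # After the shift, x - m <= 0, so exp(x - m) is in (0, 1]:
    # a bounded range that is friendly to fixed-point hardware.
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    # The factor exp(-m) appears in numerator and denominator alike,
    # so the output probabilities are identical to the unshifted form.
    return [e / total for e in exps]
```

The same trick also prevents overflow for large logits, which is why it doubles as the standard numerically stable softmax in software libraries.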
The architecture of the model proposed in this thesis is implemented in Verilog HDL. A setup for assessing the model is also implemented to provide a sensible estimation of performance. From the open-standard benchmarks available, a dataset is selected and used for the assessment in comparison with a standard reference and previous implementations of the layer. The methods used in this work achieve a 99.13% classification accuracy using the hardware layer, the same predictive accuracy obtained by the reference layer.