Title
GPU Implementation of High Speed JPEG Using Modern Signed DCT Algorithms
Author
Haweel, Reem Tarek.
Committee
Researcher / Reem Tarek Haweel
Supervisor / Hassan Hassan Ramadan
Supervisor / Wail Shawki El-Kilani
Examiner / Wail Shawki El-Kilani
Publication date
2015
Number of pages
85 p.
Language
English
Degree
Master's
Specialization
Computer Science
Date of approval
1/1/2015
Place of approval
Egyptian Universities Libraries Consortium - Computer Systems
Abstract

Recent developments in real-time image and video processing for
multimedia and communication systems require fast image and video
compression techniques. Two-dimensional Discrete Cosine Transforms
(2D-DCT) are widely used in modern compression standards for their high
power compaction properties. Multiplier-free approximate DCT transforms
have been developed to run faster than the exact DCT while
maintaining comparable levels of power compaction. These approximate DCT
transforms can be realized efficiently in digital very-large-scale
integration (VLSI) hardware using only addition, subtraction and shift
operations. They can also be implemented efficiently on parallel
processors to achieve high speedups. The Graphics Processing Unit
(GPU) is one of the most powerful parallel processing platforms, especially
when programmed with the Compute Unified Device Architecture (CUDA). GPUs
and CUDA are employed in many modern image- and video-based
systems and computer applications. For large data sets, as in image and
video processing, the GPU is much faster than the CPU. The work
presented in this thesis is twofold.
The first part introduces an efficient, low-complexity, multiplierless
8-point approximate DCT transform. The proposed transform is derived by
applying the signum function to a transform with high power
compaction capabilities. The signum function eliminates the shift operations
required by that transform, which reduces hardware and
software errors and speeds up implementation. A fast implementation of the
proposed transform is provided. Only 17 additions are required for both
the forward and the inverse 8-point transformations. The compaction and
compression properties of the proposed transform are demonstrated
through simulations on databases of grayscale images of different sizes.
The proposed algorithm is shown to outperform the most recent
competing transforms.
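
To make the signum idea concrete, the following Python sketch applies the signum function element-wise to a transform matrix and then uses the result with additions and subtractions only. The abstract does not name the base transform, so the sketch assumes the standard orthonormal 8-point DCT-II matrix purely for illustration; the function names (`dct_matrix`, `signed_matrix`, `apply_signed`) are hypothetical. The naive matrix-vector product shown here needs 56 additions or subtractions per 8-point vector and is not the 17-addition fast algorithm of the thesis, which is not reproduced here.

```python
import math

N = 8

def dct_matrix(n=N):
    """Orthonormal n-point DCT-II matrix (assumed base transform, for illustration)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def signum(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def signed_matrix(m):
    """Apply signum to every entry; for the 8-point DCT every entry becomes +1 or -1."""
    return [[signum(v) for v in row] for row in m]

def apply_signed(matrix, x):
    """Multiply a +/-1 matrix by a vector using only additions and subtractions."""
    return [sum(v if s > 0 else -v for s, v in zip(row, x)) for row in matrix]

T = signed_matrix(dct_matrix())
coeffs = apply_signed(T, [52, 55, 61, 66, 70, 61, 64, 73])  # sample pixel row
```

Because every entry of `T` is +1 or -1, no multiplications or shifts appear in `apply_signed`, which is the property the abstract attributes to the signum operator.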
In the second part of the thesis, a fast and efficient GPU
implementation of the proposed transform is provided using CUDA
programming. The details of the proposed GPU architecture and the
CUDA modules employed are examined. Performance evaluations show
that the suggested implementation of the transform proposed in
the first part is much faster than other approximate DCT transforms in
real-time Joint Photographic Experts Group (JPEG)-like compression. The
proposed GPU implementation achieves high speedups over a conventional CPU
implementation, ranging from 151x to 202x depending on
image size.
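
The abstract does not describe the GPU architecture itself, but the reason JPEG-like compression suits a GPU can be sketched: the image is split into 8x8 blocks, and each block is transformed independently of all others. The Python sketch below shows that structure sequentially. It assumes a +/-1 matrix built from the signs of the 8-point DCT-II basis (an illustrative stand-in, not necessarily the thesis's transform), image dimensions that are multiples of 8, and hypothetical helper names throughout. Mapping each block to its own group of GPU threads is likewise an assumption about how such a loop would typically be parallelized, not a description of the thesis's CUDA modules.

```python
import math

N = 8

# Illustrative +/-1 matrix: signs of the 8-point DCT-II basis (an assumption).
T = [[1 if math.cos((2 * j + 1) * k * math.pi / (2 * N)) > 0 else -1
      for j in range(N)] for k in range(N)]

def matmul(a, b):
    """Plain matrix product; with +/-1 entries in T this reduces to adds/subtracts."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def transform_block(block):
    """Separable 2D transform of one 8x8 block: T * block * T^T."""
    return matmul(matmul(T, block), transpose(T))

def split_blocks(image):
    """Map (row, col) of each block's top-left corner to its 8x8 pixel block."""
    h, w = len(image), len(image[0])
    return {(r, c): [row[c:c + N] for row in image[r:r + N]]
            for r in range(0, h, N) for c in range(0, w, N)}

def transform_image(image):
    # No block depends on any other, so these iterations could all run
    # concurrently; here they simply run one after another on the CPU.
    return {pos: transform_block(b) for pos, b in split_blocks(image).items()}
```

A call such as `transform_image(image)` on a list-of-lists grayscale image returns one coefficient block per 8x8 tile, keyed by the tile's top-left position.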