Author: Haweel, Reem Tarek. / Title: GPU Implementation of High Speed JPEG Using Modern Signed DCT Algorithms


Title

GPU Implementation of High Speed JPEG Using Modern Signed DCT Algorithms

Author

Haweel, Reem Tarek.

Committee

Researcher / Reem Tarek Haweel

Supervisor / Hassan Hassan Ramadan

Supervisor / Wail Shawki El-Kilani

Examiner / Wail Shawki El-Kilani

Publication date

2015.

Number of pages

85 p.

Language

English

Degree

Master's

Specialization

Computer Science

Date of approval

1/1/2015

Place of approval

Egyptian Universities Libraries Consortium - Computer Systems

Index

Only 14 pages are available for public view

Abstract

Recent developments in real-time image and video processing for
multimedia and communication systems require fast image and video
compression techniques. Two-dimensional discrete cosine transforms
(2D-DCT) are widely used in modern compression standards for their high
power compaction properties. Multiplier-free approximate DCTs have been
developed to run faster than the exact DCT while maintaining comparable
levels of power compaction. These approximate transforms can be realized
efficiently in digital very-large-scale integration (VLSI) hardware using
only addition, subtraction and shift operations. They can also be
implemented efficiently on parallel processors to achieve high speedups.
The Graphics Processing Unit (GPU) is the most powerful parallel
processing tool, especially when programmed with the dedicated Compute
Unified Device Architecture (CUDA). GPUs and CUDA are employed in many
modern image and video based systems and computer applications. For large
data sets, as in image and video processing, the GPU is much faster than
the CPU. The work presented in this thesis is twofold.
The first part introduces an efficient, low-complexity, multiplierless
8-point approximate DCT. The proposed transform is derived by applying the
signum function to a transform with high power compaction capabilities.
The signum function cancels the shift operations required by that
transform, which reduces hardware and software errors and speeds up
implementation. A fast implementation of the proposed transform is
provided. Only 17 additions are required for both forward and backward
8-point transformations. The compaction and compression properties of the
proposed transform are demonstrated through simulations on databases of
grayscale images of different sizes. The proposed algorithm is shown to
outperform the most recent competing transforms.
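The idea of deriving a multiplierless transform with the signum function can be illustrated with a minimal Python sketch. The abstract does not name the base transform or give the 17-addition factorisation, so this sketch assumes the exact 8-point DCT-II basis as a stand-in and evaluates the result directly. The helper names (`dct_matrix`, `signed_matrix`, `apply_signed`) are hypothetical and chosen only for illustration.

```python
import math

N = 8  # transform length mentioned in the abstract

def dct_matrix(n=N):
    """Unnormalised DCT-II basis (stand-in base transform, an assumption)."""
    return [[math.cos((2 * j + 1) * k * math.pi / (2 * n)) for j in range(n)]
            for k in range(n)]

def signum(x):
    """Sign of x as an integer: -1, 0 or +1."""
    return (x > 0) - (x < 0)

def signed_matrix(base):
    """Apply signum element-wise, leaving only -1, 0 and +1 entries."""
    return [[signum(v) for v in row] for row in base]

def apply_signed(matrix, samples):
    """Transform a vector using additions and subtractions only."""
    out = []
    for row in matrix:
        acc = 0
        for s, x in zip(row, samples):
            if s > 0:
                acc += x
            elif s < 0:
                acc -= x
        out.append(acc)
    return out

SIGNED = signed_matrix(dct_matrix())

if __name__ == "__main__":
    for row in SIGNED:
        print(" ".join(f"{v:+d}" for v in row))
    print(apply_signed(SIGNED, [10, 12, 14, 16, 18, 20, 22, 24]))
```

Because every entry of the resulting matrix is +1 or -1, no multiplications or shifts remain. This direct row-by-row evaluation costs 56 additions per 8-point transform (7 per output); a fast algorithm such as the one the thesis describes would share partial sums between rows to reduce that count.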
In the second part of the thesis, a fast and efficient GPU implementation
of the proposed transform is provided using CUDA programming. The details
of the proposed GPU architecture and the CUDA modules employed are
investigated. Performance evaluations show that the suggested
implementation of the transform proposed in the first part is much faster
than other approximate DCTs in real-time Joint Photographic Experts Group
(JPEG)-like compression. The proposed GPU implementation achieves high
speedups over a conventional CPU implementation, ranging from 151x to 202x
depending on image size.
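The block structure that makes JPEG-like transforms well suited to a GPU can be sketched on the CPU as follows. This is not the thesis's CUDA code: it is a plain Python illustration, assuming 8x8 blocks, image dimensions that are multiples of 8, and the sign of the exact DCT-II basis as a stand-in transform. The names `T`, `transform_block` and `transform_image` are hypothetical.

```python
import math

B = 8  # block size used by JPEG-like codecs

# Stand-in +/-1 transform: signs of the exact 8-point DCT-II basis (an assumption).
T = [[1 if math.cos((2 * j + 1) * k * math.pi / (2 * B)) > 0 else -1
      for j in range(B)] for k in range(B)]

def matmul(a, b):
    """Plain matrix product; multiplying by +/-1 is written as '*' for brevity."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def transform_block(block):
    """Separable 2D transform of one 8x8 block: T * X * T^T."""
    return matmul(matmul(T, block), transpose(T))

def transform_image(image):
    """Transform every 8x8 block independently of all the others."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    # No block depends on another, so on a GPU each (by, bx) pair
    # could be assigned to its own group of threads.
    for by in range(0, h, B):
        for bx in range(0, w, B):
            block = [row[bx:bx + B] for row in image[by:by + B]]
            coeffs = transform_block(block)
            for r in range(B):
                out[by + r][bx:bx + B] = coeffs[r]
    return out

if __name__ == "__main__":
    flat = [[3] * 16 for _ in range(16)]
    print(transform_image(flat)[0][:8])
```

The two nested loops over `by` and `bx` carry no data dependencies, which is the property a parallel implementation would exploit. How the thesis actually maps blocks to CUDA threads is not stated in the abstract.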