Author: Emad Saddad Abdelhakiem Hussain/ Title: Towards a novel data warehouses architecture /

Title

Towards a novel data warehouses architecture /

Publisher

Emad Saddad Abdelhakiem Hussain

Author

Emad Saddad Abdelhakiem Hussain

Preparation committee

Researcher / Emad Saddad Abdelhakiem Hussain

Supervisor / Hoda Mokhtar Omar Mokhtar

Supervisor / Osman Hegazy

Supervisor / Ali Hamed Elbastawesy

Publication date

2021

Number of pages

74 leaves

Language

English

Degree

Doctorate

Specialization

Information Systems

Approval date

14/11/2021

Place of approval

Cairo University - Faculty of Computers and Information - Information Systems

Only 14 pages are available for public view

Abstract

A traditional Data Warehouse (DW) is a centralized repository of non-volatile, subject-oriented, non-operational, integrated, and time-variant data drawn from different heterogeneous data sources. A DW is specifically developed to support decision making, analysis, data mining, and ad hoc queries. The structure and the volume of data stored on computer systems have recently been growing at an accelerated rate. Traditional DWs struggle to cope with such environments: their architecture is based on relational Database Management Systems (DBMSs), and they suffer from growing data volumes, high disk space usage, slow query response times, and complicated administration. Furthermore, DWs depend on a static number of external data sources that may be incomplete, may not use the same definitions, and are not always available. Therefore, there is an essential need to adjust the traditional DW architecture to meet the modern challenges imposed by massive data volumes and other aspects of big data. Further, a new architecture needs to address existing drawbacks in availability, scalability, and query efficiency. This thesis introduces a novel DW architecture, called Lake Data Warehouse Architecture, that aims to resolve the previously mentioned challenges of traditional DWs. Lake Data Warehouse Architecture integrates the existing DW architecture with advanced technologies, such as the Hadoop framework and Apache Spark, in a novel and efficient hybrid solution. The main advantage of the proposed Lake Data Warehouse Architecture is that it combines the existing features of traditional DWs with big data features by joining the traditional DW with the Hadoop and Spark ecosystems. In addition, it is suited to handling massive amounts of data while maintaining reliability, scalability, and availability.