Abstract

Data management has changed with the mass production of data and the emergence of cloud computing. Many applications must interact with a variety of heterogeneous data stores, depending on the type of data they manage: relational data, documents, graph data from social networks, simple key-value data, and so on. Developers of applications that rely on multiple data stores face many challenges arising from heterogeneous data models and APIs. Because different systems expose different APIs, programmers must be proficient in each of them. Developing, deploying, and migrating multi-store applications in cloud environments would be much easier with a single integrated set of models, algorithms, and tools. In this thesis, we propose a Framework for Automatic Processing Multi-Store Queries (FAPMQ) that can be used in cloud environments and in any big data application as an automatic tool to assist developers in managing basic and complex database queries. FAPMQ is composed of three layers: an input layer, a matching selector layer, and a processing and output layer. The matching selector layer compares five queries from the user's query vector with the queries stored in the framework's libraries to decide which SQL or NoSQL database engine should be used. An algorithm has been developed for this purpose. The algorithm creates two matrices: the first is liberary_QueryMatrix [n_LiberaryQueryEngine+1, nEngine] and the second is matchedQueryMatix [2, NEngine]. The selected database engine is the one whose five queries in liberary_QueryMatrix match the five user queries. Hadoop and Spark environments are used to reduce the execution time of this process. The third layer is responsible for executing the queries. FAPMQ is evaluated with different queries on SQL and NoSQL database engines.
FAPMQ is evaluated on queries against different NoSQL databases in terms of optimization performance and query execution time, using reference datasets in three scenarios. The experiments showed that FAPMQ achieves its best accuracy, 98%, on the ego-Facebook dataset when retrieving 500,000 records, with an execution time of 1.92 seconds, less than that of the comparable systems.
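
The matching selector step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not say how two queries are judged to match, so the `normalize` rule (case- and whitespace-insensitive equality), the function names, and the assumed matrix layout (row 0 holding engine names, one column per engine) are all hypothetical choices made for illustration. Only the two matrix shapes and the five-match selection rule come from the text.

```python
from typing import Dict, List, Optional

# From the abstract: the selected engine is the one with five matched queries.
REQUIRED_MATCHES = 5


def normalize(query: str) -> str:
    """Hypothetical matching rule: ignore case and extra whitespace."""
    return " ".join(query.lower().split())


def build_library_matrix(library: Dict[str, List[str]]) -> List[List[str]]:
    """Build an (n_queries + 1) x n_engines matrix.

    Assumed layout: row 0 holds the engine names, and each following row
    holds one stored query per engine (empty string where an engine has fewer).
    """
    engines = list(library)
    n_rows = max(len(queries) for queries in library.values())
    matrix = [engines]
    for i in range(n_rows):
        matrix.append([
            normalize(library[e][i]) if i < len(library[e]) else ""
            for e in engines
        ])
    return matrix


def select_engine(user_queries: List[str],
                  library_matrix: List[List[str]]) -> Optional[str]:
    """Return the first engine whose stored queries match all five user queries."""
    engines = library_matrix[0]
    wanted = {normalize(q) for q in user_queries}

    counts = []
    for col in range(len(engines)):
        stored = {row[col] for row in library_matrix[1:] if row[col]}
        counts.append(len(wanted & stored))

    # 2 x n_engines matrix: engine names in row 0, match counts in row 1.
    matched_matrix = [engines, counts]

    for engine, count in zip(*matched_matrix):
        if count >= REQUIRED_MATCHES:
            return engine
    return None  # no engine reached five matches
```

Calling `select_engine` with five user queries and a library matrix built by `build_library_matrix` returns the engine name to hand to the processing layer, or `None` when no engine matches all five. Exact string matching is the simplest possible rule; a real selector would more plausibly compare query structure or keywords.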