Title
Predicting itemsets’ support of association rules mining
Publisher
Shenoda Nabil Atalla Guirguis
Author
Guirguis, Shenoda Nabil Atalla
Subject
Databases. Computer science.
Publication date
2006
Number of pages
ii-xii + 65 p.

Abstract

Much effort has been dedicated to improving the efficiency of data mining techniques. Data mining, the basic step in knowledge discovery in databases, aims to extract the information hidden in a large transactional or relational database in the form of associations, clusters, trends, classifications, and outliers. The data mining literature conventionally divides the association mining problem into two subproblems: detecting frequent itemsets, and using the detected itemsets to generate mining rules. Most research in data mining has proposed techniques to improve the discovery of frequent itemsets.
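The two subproblems can be illustrated with a minimal Python sketch. It is a brute-force illustration of the general idea, not the method used in the thesis; the function names `frequent_itemsets` and `generate_rules`, the thresholds, and the sample transactions are all assumptions made for this example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Subproblem 1: return {itemset: support} for itemsets with support >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = fraction of transactions that contain the candidate.
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return result

def generate_rules(frequent, min_confidence):
    """Subproblem 2: derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so frequent[a] exists.
                confidence = support / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules

# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.9)
```

With this sample data, the only rule that reaches the confidence threshold is butter → bread. Real miners avoid enumerating every candidate, but the split into "find frequent itemsets, then derive rules from them" is the same.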
This thesis proposes a novel research dimension in the field of data mining: predicting itemsets’ support before the actual data arrives. Different techniques could be used to perform the prediction. In this thesis, a time series analysis approach is used to detect maximal frequent trend patterns in the history of an itemset’s support, predict the future trend, and hence the future support. Once the itemset support is predicted, mining rules can be easily deduced. The proposed algorithm is called "MFTP", which stands for Maximal Frequent Trend Pattern. The MFTP algorithm works in an incremental environment and applies the Efficient Counting Using TIDLists (ECUT) algorithm for incremental mining, to detect mining rules and to update the History Log. The History Log consists of local summaries, which are the large and negative border itemsets of each database increment, and a global summary, which is the large and negative border itemsets of the whole database. The prediction technique uses this History Log to build the itemset’s trend sequence, which is analyzed to detect maximal frequent trend patterns.
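The abstract does not give the details of MFTP, so the following Python sketch only illustrates the general idea of turning a support history into a trend sequence and reusing recurring trend patterns for prediction. Everything in it is an assumption made for illustration: the three-symbol encoding ("U" up, "D" down, "S" stable), the `tolerance` threshold, the definition of a pattern as a contiguous run of symbols, the suffix-matching prediction rule, and the sample support values.

```python
from collections import Counter

def trend_sequence(supports, tolerance=0.01):
    """Encode consecutive support changes as 'U' (up), 'D' (down) or 'S' (stable)."""
    symbols = []
    for prev, curr in zip(supports, supports[1:]):
        delta = curr - prev
        if delta > tolerance:
            symbols.append("U")
        elif delta < -tolerance:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)

def maximal_frequent_patterns(sequence, min_count=2, min_length=2):
    """Contiguous patterns seen at least min_count times (overlaps allowed)
    that are not contained in a longer frequent pattern."""
    counts = Counter(
        sequence[i:i + length]
        for length in range(min_length, len(sequence) + 1)
        for i in range(len(sequence) - length + 1)
    )
    frequent = {p for p, c in counts.items() if c >= min_count}
    return {p for p in frequent if not any(p != q and p in q for q in frequent)}

def predict_next_trend(sequence, patterns):
    """Find the longest proper prefix of any pattern that matches the end of the
    sequence and return the symbol that follows it in that pattern (or None)."""
    best = None
    for pattern in patterns:
        for k in range(len(pattern) - 1, 0, -1):
            if sequence.endswith(pattern[:k]):
                if best is None or k > best[0]:
                    best = (k, pattern[k])
                break
    return best[1] if best else None

# Hypothetical support history of one itemset across database increments.
supports = [0.10, 0.15, 0.20, 0.12, 0.18, 0.25, 0.14, 0.20]
sequence = trend_sequence(supports)            # "UUDUUDU"
patterns = maximal_frequent_patterns(sequence)  # {"UUDU"}
next_trend = predict_next_trend(sequence, patterns)
```

For this sample history the sketch predicts an upward trend for the next increment. Mapping the predicted trend back to a numeric support value, as the thesis describes, is not shown here.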
The performance study showed that the proposed technique performs very well: prediction accuracy and response times are reasonable. The approach can be valuable for the information it provides, especially in fields such as bioinformatics and medicine, where the temporal factor in associations is important.