Abstract

Many recent applications, such as sensor networks, generate continuous, time-varying data known as data streams. Efficient query processing over such streams faces additional constraints, because the data are uncertain in nature and require fast, timely processing. Traditional query processing is not applicable to data streams, so data clustering is needed as a preprocessing step. Data streams also often suffer from incompleteness and high dimensionality. In this thesis, we introduce a framework for efficiently answering queries over incomplete, high-dimensional data streams. The proposed framework handles incompleteness by estimating missing values from the intervals of the corresponding nearest neighbors. A continuous clustering mechanism is adopted and extended to handle incomplete data streams accurately, and the framework provides an improved subspace clustering to deal with high-dimensional data streams. The framework was compared against alternative approaches using two different data sets. The experimental results showed that the proposed framework outperformed the compared algorithms by 7.9% on average, and that the clustering of such incomplete, high-dimensional data streams improved by 62% over the compared algorithms, owing to the proposed clustering improvements.
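The abstract does not give the details of the estimation step, so the following is only a minimal sketch of one way missing values could be estimated from nearest-neighbor intervals. It assumes that neighbors are found by Euclidean distance over the dimensions observed in both points, that the interval is the minimum and maximum of the k nearest neighbors' values in the missing dimension, and that the estimate is the interval midpoint. The names `observed_distance`, `neighbor_interval`, and `impute`, and the parameter `k`, are hypothetical and do not come from the thesis.

```python
import math
from typing import List, Optional, Sequence, Tuple

# A stream point; None marks a missing value (an assumption of this sketch).
Point = Sequence[Optional[float]]


def observed_distance(a: Point, b: Point) -> float:
    """Euclidean distance over the dimensions observed in both points."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common observed dimension to compare on
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def neighbor_interval(
    point: Point, window: Sequence[Point], dim: int, k: int = 3
) -> Optional[Tuple[float, float]]:
    """[min, max] of dimension `dim` among the k nearest points that observe it."""
    candidates = [p for p in window if p[dim] is not None]
    candidates.sort(key=lambda p: observed_distance(point, p))
    nearest = candidates[:k]
    if not nearest:
        return None
    values = [p[dim] for p in nearest]
    return min(values), max(values)


def impute(point: Point, window: Sequence[Point], k: int = 3) -> List[Optional[float]]:
    """Fill each missing dimension with the midpoint of its neighbor interval."""
    filled = list(point)
    for dim, value in enumerate(point):
        if value is None:
            interval = neighbor_interval(point, window, dim, k)
            if interval is not None:  # leave the gap if no neighbor observes dim
                low, high = interval
                filled[dim] = (low + high) / 2.0
    return filled


# Usage: the current window of recent stream points and one incomplete arrival.
window = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 100.0],
    [1.2, 1.9, 4.0],
]
incomplete = [1.0, 2.0, None]
estimated = impute(incomplete, window, k=3)
```

In this sketch the far-away point `[9.0, 9.0, 100.0]` is excluded from the three nearest neighbors, so the interval for the third dimension is bounded by the nearby values only. A midpoint is just one possible choice; the interval itself could also be kept and passed on to the clustering step as an uncertain value.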