Title
Web Objects Classification with Social Tags Exploration /
Author
Hedar, Lamiaa Mohamed Rady.
Preparation committee
Researcher / لمياء محمد راضي حيدر
Supervisor / محب رمزي جرجس
Supervisor / أحمد سويلم أحمد
Subject
Social sciences - Computer simulation. Social sciences - Mathematical mode.
Publication date
2018.
Number of pages
117 p. :
Language
English
Degree
Master's
Specialization
Computer Science
Date of approval
1/1/2018
Place of approval
Minia University - Faculty of Science - Computer Science

Abstract

Web pages may contain, in addition to textual web objects, various non-textual web objects, such as photos, videos, and products. Automatically classifying these web objects into semantic categories is very important for indexing, browsing, searching, and mining them. This is a very challenging task, however, because web objects often lack easily extractable features that carry semantic information, interconnections with one another, and training examples with category labels.
Social network systems allow users to annotate the web objects they are interested in with descriptive tags. These tags are effectively utilized for information sharing and retrieval. A study of a large number of user-generated tags in social network systems, such as del.icio.us, revealed that, in general, user-generated tags are consistent with the web objects they are attached to, while being more concise and closer to users' understanding and judgments of the objects. Hence, social tags reflect the semantics of web objects from the users' points of view, using a ubiquitous vocabulary for heterogeneous domains of objects. This property makes social tags an ideal type of data for overcoming the difficulties of web object classification.
In this thesis, we explore the impact of social tagging on the performance of text classification techniques in web object classification. An automated system for web object classification has been developed, mainly to assist us in the social tag exploration task. However, it can also be used as a web object classification system that allows users to build and evaluate a predictive model using one of several classification methods, such as Decision Tree, Support Vector Machine, or Naïve Bayes. This model can later be used to assign labels to web objects based on social tags and other features or attributes, such as URLs and titles. The system can also apply several different learners to a data set and compare their performance in order to choose one for prediction.
The developed system has three phases: data preprocessing, classification, and evaluation. It accepts a training dataset that represents a set of web pages with their URLs, tags, titles, and categories. Using this dataset, the system constructs a predictive model that is later used to assign labels to web objects based on their tags. In the classification phase, the system employs three well-known text classification techniques, namely Support Vector Machine, Naïve Bayes, and Decision Tree, through the WEKA software.
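The three phases described above can be illustrated with a minimal sketch. It is not the thesis's implementation, which relies on WEKA; it is a plain-Python stand-in that assumes a multinomial Naïve Bayes classifier with add-one smoothing over tags. The function names (`preprocess`, `train_naive_bayes`, `classify`, `accuracy`) and the toy records are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def preprocess(tags):
    """Preprocessing phase (assumed): lower-case tags and drop empty ones."""
    return [t.strip().lower() for t in tags if t.strip()]


def train_naive_bayes(records):
    """Classification phase, training: count tags per category.

    records is a list of (tags, category) pairs.
    """
    class_counts = Counter()
    tag_counts = defaultdict(Counter)
    vocabulary = set()
    for tags, category in records:
        class_counts[category] += 1
        for tag in preprocess(tags):
            tag_counts[category][tag] += 1
            vocabulary.add(tag)
    return {"class_counts": class_counts,
            "tag_counts": tag_counts,
            "vocabulary": vocabulary}


def classify(model, tags):
    """Classification phase, prediction: pick the category with the highest log-probability."""
    total = sum(model["class_counts"].values())
    vocab_size = len(model["vocabulary"])
    best, best_score = None, -math.inf
    for category, count in model["class_counts"].items():
        score = math.log(count / total)  # log prior
        denom = sum(model["tag_counts"][category].values()) + vocab_size
        for tag in preprocess(tags):
            if tag in model["vocabulary"]:  # tags unseen in training are ignored
                # add-one (Laplace) smoothed likelihood
                score += math.log((model["tag_counts"][category][tag] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best


def accuracy(model, records):
    """Evaluation phase: fraction of records whose category is predicted correctly."""
    correct = sum(1 for tags, category in records if classify(model, tags) == category)
    return correct / len(records)


# Toy data, invented for illustration only (not taken from the thesis).
toy_train = [
    (["Python", "programming", "tutorial"], "computers"),
    (["java", "programming", "code"], "computers"),
    (["recipe", "cooking", "dinner"], "food"),
    (["baking", "recipe", "dessert"], "food"),
]
toy_test = [
    (["programming", "code"], "computers"),
    (["recipe", "dessert"], "food"),
]

model = train_naive_bayes(toy_train)
toy_accuracy = accuracy(model, toy_test)
```

Swapping the tag lists for title words would give the tags-versus-titles comparison mentioned in the abstract, under the same assumptions.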
Experiments have been conducted to evaluate the effectiveness of using social tags with each of the three text classification techniques in web object classification. The experimental results indicate that using tags significantly improves classification performance, and that the classification methods perform better with tags than with other web object features, such as titles. The results also indicate that the Support Vector Machine with tags performs best overall, and that Naïve Bayes with tags works well with a small percentage of training data.
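One way to picture an experiment over different training percentages is a hold-out evaluation repeated at several training fractions. The abstract does not state the exact protocol, so the sketch below is an assumption: `holdout_accuracy` and `fit_majority` are hypothetical names, the data are invented, and the majority-class learner is only a placeholder for a real classifier such as those listed above.

```python
import random
from collections import Counter


def holdout_accuracy(examples, train_fraction, fit, seed=0):
    """Shuffle, split, fit on the training part, and return accuracy on the rest.

    examples is a list of (features, label) pairs; fit takes the training
    pairs and returns a predict(features) function.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = examples[:]
    rng.shuffle(shuffled)
    cut = max(1, int(len(shuffled) * train_fraction))
    train, test = shuffled[:cut], shuffled[cut:]
    if not test:
        raise ValueError("train_fraction leaves no test examples")
    predict = fit(train)
    correct = sum(1 for features, label in test if predict(features) == label)
    return correct / len(test)


def fit_majority(train):
    """Placeholder learner: always predicts the most common training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


# Invented examples: (tags, category) pairs, for illustration only.
examples = [(["news", "politics"], "news")] * 8 + [(["recipe"], "food")] * 2

# Accuracy of the placeholder learner at several training fractions.
results = {fraction: holdout_accuracy(examples, fraction, fit_majority)
           for fraction in (0.1, 0.3, 0.5)}
```

Passing a different `fit` function for each learner and feature set would yield one accuracy curve per combination, which is the kind of comparison the results above summarize.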