Author: Zaher, Mahmoud Ahmed Mahmoud./ Title: Information system for plagiarism detection in electronic resources /


Title

Information system for plagiarism detection in electronic resources /

Author

Zaher, Mahmoud Ahmed Mahmoud.

Preparation committee

Researcher / محمود أحمد محمود زاهر

Supervisor / فرحات فرج فرحات

Supervisor / عبد العزيز إبراھيم عبد الخالق شھاب

Supervisor / حازم مختار البكري

Examiner / إبراهيم السيد إبراهيم كشك

Examiner / محمد السيد محمد حجاج

Subject

Computer science. Plagiarism. Information technology.

Publication date

2019.

Number of pages

143 p.

Language

English

Degree

Doctorate

Specialization

Information Systems

Publisher

Date of approval

01/01/2019

Place of approval

Mansoura University - Faculty of Computers and Information - Department of Information Systems

Index

Only 14 pages are available for public view


Abstract

Plagiarism is a growing problem, generally described in the literature as "literary theft" and "academic dishonesty", and one has to be well informed on the topic in order to prevent it and adhere to ethical principles. With the huge volume of information on the WWW and in digital libraries, plagiarism has become one of the most important issues for universities, schools, and research fields. The internet and advanced search engines make it easy for students to find documents and journals. Plagiarism is therefore a global problem that occurs in many different areas of life. Detecting plagiarism is a challenging task. Many language-sensitive tools for detecting plagiarism in natural language documents have been developed, particularly for English. Language-independent tools exist as well, but they are considered restrictive because they usually do not take specific language features into account. Detecting plagiarism in Arabic documents is particularly challenging because of the complex linguistic structure of Arabic. This thesis presents a plagiarism detection model, Abstract Syntax Tree Arabic Plagiarism (ASTAP), built upon a content-based method. It describes the model's main components, including its preprocessing stage and a heuristic algorithm for comparing documents at different logical levels (document, paragraph, and sentence levels). The model is evaluated experimentally on a large set of Arabic documents and compared in particular with Turnitin, a language-independent tool.
ASTAP is a plagiarism detection model for Arabic text-based documents and is considered a primary work dedicated to plagiarism detection in Arabic documents. Arabic is a morphologically rich language and ranks among the most widely used languages in the world and on the Internet. Given a document and a set of suspected files, the thesis aims to measure the similarity between Arabic source documents and target documents available on the internet by using an information system, and to compare the results with those of available tools, taking into account a complex environment, big data, and online text. It also aims to compute the originality value of the examined document. The originality value of a text is determined by computing the distance between each sentence in the text and the closest sentence in the suspected files, if one exists. The proposed model is built on a search engine in order to reduce the cost of pairwise similarity comparison.
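The originality computation described above can be illustrated with a minimal sketch. This is not the ASTAP implementation: the abstract does not name the distance measure, the sentence splitter, or how per-sentence distances are aggregated. The sketch assumes Jaccard distance over whitespace tokens, a naive punctuation-based splitter, and the mean of the per-sentence minimum distances as the originality value. The function names `split_sentences`, `jaccard_distance`, and `originality` are hypothetical.

```python
import re


def split_sentences(text):
    # Assumption: sentences end at '.', '!', '?', or the Arabic question mark.
    parts = re.split(r"[.!?\u061F]+", text)
    return [p.strip() for p in parts if p.strip()]


def jaccard_distance(a, b):
    # Assumption: Jaccard distance on whitespace tokens stands in for the
    # sentence distance, which the abstract leaves unspecified.
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a and not tokens_b:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def originality(document, suspected_files):
    # For each sentence, find the closest sentence in the suspected files,
    # then average those minimum distances (the averaging is an assumption).
    candidates = [s for f in suspected_files for s in split_sentences(f)]
    sentences = split_sentences(document)
    if not sentences or not candidates:
        return 1.0  # nothing to compare against: treated as fully original
    distances = [
        min(jaccard_distance(s, c) for c in candidates) for s in sentences
    ]
    return sum(distances) / len(distances)
```

Under these assumptions a value of 0.0 means every sentence has an identical counterpart in the suspected files, and 1.0 means no sentence shares a token with any of them. A real system for Arabic would also need the language-specific preprocessing the abstract mentions, which this sketch omits.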