Title
Transfer learning for natural language processing in low-resource scenarios
Publisher
Muhammad Emadeldien Ahmed Mahmoud Khalifa
Author
Muhammad Emadeldien Ahmed Mahmoud Khalifa
Preparation Committee
Supervisor / Muhammad Emadeldien Ahmed Mahmoud Khalifa
Supervisor / Hesham Ahmed Hassan
Supervisor / Aly Aly Fahmy
Examiner / Hesham Ahmed Hassan
Publication Date
2021
Number of Pages
102 leaves
Language
English
Degree
Master's
Specialization
Computer Science (miscellaneous)
Approval Date
2/10/2020
Place of Approval
Cairo University - Faculty of Computers and Information - Computer Science
Table of Contents
Only 14 of 120 pages are available for public view.

Abstract

Annotated data is necessary for supervised machine learning approaches. Unfortunately, data annotation is expensive, time-consuming, and requires domain expertise from the human labeler. Therefore, it is essential to develop methods that can operate in zero- and low-resource settings, i.e., with little or no labeled data for the target task. In this work, we propose two transfer learning approaches based on inductive and transductive transfer.

The inductive transfer approach leverages raw unlabeled data through pre-trained language models and obtains substantial performance gains on three natural language processing tasks, namely named entity recognition (NER), part-of-speech (POS) tagging, and sarcasm detection (SRD). However, the proposed inductive approach is suitable only when labeled data is available in the target language variety. Therefore, we shift our focus to zero- and low-resource settings, where the goal is to build models that generalize to completely unseen language varieties, and we frame our work in the scope of the Arabic language and three of its varieties (dialects), namely Egyptian, Gulf, and Levantine.

We then develop a transductive transfer approach that allows transferring knowledge between different Arabic varieties without the need for labeled examples in the target variety. The proposed transductive approach enables knowledge transfer from resource-rich language varieties to resource-poor ones and is based on self-training with unlabeled examples from the target language variety.
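The self-training procedure underlying the transductive approach can be sketched generically: train on the labeled source variety, pseudo-label unlabeled examples from the target variety, keep only confident predictions, and retrain. The sketch below is a minimal illustration of that loop, not the thesis's actual implementation; the toy threshold classifier, the `threshold` value, and the function names are all assumptions made for the example.

```python
def self_train(train_fn, predict_fn, labeled, unlabeled,
               threshold=0.9, rounds=3):
    """Generic self-training loop (illustrative sketch).

    train_fn(data) -> model, where data is a list of (x, y) pairs.
    predict_fn(model, x) -> (label, confidence in [0, 1]).
    Confident pseudo-labeled target examples are added to the
    training set on each round; the rest stay unlabeled.
    """
    data = list(labeled)
    for _ in range(rounds):
        model = train_fn(data)
        remaining = []
        for x in unlabeled:
            label, conf = predict_fn(model, x)
            if conf >= threshold:
                data.append((x, label))   # adopt confident pseudo-label
            else:
                remaining.append(x)       # revisit next round
        unlabeled = remaining
    return train_fn(data)

# Toy 1-D classifier standing in for the real model: the "model" is a
# midpoint between class means; confidence grows with distance from it.
def train_fn(data):
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def predict_fn(mid, x):
    label = 1 if x >= mid else 0
    conf = min(1.0, abs(x - mid) / 5)
    return label, conf

# "Source variety" = labeled points; "target variety" = unlabeled points.
labeled = [(0, 0), (1, 0), (9, 1), (10, 1)]
unlabeled = [2, 8, -3, 12]
final = self_train(train_fn, predict_fn, labeled, unlabeled)
```

The confidence threshold is the key design choice: set too low, the model reinforces its own mistakes on the unfamiliar variety; set too high, few target examples are ever adopted and the transfer stalls.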