Arabic Named Entity Recognition via Deep Co-learning

Journal: Artificial Intelligence Review

Authors: Chadi Helwe, Shady Elbassuoni

Abstract:

Named entity recognition (NER) is an important natural language processing (NLP) task with many applications. We tackle the problem of Arabic NER using deep learning based on Arabic word embeddings that capture syntactic and semantic relationships between words. Deep learning has been shown to perform significantly better than other approaches for various NLP tasks, including NER. However, deep-learning models also require a large amount of training data, which is scarce for the Arabic language. To remedy this, we adapt the semi-supervised co-training approach to the realm of deep learning, which we refer to as deep co-learning. Our deep co-learning approach makes use of a small amount of labeled data, which is augmented with partially labeled data that is automatically generated from Wikipedia. Our approach relies only on word embeddings as features and does not involve any additional feature engineering. Nonetheless, when tested on three different Arabic NER benchmarks, our approach consistently outperforms state-of-the-art Arabic NER approaches, including ones that employ carefully crafted NLP features. It also consistently outperforms various baselines, including purely supervised deep-learning approaches as well as semi-supervised ones that make use of only unlabeled data, such as self-learning and the traditional co-training approach.
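The general co-training idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' deep co-learning implementation: the toy `MajorityTagger` stands in for a neural tagger, and the names `co_train`, `rounds`, `threshold`, and the sample words are all assumptions made for illustration. The sketch shows only the generic loop in which two models, trained on different labeled sets, pass their confident predictions on unlabeled words to each other.

```python
from collections import Counter, defaultdict


class MajorityTagger:
    """Toy stand-in for a neural tagger: predicts a word's most frequent label."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        # Retrain from scratch on (word, label) pairs.
        self.counts = defaultdict(Counter)
        for word, label in examples:
            self.counts[word][label] += 1
        return self

    def predict(self, word):
        """Return (label, confidence); unseen words get ("O", 0.0)."""
        if word not in self.counts:
            return "O", 0.0
        label, freq = self.counts[word].most_common(1)[0]
        return label, freq / sum(self.counts[word].values())


def co_train(labeled_a, labeled_b, unlabeled, rounds=2, threshold=0.9):
    """Generic co-training: each model teaches the other its confident labels."""
    train_a, train_b = list(labeled_a), list(labeled_b)
    pool = list(unlabeled)
    model_a = MajorityTagger().fit(train_a)
    model_b = MajorityTagger().fit(train_b)
    for _ in range(rounds):
        remaining = []
        for word in pool:
            label_a, conf_a = model_a.predict(word)
            label_b, conf_b = model_b.predict(word)
            if conf_a >= threshold and conf_a >= conf_b:
                train_b.append((word, label_a))  # model A teaches model B
            elif conf_b >= threshold:
                train_a.append((word, label_b))  # model B teaches model A
            else:
                remaining.append(word)  # neither model is confident yet
        pool = remaining
        # Retrain both models on their enlarged training sets.
        model_a.fit(train_a)
        model_b.fit(train_b)
    return model_a, model_b


# Hypothetical toy data: each model starts with a different labeled set.
model_a, model_b = co_train(
    labeled_a=[("Beirut", "LOC"), ("in", "O")],
    labeled_b=[("Shady", "PER"), ("in", "O")],
    unlabeled=["Beirut", "Shady", "Paris"],
)
print(model_b.predict("Beirut"))
print(model_a.predict("Shady"))
```

In this toy run each model learns one entity from the other, while a word that neither model has seen stays unlabeled. A real system would replace the lookup tagger with sequence models over word embeddings and would score whole sentences rather than isolated words.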

