• Publications
Arabic Offensive Language on Twitter: Analysis and Experiments
TLDR
We introduce a method for building a large Arabic offensive tweet dataset that is not biased by topic, dialect, or target.
  • 56
  • 17
Improved Spelling Error Detection and Correction for Arabic
TLDR
A spelling error detection and correction application is based on three main components: a dictionary (or reference word list), an error model and a language model.
  • 35
  • 3
Multilingual Code-switching Identification via LSTM Recurrent Neural Networks
TLDR
This paper describes the HHU-UH-G system for language identification in code-switched tweets for both Spanish-English and MSA-Egyptian dialect.
  • 38
  • 3
CogALex-V Shared Task: GHHH - Detecting Semantic Relations via Word Embeddings
TLDR
This paper describes our system submission to the CogALex-2016 Shared Task on Corpus-Based Identification of Semantic Relations.
  • 13
  • 3
Highly Effective Arabic Diacritization using Sequence to Sequence Modeling
TLDR
We present a unified character-level sequence-to-sequence deep learning model that recovers both types of diacritics without the use of explicit feature engineering.
  • 12
  • 3
Arabic Word Generation and Modelling for Spell Checking
TLDR
Arabic is a language known for its rich and complex morphology.
  • 38
  • 2
Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks
TLDR
This paper describes a language-independent model for multi-class sentiment analysis using a simple neural network architecture of five layers (Embedding, Conv1D, GlobalMaxPooling and FullyConnected).
  • 18
  • 2
Learning from Relatives: Unified Dialectal Arabic Segmentation
TLDR
We build a unified segmentation model where the training data for different dialects are combined and a single model is trained.
  • 16
  • 2
Multi-Dialect Arabic POS Tagging: A CRF Approach
TLDR
We use a Conditional Random Fields (CRF) sequence labeler to train POS taggers for each dialect and examine the effect of cross-dialect and joint-dialect training.
  • 17
  • 2
SAWT: Sequence Annotation Web Tool
TLDR
We present SAWT, a web-based tool for the annotation of token sequences with an arbitrary set of labels.
  • 6
  • 2