• Publications
Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of Claims using Transformer-based Models
We introduce the strategies used by the Accenture Team for the CLEF2020 CheckThat! Lab, Task 1, on English and Arabic. This shared task evaluated whether a claim in social media text should be
The IUCL+ System: Word-Level Language Identification via Extended Markov Models
The IUCL+ system combines character n-gram probabilities, lexical probabilities, word label transition probabilities and existing named entity recognition tools within a Markov model framework that weights these components and assigns a label.
Typing Race Games as a Method to Create Spelling Error Corpora
This paper presents a method to elicit spelling error corpora using an online typing race game, and compares the methodology against three existing spelling corpora for English.
Learning Arabic morphology using statistical constraint-satisfaction models
This paper describes machine learning models of unsupervised morphology acquisition that use a constraint satisfaction approach and statistical evaluation.
Processing highly variant language using incremental model selection
This dissertation provides an architecture for NLP that allows for better handling of complicated language variation and finds that segmenting language before tagging, and then applying single-language homogeneous language models, is competitive with multilingual heterogeneous tagging models.
ArCADE: An Arabic Corpus of Auditory Dictation Errors
We present a new corpus of word-level listening errors collected from 62 native English speakers learning Arabic designed to inform models of spell checking for this learner population. While we use
On Induction of Morphology Grammars and its Role in Bootstrapping
This paper shows how unsupervised hypothesis generation with ABL algorithms can be used to induce a lexicon and morphological rules for various types of languages, e.
A Random Forest System Combination Approach for Error Detection in Digital Dictionaries
This work investigates automating the process of detecting errors in an XML representation of a digitized print dictionary using a hybrid approach that combines rule-based, feature-based, and language model-based methods.
Effects of listener experience with foreign accent on perception of accentedness and speaker age
The current study examined the effects of foreign accent and listener experience on the perception of a speaker’s age and native language. Ten audio stimuli were prepared from the recording of five
Spelling Correction for Dialectal Arabic Dictionary Lookup
The “Did You Mean...?” system, described in this article, is a spelling corrector for Arabic that is designed specifically for L2 learners of dialectal Arabic in the context of dictionary lookup. The