Measuring Bias in Contextualized Word Representations
TLDR
We propose a template-based method to quantify bias in BERT and show that it captures social biases more consistently than the traditional cosine-based method.
  • 51
  • 5
Quantifying Social Biases in Contextual Word Representations
TLDR
We propose a template-based method to quantify bias in BERT and expose gender bias in a downstream task of Gender Pronoun Resolution.
  • 10
  • 1
Creating a Translation Matrix of the Bible's Names Across 591 Languages
TLDR
We develop and release a novel resource of 1129 aligned Bible person and place names across 591 languages, which was constructed and improved using several approaches including weighted edit distance, machine-translation-based transliteration models, and affixal induction and transformation models.
  • 5
  • 1
Learning Soft Labels via Meta Learning
TLDR
We introduce a meta-learning framework in which the labels of the training set are treated as learnable parameters and optimized along with the model parameters.
  • 1
  • 1
Low-resource Cross-lingual Event Type Detection via Distant Supervision with Minimal Effort
TLDR
The use of machine learning for NLP generally requires resources for training.
  • 8
The ARIEL-CMU Systems for LoReHLT18
TLDR
This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for three tasks: Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech.
  • 5
Learning Disentangled Representation in Latent Stochastic Models: A Case Study with Image Captioning
TLDR
We present an approach to training latent stochastic encoder-decoder models by incorporating a joint continuous and discrete representation in the prior distribution.
  • 2
Domain Adaptation of SRL Systems for Biological Processes
TLDR
In this paper, we propose two approaches for domain adaptation in the biological domain that involve pre-training an LSTM-CRF on existing large-scale datasets and adapting it to a low-resource corpus of biological processes.
  • 2
Understanding the Needs of Prospective Tenants
TLDR
We explore the complexities of the information economy around rental housing, including information available to landlords but not tenants, as well as information about their own desires that tenants may not prioritize until after signing a lease.
  • 2