• Publications
Olá, Bonjour, Salve! XFORMAL: A Benchmark for Multilingual Formality Style Transfer
TLDR
Results on XFORMAL suggest that state-of-the-art style transfer approaches perform close to simple baselines, indicating that style transfer becomes even more challenging in multilingual settings.
The University of Maryland’s Kazakh-English Neural Machine Translation System at WMT19
TLDR
The University of Maryland’s submission to the WMT 2019 Kazakh-English news translation task is described; it improves over a Kazakh-only baseline by +5.45 BLEU on newstest2019.
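To make a gain like +5.45 BLEU concrete, here is a minimal sketch of how such a delta is measured with the sacrebleu library, the standard scorer for WMT news test sets; the hypothesis and reference strings below are toy stand-ins, not actual system output.

```python
# Minimal sketch: measuring a BLEU improvement between two systems with sacrebleu.
# The sentences are toy stand-ins for real newstest2019 output and references.
import sacrebleu

baseline_out = ["the parliament approved the law yesterday evening"]
system_out = ["parliament approved the law yesterday evening"]
references = [["parliament approved the law yesterday evening"]]  # one reference stream

baseline_bleu = sacrebleu.corpus_bleu(baseline_out, references).score
system_bleu = sacrebleu.corpus_bleu(system_out, references).score
print(f"{system_bleu - baseline_bleu:+.2f} BLEU")  # deltas are reported as, e.g., +5.45
```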
A Review of Human Evaluation for Style Transfer
TLDR
It is found that protocols for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods.
Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank
TLDR
A training strategy for multilingual BERT models is introduced that learns to rank synthetic divergent examples of varying granularity, helping detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model.
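The core of the ranking strategy can be illustrated with a short sketch: a multilingual BERT encoder scores a sentence pair, and a margin ranking loss pushes an original pair above a synthetically perturbed, more divergent version of it. The scoring head, margin value, and perturbed example are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: fine-tune multilingual BERT so an equivalent sentence pair scores higher
# than a synthetic divergent pair derived from it (assumed margin and perturbation).
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
scorer = nn.Linear(encoder.config.hidden_size, 1)  # maps [CLS] state to a score

def score(src: str, tgt: str) -> torch.Tensor:
    """Score a cross-lingual sentence pair; higher = more equivalent."""
    batch = tokenizer(src, tgt, return_tensors="pt", truncation=True)
    cls = encoder(**batch).last_hidden_state[:, 0]  # [CLS] vector for the pair
    return scorer(cls).squeeze(-1)

loss_fn = nn.MarginRankingLoss(margin=1.0)  # margin is an assumed hyperparameter

s_equiv = score("The cat sat on the mat.", "Le chat était assis sur le tapis.")
# Hypothetical synthetic divergence: a substituted phrase on the French side.
s_diverg = score("The cat sat on the mat.", "Le chat était assis sur le canapé rouge.")
loss = loss_fn(s_equiv, s_diverg, torch.ones(1))  # target=1: rank the first pair higher
loss.backward()
```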
Mixture of Topic-Based Distributional Semantic and Affective Models
TLDR
A semantic mixture model is proposed that combines word similarity scores estimated across multiple topic-specific DSMs (TDSMs), achieving state-of-the-art results for semantic similarity estimation and sentence-level polarity detection.
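As a sketch of the mixture idea, the similarity of a word pair can be taken as a weighted combination of cosine similarities computed in each topic-specific embedding space; the toy vectors and uniform weights below are illustrative assumptions, not the paper's learned combination.

```python
# Sketch: word-pair similarity as a weighted mixture of per-topic cosine similarities.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def mixture_similarity(w1, w2, topic_models, topic_weights):
    """Combine similarity scores across topic-specific DSMs (TDSMs)."""
    return sum(
        weight * cosine(model[w1], model[w2])
        for model, weight in zip(topic_models, topic_weights)
    )

# Two toy topic-specific spaces (e.g., "nature" vs. "finance") over the same words.
nature = {"bank": np.array([0.9, 0.1]), "river": np.array([0.8, 0.3])}
finance = {"bank": np.array([0.1, 0.9]), "river": np.array([0.7, 0.2])}
print(mixture_similarity("bank", "river", [nature, finance], [0.5, 0.5]))
```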
Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation
TLDR
A divergence-aware NMT framework is introduced that uses factors to help NMT recover from the degradation caused by naturally occurring divergences, improving both translation quality and model calibration on EN-FR tasks.
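A rough sketch of the factor mechanism: each source token carries a divergence tag, and its input embedding is the concatenation of a word embedding and a small factor embedding. The two-label tag scheme, vocabulary sizes, and dimensions below are illustrative assumptions, not the paper's configuration.

```python
# Sketch: source factors as concatenated embeddings (assumed 0 = equivalent,
# 1 = divergent span), feeding a standard NMT encoder.
import torch
from torch import nn

class FactoredEmbedding(nn.Module):
    def __init__(self, vocab_size=32000, word_dim=508, n_factors=2, factor_dim=4):
        super().__init__()
        self.word = nn.Embedding(vocab_size, word_dim)
        self.factor = nn.Embedding(n_factors, factor_dim)

    def forward(self, token_ids, factor_ids):
        # Concatenate so the encoder sees word identity plus its divergence tag.
        return torch.cat([self.word(token_ids), self.factor(factor_ids)], dim=-1)

emb = FactoredEmbedding()
tokens = torch.tensor([[17, 256, 9]])  # a toy source sentence
tags = torch.tensor([[0, 1, 0]])       # the middle token falls in a divergent span
print(emb(tokens, tags).shape)         # torch.Size([1, 3, 512]) -> encoder input
```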
Cross-Topic Distributional Semantic Representations Via Unsupervised Mappings
TLDR
This work proposes a DSM that learns multiple distributional representations of a word based on different topics, and evaluation on downstream NLP tasks shows that multiple topic-based embeddings outperform single-prototype models.
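One way to picture the mapping step is as a linear alignment between two topic-specific spaces over their shared vocabulary. The closed-form orthogonal Procrustes solution below is a standard alignment technique, used here as an assumed stand-in for the paper's exact unsupervised mapping; the matrices are toy data.

```python
# Sketch: align one topic-specific embedding space to another with an orthogonal map
# solved in closed form (Procrustes) over row-aligned shared-vocabulary matrices.
import numpy as np

def procrustes_map(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Find orthogonal W minimizing ||XW - Y||_F for row-aligned vocabularies."""
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt

rng = np.random.default_rng(0)
topic_a = rng.normal(size=(1000, 50))                 # shared words in topic A's space
rotation = np.linalg.qr(rng.normal(size=(50, 50)))[0] # a hidden orthogonal map
topic_b = topic_a @ rotation                          # same words in topic B's space

W = procrustes_map(topic_a, topic_b)
print(np.allclose(topic_a @ W, topic_b))  # True: the mapping recovers the rotation
```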