Carnegie Mellon University
Style Transfer Through Back-Translation
A latent representation of the input sentence is learned, grounded in a language translation model, to better preserve the meaning of the sentence while reducing its stylistic properties; adversarial generation techniques are then used to make the output match the desired style.
Metaphor Detection with Cross-Lingual Model Transfer
We show that it is possible to reliably discriminate whether a syntactic construction is meant literally or metaphorically using lexical semantic features of the words that participate in the…
Massively Multilingual Word Embeddings
- Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
- Computer Science, ArXiv
- 5 February 2016
New methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space are introduced and a new evaluation method is shown to correlate better than previous ones with two downstream tasks.
Measuring Bias in Contextualized Word Representations
- Keita Kurita, Nidhi Vyas, Ayush Pareek, A. Black, Yulia Tsvetkov
- Computer Science, Proceedings of the First Workshop on Gender Bias…
- 18 June 2019
A template-based method to quantify bias in BERT is proposed and shown to capture social biases more consistently than the traditional cosine-based method.
Sparse Overcomplete Word Vector Representations
- Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah A. Smith
- Computer Science, ACL
- 5 June 2015
This work proposes methods that transform word vectors into sparse (and optionally binary) vectors, which are more similar to the interpretable features typically used in NLP, though they are discovered automatically from raw corpora.
Morphological Inflection Generation Using Character Sequence to Sequence Learning
This work models inflection generation as a character sequence-to-sequence learning problem and presents a variant of the neural encoder-decoder model for solving it, which is language independent and can be trained in both supervised and semi-supervised settings.
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks
This work suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
Balancing Training for Multilingual Neural Machine Translation
Experiments show that the proposed method not only consistently outperforms heuristic baselines in terms of average performance, but also offers flexible control over which languages' performance is optimized.
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings
This work proposes a method to debias word embeddings in multiclass settings such as race and religion, extending the work of Bolukbasi et al. (2016) beyond binary settings such as binary gender.
Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies
- Anjalie Field, Doron Kliger, S. Wintner, Jennifer Pan, Dan Jurafsky, Yulia Tsvetkov
- Computer Science, EMNLP
- 1 August 2018
This work introduces embedding-based methods for cross-lingually projecting English frames to Russian, and offers new ways to identify subtle media manipulation strategies at the intersection of agenda-setting and framing.