Neural Architectures for Named Entity Recognition
- Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
- 4 March 2016
Paper presented at the 2016 Conference of the North American Chapter of the Association for Computational Linguistics, held in San Diego (CA, USA), 12–17 June 2016.
Cross-lingual Language Model Pretraining
This work proposes two methods to learn cross-lingual language models (XLMs): one unsupervised, relying only on monolingual data, and one supervised, leveraging parallel data with a new cross-lingual language model objective.
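The unsupervised objective is BERT-style masked language modeling; the supervised (TLM) objective applies the same masking to concatenated parallel sentence pairs. A minimal sketch of that masking step, where the vocabulary and probabilities shown are illustrative defaults rather than the paper's exact configuration:

```python
import random

MASK = "[MASK]"
VOCAB = ["le", "chat", "the", "cat", "sat", "dort"]  # toy vocabulary

def mlm_mask(tokens, p=0.15, seed=0):
    """BERT-style masking (simplified): each position is selected with
    probability p; a selected token is replaced by [MASK] 80% of the time,
    by a random vocabulary token 10%, and left unchanged 10%. The original
    token becomes the prediction target at selected positions."""
    rng = random.Random(seed)
    inputs, targets = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < p:
            targets[i] = tok
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK
            elif r < 0.9:
                inputs[i] = rng.choice(VOCAB)
    return inputs, targets
```

At training time the model is asked to predict `targets[i]` wherever it is not `None`, given the corrupted `inputs`.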
Word Translation Without Parallel Data
- Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
- Computer Science, ICLR
- 11 October 2017
It is shown that a bilingual dictionary can be built between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
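The refinement step at the core of this approach is orthogonal Procrustes alignment: given a seed dictionary (which the paper induces without supervision, via adversarial training), the best orthogonal map between the two embedding spaces has a closed-form SVD solution. A sketch on toy data standing in for real word embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins for monolingual word embeddings: Y is X under an unknown
# rotation, as if the two languages' spaces were perfectly isometric.
X = rng.standard_normal((100, 50))                        # "source" vectors
true_W, _ = np.linalg.qr(rng.standard_normal((50, 50)))   # unknown rotation
Y = X @ true_W.T                                          # "target" vectors

# Orthogonal Procrustes: argmin over orthogonal W of ||X W^T - Y||_F
# is W = U V^T, where U S V^T is the SVD of Y^T X.
U, _, Vt = np.linalg.svd(Y.T @ X)
W = U @ Vt

print(float(np.abs(W - true_W).max()))  # near zero: the rotation is recovered
```

Real embedding spaces are only approximately isometric, so in practice the recovered map is imperfect and translation is done by nearest-neighbor retrieval in the aligned space.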
XNLI: Evaluating Cross-lingual Sentence Representations
This work constructs an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 15 languages, including low-resource languages such as Swahili and Urdu. It finds that XNLI represents a practical and challenging evaluation suite, and that directly translating the test data yields the best performance among the available baselines.
Unsupervised Machine Translation Using Monolingual Corpora Only
This work proposes a model that takes sentences from monolingual corpora in two different languages and maps them into the same latent space and effectively learns to translate without using any labeled data.
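Part of the training objective is a denoising autoencoder over each language, and the paper's noise model combines word dropout with a local shuffle that keeps each word within k positions of its origin. A minimal sketch, with parameter defaults chosen here for illustration:

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_k=3, seed=0):
    """Noise model for the denoising objective (simplified): drop each word
    with probability drop_prob, then locally shuffle the survivors by adding
    uniform noise in [0, shuffle_k] to each index and sorting, which
    guarantees no word moves more than shuffle_k positions."""
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() >= drop_prob]
    keys = [i + rng.uniform(0, shuffle_k) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]
```

The autoencoder is then trained to reconstruct the clean sentence from its noised version, which forces the encoder to learn more than a trivial copy.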
Phrase-Based & Neural Unsupervised Machine Translation
- Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
- Computer Science, EMNLP
- 20 April 2018
This work investigates how to learn to translate with access to only large monolingual corpora in each language, and proposes two model variants, one neural and one phrase-based, which are significantly better than methods from the literature while being simpler and having fewer hyper-parameters.
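The training loop shared by both variants is iterative back-translation: each direction's current model generates synthetic parallel data for training the reverse direction. A schematic sketch, where `translate_s2t`, `translate_t2s`, and `train_step` are hypothetical stand-ins for the actual neural or phrase-based components:

```python
def back_translation_round(mono_src, mono_tgt,
                           translate_s2t, translate_t2s, train_step):
    """One round of iterative back-translation: translating monolingual text
    with the current models yields (synthetic input, gold output) pairs that
    supervise the opposite direction."""
    pairs_for_s2t = [(translate_t2s(t), t) for t in mono_tgt]  # noisy src -> gold tgt
    pairs_for_t2s = [(translate_s2t(s), s) for s in mono_src]  # noisy tgt -> gold src
    train_step(pairs_for_s2t, pairs_for_t2s)
```

Repeating this round improves both directions jointly: as each model gets better, the synthetic data it produces for the other becomes cleaner.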
What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties
- Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
- Computer Science, ACL
- 3 May 2018
10 probing tasks designed to capture simple linguistic features of sentences are introduced and used to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods.
Fader Networks: Manipulating Images by Sliding Attributes
- Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
- Computer Science, NIPS
- 1 June 2017
This work introduces a new encoder-decoder architecture trained to reconstruct images by disentangling, directly in the latent space, the salient information of the image from the values of its attributes, which results in much simpler training schemes and scales nicely to multiple attributes.
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English
This work introduces the FLORES evaluation datasets for Nepali–English and Sinhala–English, based on sentences translated from Wikipedia, and demonstrates that current state-of-the-art methods perform rather poorly on this benchmark, posing a challenge to the research community working on low-resource MT.
Multiple-Attribute Text Rewriting
- Guillaume Lample, Sandeep Subramanian, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau
- Computer Science, ICLR
- 27 September 2018
This paper proposes a new model that controls several factors of variation in textual data, replacing the usual disentanglement condition with a simpler mechanism based on back-translation, and demonstrates that the fully entangled model produces better generations.