• Publications
Multilingual Universal Sentence Encoder for Semantic Retrieval
TLDR
On transfer learning tasks, the multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.
Character-Level Language Modeling with Deeper Self-Attention
TLDR
This paper shows that a deep (64-layer) transformer model with fixed context outperforms RNN variants by a large margin, achieving state of the art on two popular benchmarks: 1.13 bits per character on text8 and 1.06 on enwik8.
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
TLDR
The embedding models are trained to produce similar representations exclusively for bilingual sentence pairs that are translations of each other using a novel training method that introduces hard negatives consisting of sentences that are not translations but have some degree of semantic similarity.
Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
TLDR
An approach to learn multilingual sentence embeddings using a bi-directional dual-encoder with additive margin softmax is presented, able to achieve state-of-the-art results on the United Nations (UN) parallel corpus retrieval task.
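The additive-margin softmax idea in the entry above can be illustrated with a small sketch. This is not the paper's implementation; the function names, the margin value, and the use of in-batch negatives are illustrative assumptions. The core idea: subtract a margin from the similarity of each true translation pair before a softmax over in-batch candidates, applied in both retrieval directions.

```python
import numpy as np

def additive_margin_scores(x_emb, y_emb, margin=0.3):
    """Illustrative sketch: cosine-similarity matrix for a batch of
    L2-normalized sentence embeddings (row i of x_emb and y_emb are a
    translation pair), with the margin subtracted on the diagonal
    (the positive pairs) to make them harder to score highly."""
    scores = x_emb @ y_emb.T
    return scores - margin * np.eye(scores.shape[0])

def bidirectional_loss(x_emb, y_emb, margin=0.3):
    """Softmax cross-entropy in both directions (source->target and
    target->source), treating non-matching in-batch rows as negatives."""
    s = additive_margin_scores(x_emb, y_emb, margin)

    def xent(logits):
        logits = logits - logits.max(axis=1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))  # positives lie on the diagonal

    return xent(s) + xent(s.T)
```

Subtracting the margin only from positive pairs forces translations to beat all in-batch negatives by at least that margin, which tightens the retrieval decision boundary.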
MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
TLDR
This dataset paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets, and explores systematic retrieval-based evaluation and transfer learning across domains over these datasets using a number of strong baselines.
Hierarchical Document Encoder for Parallel Corpus Mining
TLDR
The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained hierarchically at the document-level are more effective on noisy data.
Wiki-40B: Multilingual Language Model Dataset
TLDR
A new multilingual language model benchmark that is composed of 40+ languages spanning several scripts and linguistic families with around 40 billion characters is proposed, and the task of multilingual causal language modeling is introduced.
Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation
TLDR
A supervised data mining method using an accurate early fusion model to improve the training of an efficient late fusion retrieval model that significantly outperforms retrieval models directly trained with gold annotations on Precision at N (P@N) and Mean Reciprocal Rank (MRR).
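The two retrieval metrics named in the entry above, P@N and MRR, are standard and easy to state concretely. A minimal sketch (function names are my own, not from the paper):

```python
def precision_at_n(ranked_ids, gold_id, n):
    """For a single query: 1.0 if the gold answer appears in the
    top-n ranked candidates, else 0.0. P@N for a dataset is the
    mean of this value over all queries."""
    return 1.0 if gold_id in ranked_ids[:n] else 0.0

def mean_reciprocal_rank(all_ranked, all_gold):
    """Mean over queries of 1/rank of the first gold answer
    (contributing 0 when the gold answer is not retrieved)."""
    total = 0.0
    for ranked, gold in zip(all_ranked, all_gold):
        rr = 0.0
        for rank, doc in enumerate(ranked, start=1):
            if doc == gold:
                rr = 1.0 / rank
                break
        total += rr
    return total / len(all_gold)
```

For example, if one query ranks its gold answer first and another ranks it second, MRR is (1 + 1/2) / 2 = 0.75.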
TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling
TLDR
This work adapts T5 (Raffel et al., 2020), a strong pretrained text-to-text model, to extract a style vector from text and use it to condition the decoder to perform style transfer, and recast transfers as “targeted restyling” vector operations that adjust specific attributes of the input while preserving others.
TextSETTR: Label-Free Text Style Extraction and Tunable Targeted Restyling
TLDR
This work shows that T5 (Raffel et al., 2020), a strong pretrained text-to-text model, can be adapted to extract a style vector from arbitrary text and use this vector to condition the decoder to perform style transfer, and recast transfers as "targeted restyling" vector operations that adjust specific attributes of the input text while preserving others.