On the Sentence Embeddings from Pre-trained Language Models
@inproceedings{Li2020OnTS,
  title     = {On the Sentence Embeddings from Pre-trained Language Models},
  author    = {Bohan Li and Hao Zhou and Junxian He and Mingxuan Wang and Yiming Yang and Lei Li},
  booktitle = {EMNLP},
  year      = {2020}
}
Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, sentence embeddings from pre-trained language models without fine-tuning have been found to poorly capture the semantic meaning of sentences. In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task…
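For context, a minimal sketch of the baseline setting the abstract refers to: sentence embeddings taken directly from BERT without fine-tuning and compared with cosine similarity. The model name, the mask-aware average pooling, and the example sentences are illustrative assumptions (using the HuggingFace transformers and PyTorch packages), not the paper's proposed method.

import torch
from transformers import AutoModel, AutoTokenizer

# Off-the-shelf BERT, no fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def sentence_embeddings(sentences):
    """Mask-aware average pooling of the last-layer token representations."""
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state        # (batch, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1).float() # ignore padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

emb = sentence_embeddings(["A man is playing a guitar.",
                           "Someone plays an instrument."])
score = torch.nn.functional.cosine_similarity(emb[0], emb[1], dim=0)
print(f"cosine similarity: {score.item():.3f}")

It is this kind of un-tuned embedding-plus-cosine-similarity pipeline that the paper argues underuses the semantic information already present in BERT.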
5 Citations
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining (ArXiv, 2021)
- BERT Goes Shopping: Comparing Distributional Models for Product Representations (ArXiv, 2020)
- How to Learn when Data Reacts to Your Model: Performative Gradient Descent (ArXiv, 2021)
- The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes (ArXiv, 2020)