On the Sentence Embeddings from Pre-trained Language Models

@inproceedings{li2020sentence,
  title={On the Sentence Embeddings from Pre-trained Language Models},
  author={Bohan Li and Hao Zhou and Junxian He and Mingxuan Wang and Yiming Yang and Lei Li},
  booktitle={EMNLP},
  year={2020}
}

  • Published in EMNLP 2020
  • Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, sentence embeddings from pre-trained language models without fine-tuning have been found to poorly capture the semantic meaning of sentences. In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task…
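The abstract contrasts the masked language model objective with the semantic similarity task. As a minimal sketch of how that task is typically scored, the snippet below computes cosine similarity between two fixed sentence embeddings; the vectors here are illustrative placeholders, not actual BERT outputs.

```python
import math

def cosine_similarity(u, v):
    # Cosine similarity: the standard score for comparing sentence
    # embeddings on semantic similarity benchmarks.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Placeholder embeddings; in practice these might be, e.g., averaged
# token vectors from a pre-trained encoder.
emb_a = [0.2, 0.1, 0.4]
emb_b = [0.1, 0.3, 0.2]
print(cosine_similarity(emb_a, emb_b))
```

A model whose embeddings poorly capture semantics will assign similar cosine scores to paraphrases and unrelated sentence pairs alike, which is the failure mode the paper investigates.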
    5 Citations

    COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
    BERT Goes Shopping: Comparing Distributional Models for Product Representations
    How to Learn when Data Reacts to Your Model: Performative Gradient Descent
    The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes
    Localized Calibration: Metrics and Recalibration

