DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings

@article{Liu2021DialogueCSEDC,
  title={DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings},
  author={Che Liu and Rui Wang and Jinghua Liu and Jian Sun and Fei Huang and Luo Si},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.12599}
}
  • Che Liu, Rui Wang, Jinghua Liu, Jian Sun, Fei Huang, Luo Si
  • Published 26 September 2021
  • Computer Science
  • ArXiv
Learning sentence embeddings from dialogues has drawn increasing attention due to its low annotation cost and high domain adaptability. Conventional approaches employ the siamese network for this task, obtaining sentence embeddings by modeling context-response semantic relevance with a feed-forward network applied on top of the sentence encoders. However, as semantic textual similarity is commonly measured through element-wise distance metrics (e.g. cosine and L2…
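
To make the mismatch the abstract points at concrete, here is a minimal sketch (not the paper's code; all names and dimensions are illustrative assumptions) contrasting the two scoring schemes: the element-wise cosine metric used at evaluation time versus a siamese feed-forward matching head used during training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 768  # assumed encoder output size

# (a) Element-wise distance metric: how sentence embeddings are
# typically *evaluated* on semantic textual similarity.
def cosine_score(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    return F.cosine_similarity(u, v, dim=-1)

# (b) Siamese matching head: a feed-forward network on top of the two
# encoder outputs models context-response relevance during *training*,
# so the learned embeddings are not optimized for cosine distance itself.
matching_head = nn.Sequential(
    nn.Linear(4 * dim, dim),  # [u; v; |u - v|; u * v] is a common input scheme
    nn.ReLU(),
    nn.Linear(dim, 1),
)

def ffn_score(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    features = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
    return matching_head(features).squeeze(-1)

u, v = torch.randn(2, dim), torch.randn(2, dim)
print(cosine_score(u, v), ffn_score(u, v))
```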

Citations

Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings

Analysis shows that dial2vec obtains informative and discriminative embeddings for both interlocutors under the guidance of the conversational interactions and achieves the best performance when aggregating them through the interlocutor-level pooling strategy.
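
A minimal sketch of what interlocutor-level pooling could look like; the exact aggregation in dial2vec may differ, and every name below is an assumption for illustration.

```python
import numpy as np

def interlocutor_pooling(utt_embs: np.ndarray, speakers: list[str]) -> np.ndarray:
    """Mean-pool utterance embeddings per speaker, then average the
    per-speaker vectors into a single dialogue embedding."""
    per_speaker = [
        utt_embs[[i for i, s in enumerate(speakers) if s == spk]].mean(axis=0)
        for spk in sorted(set(speakers))
    ]
    return np.mean(per_speaker, axis=0)

embs = np.random.randn(4, 8)  # 4 utterances, 8-dim embeddings (toy sizes)
print(interlocutor_pooling(embs, ["A", "B", "A", "B"]).shape)  # (8,)
```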

Imagination is All You Need! Curved Contrastive Learning for Abstract Sequence Modeling Utilized on Long Short-Term Dialogue Planning

A novel technique for generating semantically meaningful, conversational-graph curved utterance embeddings that can be compared using cosine similarity, and a demonstration of how these forward-entailing language representations can be used to assess the likelihood of sequences by their entailment strength.

Induce Spoken Dialog Intents via Deep Unsupervised Context Contrastive Clustering

This work first transforms pretrained LMs into conversational encoders with in-domain dialogs, then conducts context-aware contrastive learning to reveal latent intent semantics via the coherence from dialog contexts, and proposes a novel clustering method to iteratively refine the representation.

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

Experiments on 7 semantic textual similarity benchmarks reveal that models trained with the additional non-linguistic (images/audio) contrastive objective lead to higher quality sentence embeddings, indicating that Transformer models are able to generalize better by doing a similar task in a multi-task fashion.

Incorporating the Rhetoric of Scientific Language into Sentence Embeddings using Phrase-guided Distant Supervision and Metric Learning

This work uses an existing academic phrase database to label sentences automatically with their functions and trains an embedding model to capture similarities and dissimilarities from a rhetorical perspective, demonstrating that the embeddings obtained are more advantageous than existing models when retrieving functionally similar sentences.

Learning Interpretable Latent Dialogue Actions With Less Supervision

This work presents a novel architecture for explainable modeling of task-oriented dialogues with discrete latent variables to represent dialogue actions based on variational recurrent neural networks and proposes a way to measure dialogue success without the need for expert annotation.

Contrastive Data and Learning for Natural Language Processing

This tutorial intends to help researchers in the NLP and computational linguistics community to understand this emerging topic and promote future research directions of using contrastive learning for NLP applications.

Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders

This work probes pretrained multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and devises a novel method to expose this knowledge by additionally tuning the multilingual models through an inexpensive contrastive learning procedure that requires only a small number of word translation pairs.

Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems

The concept of full-duplex in telecommunication is used to demonstrate what a human-like interactive experience should be and how to achieve smooth turn-taking through three subtasks: user state detection, backchannel selection, and barge-in detection.

References

Showing 1-10 of 37 references

DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations

Inspired by recent advances in deep metric learning (DML), this work carefully designs a self-supervised objective for learning universal sentence embeddings that does not require labelled training data, closing the performance gap between unsupervised and supervised pretraining for universal sentence encoders.

SimCSE: Simple Contrastive Learning of Sentence Embeddings

SimCSE is presented, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings; it regularizes pre-trained embeddings' anisotropic space to be more uniform and better aligns positive pairs when supervised signals are available.
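
For readers unfamiliar with this family of methods, a minimal sketch of the in-batch contrastive (InfoNCE) objective that SimCSE-style training uses; the temperature and batch size here are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """z1[i] and z2[i] are two embeddings of the same sentence (e.g. two
    dropout-perturbed encoder passes in unsupervised SimCSE); every other
    sentence in the batch serves as an in-batch negative."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature   # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

z1, z2 = torch.randn(16, 768), torch.randn(16, 768)
print(info_nce_loss(z1, z2).item())
```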

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT is presented, a Contrastive Framework for Self-Supervised SEntence Representation Transfer that adopts contrastive learning to fine-tune BERT in an unsupervised and effective way and achieves new state-of-the-art performance on STS tasks.

Modeling Multi-turn Conversation with Deep Utterance Aggregation

This paper formulates previous utterances into a context using a proposed deep utterance aggregation model to form a fine-grained context representation, and shows the model outperforms state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

A large-scale real-scenario Chinese e-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words is constructed; it reflects several characteristics of human-human conversations, e.g., being goal-driven and exhibiting long-term dependencies across the context.

SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation

The STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017), providing insight into the limitations of existing models.
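
As background on how STS-style benchmarks score a model: the standard metric is the Spearman correlation between predicted cosine similarities and human ratings. A minimal sketch, with random placeholder embeddings and gold scores standing in for real data:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
emb_a = rng.standard_normal((100, 768))  # embeddings of first sentences
emb_b = rng.standard_normal((100, 768))  # embeddings of second sentences
gold = rng.uniform(0, 5, size=100)       # human similarity ratings (0-5)

# cosine similarity per pair, then rank correlation with the gold scores
cos = (emb_a * emb_b).sum(1) / (
    np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
)
rho, _ = spearmanr(cos, gold)
print("Spearman rho:", rho)
```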

On the Sentence Embeddings from Pre-trained Language Models

This paper proposes to transform the anisotropic sentence embedding distribution into a smooth, isotropic Gaussian distribution through normalizing flows learned with an unsupervised objective, achieving significant performance gains over state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks.
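
As a toy illustration of the flow idea (not BERT-flow itself, which stacks invertible coupling layers): a single invertible linear map trained by maximum likelihood so that the transformed embeddings approach a standard Gaussian. All dimensions and hyperparameters below are assumptions.

```python
import torch

d = 8                                 # toy embedding dimension
W = torch.nn.Parameter(torch.eye(d))  # invertible transform (init: identity)
b = torch.nn.Parameter(torch.zeros(d))
opt = torch.optim.Adam([W, b], lr=1e-2)

# synthetic anisotropic "embeddings": very different per-dimension scales
x = torch.randn(512, d) @ torch.diag(torch.linspace(0.1, 3.0, d))

for _ in range(200):
    z = x @ W.T + b  # z = f(x)
    # change of variables: log p(x) = log N(z; 0, I) + log |det W|
    log_prob = -0.5 * (z ** 2).sum(1) + torch.slogdet(W).logabsdet
    loss = -log_prob.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# per-dimension scales should be far more uniform after the flow
print("per-dim std after flow:", (x @ W.T + b).detach().std(0))
```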

Learning Semantic Textual Similarity from Conversations

A novel approach to learning representations for sentence-level semantic similarity from conversational data is presented; it achieves the best performance among all neural models on the STS Benchmark and is competitive with state-of-the-art feature-engineered and mixed systems on both tasks.

ConveRT: Efficient and Accurate Conversational Representations from Transformers

ConveRT (Conversational Representations from Transformers) is proposed, a pretraining framework for conversational tasks that is effective, affordable, and quick to train, promising wider portability and scalability for Conversational AI applications.

Conversational Contextual Cues: The Case of Personalization and History for Response Ranking

This work evaluates its models on the task of predicting the next response in a conversation, and finds that modeling both context and participants improves prediction accuracy.