Dialog Intent Induction via Density-based Deep Clustering Ensemble
@article{Pu2022DialogII,
  title   = {Dialog Intent Induction via Density-based Deep Clustering Ensemble},
  author  = {Jiashu Pu and Guandan Chen and Yongzhu Chang and Xiao-Xi Mao},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2201.06731}
}
Existing task-oriented chatbots rely heavily on spoken language understanding (SLU) systems to determine the intent of a user's utterance and other key information needed to fulfill specific tasks. In real-life applications, it is crucial to occasionally induce novel dialog intents from conversation logs to improve the user experience. In this paper, we propose the Density-based Deep Clustering Ensemble (DDCE) method for dialog intent induction. Compared to existing K-means based methods, our…
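The abstract is truncated, so the exact DDCE pipeline is not spelled out here. The following is only a minimal sketch of why a density-based clusterer can suit intent induction better than K-means, using scikit-learn on randomly generated stand-in embeddings; the embedding source, eps, and cluster counts are all illustrative assumptions.

```python
# Minimal sketch (not the authors' DDCE pipeline) contrasting density-based
# clustering with K-means on utterance embeddings.
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
# Stand-in for sentence embeddings of conversation-log utterances.
embeddings = normalize(rng.normal(size=(500, 32)))

# K-means must be told the number of intents up front.
kmeans_labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(embeddings)

# A density-based method infers the cluster count from the data and can
# mark sparse points as noise (label -1), which suits open-ended logs.
dbscan_labels = DBSCAN(eps=0.9, min_samples=5).fit_predict(embeddings)

print("k-means clusters:", len(set(kmeans_labels)))
print("DBSCAN clusters (excl. noise):",
      len(set(dbscan_labels)) - (1 if -1 in dbscan_labels else 0))
```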
References
Dialog Intent Induction with Deep Multi-View Clustering
- EMNLP, 2019
This work introduces the dialog intent induction task and proposes alternating-view k-means (AV-KMEANS) for joint multi-view representation learning and clustering, which induces better dialog intent clusters than state-of-the-art unsupervised representation learning methods and standard multi-view clustering approaches.
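A simplified rendering of the alternating-view idea (the real AV-KMEANS jointly trains neural encoders; plain k-means over two fixed views is used here only to show the alternation):

```python
# Simplified sketch of the AV-KMEANS alternation, for illustration only.
import numpy as np
from sklearn.cluster import KMeans

def alternating_view_kmeans(view_a, view_b, k, rounds=4, seed=0):
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(view_a)
    for r in range(rounds):
        # Carry assignments from one view into the other by using them to
        # seed that view's centroids, so the views regularize each other.
        view = view_b if r % 2 == 0 else view_a
        centroids = np.stack([
            view[labels == c].mean(axis=0) if np.any(labels == c)
            else view[r % len(view)]  # fallback seed for an emptied cluster
            for c in range(k)
        ])
        labels = KMeans(n_clusters=k, init=centroids, n_init=1,
                        random_state=seed).fit_predict(view)
    return labels
```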
Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement
- AAAI, 2020
This paper proposes constrained deep adaptive clustering with cluster refinement (CDAC+), an end-to-end clustering method that can naturally incorporate pairwise constraints as prior knowledge to guide the clustering process.
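A hedged sketch of how pairwise constraints can steer soft cluster assignments; the loss form and names below are illustrative, not CDAC+'s actual code.

```python
# Push assignment similarity toward 1 for must-link pairs and toward 0 for
# cannot-link pairs; an illustrative rendering of pairwise supervision.
import numpy as np

def pairwise_constraint_loss(q, must_link, cannot_link, eps=1e-8):
    """q: (n, k) soft cluster assignments; link lists hold (i, j) pairs."""
    sim = q @ q.T  # similarity of assignment distributions, in [0, 1]
    loss = 0.0
    for i, j in must_link:      # prior knowledge: same intent
        loss -= np.log(sim[i, j] + eps)
    for i, j in cannot_link:    # prior knowledge: different intents
        loss -= np.log(1.0 - sim[i, j] + eps)
    return loss / max(len(must_link) + len(cannot_link), 1)
```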
Intent Mining from past conversations for Conversational Agent
- COLING, 2020
This paper presents an intent discovery framework that mines vast amounts of conversational logs to generate labeled datasets for training intent models, and introduces ITER-DBSCAN, a density-based extension of the DBSCAN algorithm for clustering unbalanced data.
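The core idea of clustering dense regions first and then relaxing the density threshold can be sketched as below; the eps schedule and fallback behavior are our assumptions, not the paper's settings.

```python
# Simplified iterative-DBSCAN loop: cluster the densest regions first,
# remove them, then relax eps so sparser (minority) intents can form
# clusters in later rounds.
import numpy as np
from sklearn.cluster import DBSCAN

def iter_dbscan(X, eps_schedule=(0.3, 0.5, 0.7), min_samples=5):
    labels = np.full(len(X), -1)
    remaining = np.arange(len(X))
    next_id = 0
    for eps in eps_schedule:
        if len(remaining) < min_samples:
            break
        sub = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X[remaining])
        for c in set(sub) - {-1}:
            labels[remaining[sub == c]] = next_id
            next_id += 1
        remaining = remaining[sub == -1]  # only noise points go to the next round
    return labels  # -1 marks utterances never assigned to any cluster
```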
Benchmarking Natural Language Understanding Services for building Conversational Agents
- IWSDS, 2019
The results show that on intent classification Watson significantly outperforms the other platforms (Dialogflow, LUIS, and Rasa), though these also perform well; interestingly, on entity type recognition, Watson performs significantly worse due to its low precision.
A Self-Training Approach for Short Text Clustering
- RepL4NLP@ACL, 2019
The proposed method learns discriminative features from both an autoencoder and a sentence embedding, then uses assignments from a clustering algorithm as supervision to update the weights of the encoder network.
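A minimal self-training loop in this spirit might look as follows; the tiny MLP encoder, round/step counts, and the use of k-means for pseudo-labels are all assumptions for illustration.

```python
# Minimal self-training sketch: cluster the encoder's current outputs,
# then treat those assignments as pseudo-labels to update the encoder.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def self_train(features, k=8, rounds=3, steps=50):
    enc = nn.Sequential(nn.Linear(features.shape[1], 64), nn.ReLU(),
                        nn.Linear(64, k))
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    x = torch.tensor(features, dtype=torch.float32)
    for _ in range(rounds):
        with torch.no_grad():
            pseudo = KMeans(n_clusters=k, n_init=10).fit_predict(enc(x).numpy())
        y = torch.tensor(pseudo, dtype=torch.long)
        for _ in range(steps):  # fit the encoder to its own cluster assignments
            opt.zero_grad()
            nn.functional.cross_entropy(enc(x), y).backward()
            opt.step()
    return enc
```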
Semi-supervised Clustering for Short Text via Deep Representation Learning
- CoNLL, 2016
A novel objective combines representation learning and k-means clustering, and is optimized iteratively with both labeled and unlabeled data until convergence via a three-step procedure.
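A rough sketch of what such a combined objective can look like; the squared-distance k-means term and the penalty weighting `lam` below are illustrative assumptions, not the paper's exact formulation.

```python
# Joint objective: k-means distortion on all data plus a supervision
# penalty on the labeled subset.
import numpy as np

def joint_objective(X, centroids, assign, labeled_idx, labeled_assign, lam=1.0):
    # k-means term: distance of every point to its assigned centroid
    kmeans_term = np.sum((X - centroids[assign]) ** 2)
    # supervision term: penalize disagreement with gold cluster labels
    sup_term = np.sum(assign[labeled_idx] != labeled_assign)
    return kmeans_term + lam * sup_term
```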
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- EMNLP, 2019
Sentence-BERT (SBERT) is presented, a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
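SBERT ships as the sentence-transformers library; the short example below encodes two utterances and compares them with cosine similarity (the checkpoint name is one commonly available model, not necessarily the paper's original).

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = ["How do I reset my password?", "I forgot my login credentials."]
emb = model.encode(sentences)        # one fixed-size vector per sentence
print(util.cos_sim(emb[0], emb[1]))  # cosine similarity of the pair
```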
An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction
- EMNLP, 2019
A new dataset is introduced that includes out-of-scope queries, i.e., queries that do not fall into any of the system's supported intents; these pose a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class.
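One common baseline this challenge motivates (our illustration, not the dataset paper's method) is to reject a query as out-of-scope whenever the intent classifier's top softmax probability falls below a threshold:

```python
# Confidence-threshold baseline for out-of-scope detection.
import numpy as np

def predict_with_oos(probs, threshold=0.7, oos_label=-1):
    """probs: (n, num_intents) softmax outputs from any intent classifier."""
    top = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    return np.where(top >= threshold, pred, oos_label)
```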
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- NAACL, 2019
BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
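As a concrete illustration of the "one additional output layer" recipe, the snippet below attaches a classification head to pretrained BERT via the Hugging Face transformers library; the library choice and the number of intent labels are our assumptions, not part of the paper.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=8)  # 8 intents, chosen arbitrarily

batch = tok(["book a flight to Boston"], return_tensors="pt")
logits = model(**batch).logits  # fine-tune with cross-entropy on intent labels
```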
On the Sentence Embeddings from Pre-trained Language Models
- EMNLP, 2020
This paper proposes to transform the anisotropic sentence embedding distribution into a smooth, isotropic Gaussian distribution through normalizing flows learned with an unsupervised objective, achieving significant gains over state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks.
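Normalizing flows are invertible maps trained by exact maximum likelihood via the change-of-variables formula. The paper's flow is deeper; the single elementwise affine map below is a heavily simplified sketch (all names and shapes are ours) showing only the log-likelihood computation.

```python
# Heavily simplified flow sketch: one invertible elementwise affine map
# trained to make transformed embeddings look standard-Gaussian.
import math
import torch
import torch.nn as nn

class AffineFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        # Map embeddings toward a standard Gaussian latent space.
        z = (x - self.shift) * torch.exp(-self.log_scale)
        log_det = -self.log_scale.sum()  # log|det(dz/dx)| of the elementwise map
        log_gauss = -0.5 * (z ** 2 + math.log(2 * math.pi)).sum(-1)
        return log_gauss + log_det       # per-example log-likelihood to maximize

# Unsupervised training signal: nll = -AffineFlow(768)(embeddings).mean()
```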