Semi-supervised Clustering for Short Text via Deep Representation Learning

Authors: Zhiguo Wang, Haitao Mi, Abraham Ittycheriah
Venue: Conference on Computational Natural Language Learning

In this work, we propose a semi-supervised method for short text clustering, where we represent texts as distributed vectors with neural networks, and use a small amount of labeled data to specify our intention for clustering. We design a novel objective that combines the representation learning process and the k-means clustering process, and optimize the objective with both labeled and unlabeled data iteratively until convergence through three steps: (1) assign each short text to…
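
The interplay between a few labels and k-means can be illustrated with a minimal sketch. This is plain seeded k-means on fixed vectors, not the paper's objective: the representation-learning part and the steps truncated above are not reproduced. The function name `seeded_kmeans`, its parameters, and the rule that labeled points stay pinned to their given cluster are assumptions made for illustration, and every cluster is assumed to have at least one labeled example.

```python
import numpy as np

def seeded_kmeans(X, k, labeled, n_iter=50):
    """Hypothetical helper: k-means where labeled rows keep their cluster.

    X       : (n, d) array of text vectors (assumed precomputed)
    k       : number of clusters
    labeled : dict {row index: cluster id}; assumed to cover every cluster
    """
    X = np.asarray(X, dtype=float)
    idx = np.array(list(labeled.keys()), dtype=int)
    lab = np.array(list(labeled.values()), dtype=int)

    # Initialise each centroid as the mean of that cluster's labeled points.
    centroids = np.stack([X[idx[lab == c]].mean(axis=0) for c in range(k)])

    assign = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assign = dists.argmin(axis=1)
        new_assign[idx] = lab  # labeled points are pinned to their label

        # Stop once the assignment no longer changes.
        if assign is not None and np.array_equal(new_assign, assign):
            break
        assign = new_assign

        # Update step: recompute each centroid from its current members.
        centroids = np.stack([X[assign == c].mean(axis=0) for c in range(k)])
    return assign, centroids
```

Because labeled points never leave their cluster, no cluster can become empty, which keeps the centroid update well defined in this sketch.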

Deep Feature-Based Text Clustering and its Explanation

This model, which is based on sequence representations, breaks the dependency on supervision and outperforms classic text clustering algorithms and the state-of-the-art pretrained language model, i.e., BERT, on almost all the considered datasets.

Text Clustering on Short Message by Using Deep Semantic Representation

This work proposes an algorithm to extract semantic and multidimensional feature representation from short texts by using the fact that comments are semantically related to the short message, and uses a convolutional-pooling structure that aims at mapping the text into a semantic representation.

CL-Aff Deep semisupervised clustering

This work introduces a semi-supervised neural architecture for multi-label settings that combines deep representation learning with k-means clustering; it can leverage large-scale unlabeled data and achieves better results than both unsupervised and supervised baselines.

k-Nearest Neighbor Augmented Neural Networks for Text Classification

This work proposes to enhance neural network models by allowing them to leverage information from the k-nearest neighbors (kNN) of the input text, employing a neural network that encodes texts into text embeddings and using it to capture instance-level information from the training set.

Improve Document Embedding for Text Categorization Through Deep Siamese Neural Network

The results show that the proposed representations outperform the conventional and state-of-the-art representations in the text classification task on this dataset.

A Simple and Effective Usage of Self-supervised Contrastive Learning for Text Clustering

Building on bidirectional encoder representations from transformers (BERT), this work shows that self-supervised contrastive learning, as well as few-shot contrastive learning (FCL) with unsupervised data augmentation (UDA), improves text clustering performance on short texts.

Fine-Tuning Language Models For Semi-Supervised Text Mining

This paper evaluates two clustering algorithms using the output of three different language models on six real-world text mining tasks to demonstrate to what extent this pipeline can improve text clustering accuracy and the amount of labeled samples needed for improvement.

A Semi-Supervised Deep Clustering Pipeline for Mining Intentions From Texts

Verint Intent Manager (VIM), an analysis platform that combines unsupervised and semi-supervised approaches to help data analysts quickly surface and organize relevant user intentions from conversational texts, produces high quality results, improving the performance of data analysts and reducing the time it takes to surface intentions from customer service data.

Dialog Intent Induction via Density-based Deep Clustering Ensemble

Compared to existing K-means based methods, the Density-based Deep Clustering Ensemble method for dialog intent induction is more effective in dealing with real-life scenarios where a large number of outliers exist.

Cluster & Tune: Boost Cold Start Performance in Text Classification

This work suggests a method to boost the performance of pre-trained models by adding an intermediate unsupervised classification task between the pre-training and fine-tuning phases, and shows that this additional classification phase can significantly improve performance, mainly for topical classification tasks, when the number of labeled instances available for fine-tuning is only a couple of dozen to a few hundred.



Supervised Sequence Labelling with Recurrent Neural Networks

A. Graves. Studies in Computational Intelligence, 2008.
A new type of output layer that allows recurrent networks to be trained directly for sequence labelling tasks where the alignment between the inputs and the labels is unknown, and an extension of the long short-term memory network architecture to multidimensional data, such as images and video sequences.

Text Understanding from Scratch

It is shown that temporal ConvNets can achieve astonishing performance without knowledge of words, phrases, sentences, or any other syntactic or semantic structures of a human language.

A comparison of extrinsic clustering evaluation metrics based on formal constraints

This article defines a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families, and proposes a modified version of Bcubed that avoids the problems found with other metrics.
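
As a rough illustration of the BCubed family mentioned above, the sketch below computes standard BCubed precision and recall for a hard clustering. It is not the modified version the article proposes; the function name `bcubed` and its argument layout are assumptions for illustration.

```python
from collections import Counter

def bcubed(clusters, categories):
    """Standard BCubed precision and recall (hypothetical helper).

    clusters   : predicted cluster id for each item
    categories : gold category for each item (same order and length)
    """
    n = len(clusters)
    pair = Counter(zip(clusters, categories))  # items sharing cluster and category
    cluster_size = Counter(clusters)
    category_size = Counter(categories)

    # Per-item precision: share of the item's cluster that has its category.
    precision = sum(pair[(c, g)] / cluster_size[c]
                    for c, g in zip(clusters, categories)) / n
    # Per-item recall: share of the item's category that lies in its cluster.
    recall = sum(pair[(c, g)] / category_size[g]
                 for c, g in zip(clusters, categories)) / n
    return precision, recall
```

Both scores are averages over items rather than over clusters, so large and small clusters contribute in proportion to their size.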

Convolutional Neural Networks for Sentence Classification

The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification; a simple modification to the architecture is also proposed to allow for the use of both task-specific and static vectors.

Semi‐supervised clustering methods

E. Bair. Wiley Interdisciplinary Reviews: Computational Statistics, 2013.
This review describes several clustering algorithms that can be applied in many situations, including document processing and modern genetics, to identify clusters that are associated with a particular outcome variable.

Efficient Estimation of Word Representations in Vector Space

Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

On ontology-driven document clustering using core semantic features

It is shown that an ontology can be used to greatly reduce the number of features needed for document clustering, and that by using core semantic features one can reduce the number of features by 90% or more and still produce clusters that capture the main themes in a text corpus.

Distance Metric Learning for Large Margin Nearest Neighbor Classification

This paper shows how to learn a Mahalanobis distance metric for kNN classification from labeled examples in a globally integrated manner and finds that metrics trained in this way lead to significant improvements in kNN classification.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
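
The "adaptive estimates of lower-order moments" can be made concrete with a minimal sketch of a single Adam update, following the commonly stated form of the algorithm with its usual default hyperparameters. The function name `adam_step` and its argument order are assumptions for illustration; the step counter `t` is assumed to start at 1.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (hypothetical helper); t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2

    # Bias correction for the zero initialisation of m and v.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Parameter update scaled by the root of the second-moment estimate.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

A caller would keep `m` and `v` (initialised to zeros with the shape of `theta`) between steps and increment `t` after each call.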

Reducing the Dimensionality of Data with Neural Networks

This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.