Unsupervised Topic Segmentation of Meetings with BERT Embeddings
@article{Solbiati2021UnsupervisedTS,
  title={Unsupervised Topic Segmentation of Meetings with BERT Embeddings},
  author={Alessandro Solbiati and Kevin Heffernan and Georgios Damaskinos and Shivani Poddar and Shubham Modi and Jacques Cal{\`i}},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.12978}
}
Topic segmentation of meetings is the task of dividing multi-person meeting transcripts into topic blocks. Supervised approaches to the problem have proven intractable due to the difficulties in collecting and accurately annotating large datasets. In this paper we show how previous unsupervised topic segmentation methods can be improved using pre-trained neural architectures. We introduce an unsupervised approach based on BERT embeddings that achieves a 15.5% reduction in error rate over…
3 Citations
Topic Break Detection in Interview Dialogues Using Sentence Embedding of Utterance and Speech Intention Based on Multitask Neural Networks
- Computer Science
- Sensors
- 2022
A method is proposed for detecting topic breaks in dialogue, with the aim of enabling flexible topic switching in interview dialogue systems. It is based on a multi-task learning neural network that uses sentence embeddings to capture the context of the text and uses the intention of each utterance as a feature.
PREME: Preference-based Meeting Exploration through an Interactive Questionnaire
- Computer Science
- ArXiv
- 2022
This work proposes a novel end-to-end framework for generating interactive questionnaires for preference-based meeting exploration and introduces an automatic evaluation strategy that measures how answerable the generated questions are, to ensure factual correctness.
When headers are not there: design and user evaluation of an automatic topicalisation and labelling tool to aid the exploration of web documents by blind users
- Computer Science
- W4A
- 2022
The design and evaluation of a tool for automatically generating headers for screen readers with topicalisation and labelling algorithms, which uses Natural Language Processing techniques to divide a web document into topic segments and label each segment based on its content.
References
Topic segmentation in ASR transcripts using bidirectional RNNs for change detection
- Computer Science
- 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2017
A novel approach for topic segmentation in speech recognition transcripts by measuring lexical cohesion using bidirectional Recurrent Neural Networks (RNNs) to perform topic change detection.
SECTOR: A Neural Model for Coherent Topic Segmentation and Classification
- Computer Science
- TACL
- 2019
This work presents SECTOR, a model that supports machine reading systems by segmenting documents into coherent sections and assigning a topic label to each section, and reports a best score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain.
Attention-Based Neural Text Segmentation
- Computer Science
- ECIR
- 2018
This paper proposes an attention-based bidirectional LSTM model in which sentence embeddings are learned using CNNs and segments are predicted from contextual information; the model can automatically handle variable-sized context.
Discourse Segmentation of Multi-Party Conversation
- Computer Science
- ACL
- 2003
A domain-independent topic segmentation algorithm for multi-party speech that combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech.
SegBot: A Generic Neural Text Segmentation Model with Pointer Network
- Computer Science
- IJCAI
- 2018
This work proposes a generic end-to-end segmentation model called SegBot, which outperforms state-of-the-art models on both topic and EDU segmentation tasks.
Statistical Models for Text Segmentation
- Computer Science
- Machine Learning
- 2004
Assessment of the approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts, using a new probabilistically motivated error metric.
Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books
- Computer Science
- 2015 IEEE International Conference on Computer Vision (ICCV)
- 2015
To align movies and books, a neural sentence embedding that is trained in an unsupervised way from a large corpus of books, as well as a video-text neural embedding for computing similarities between movie clips and sentences in the book are proposed.
Linear Text Segmentation Using Affinity Propagation
- Computer Science
- EMNLP
- 2011
The results suggest that, on topical text segmentation, APS performs on par with or outperforms two very competitive state-of-the-art segmenters used as baselines.
TextTiling: Segmenting Text into Multi-paragraph Subtopic Passages
- Computer Science
- CL
- 1997
The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts, which should be useful for many text analysis tasks, including information retrieval and summarization.
How Diachronic Text Corpora Affect Context based Retrieval of OOV Proper Names for Audio News
- Computer Science
- LREC
- 2016
It is concluded that a diachronic corpus with text from different sources leads to better retrieval performance than one relying on text from a single source or from a longer time span.