Corpus ID: 237260226

Development of a Conversation State Prediction System

Author: Sujay Uday Rittikar
With the development of LSTM-based speaker diarization, identifying the speaker of each segment of an input audio stream has become considerably easier than tagging the data manually. This makes it desirable to explore whether the identified speaker identities can help predict future speaker states in a conversation. In this study, Markov chains are used to identify and update the speaker states for the next…
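The abstract is truncated, so the paper's actual procedure is not visible here. As a minimal sketch of the general idea only, the block below assumes a first-order Markov chain whose states are diarized speaker labels: it estimates transition probabilities from a label sequence and predicts the most likely next speaker. The function names `fit_transitions` and `predict_next` and the sample sequence `segments` are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(labels):
    """Estimate first-order transition probabilities from a speaker-label sequence.

    Assumption: each element of `labels` is the diarized speaker of one segment,
    in chronological order.
    """
    counts = defaultdict(Counter)
    # Count how often each speaker is followed by each other speaker.
    for current, nxt in zip(labels, labels[1:]):
        counts[current][nxt] += 1
    # Normalize each row of counts into probabilities.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def predict_next(transitions, current):
    """Return the most probable next speaker, or None for an unseen state."""
    row = transitions.get(current)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical diarization output: one speaker label per audio segment.
segments = ["A", "B", "A", "B", "A", "A", "B", "C", "A", "B"]
transitions = fit_transitions(segments)
next_speaker = predict_next(transitions, segments[-1])
```

Updating the chain as new segments arrive would amount to incrementing the counts and renormalizing the affected row; whether the paper does this, or uses a higher-order or hidden-state model, cannot be determined from the truncated abstract.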

Speaker Diarization with LSTM
This work combines LSTM-based d-vector audio embeddings with recent work in nonparametric clustering to obtain a state-of-the-art speaker diarization system that achieves a 12.0% diarization error rate on NIST SRE 2000 CALLHOME, while the model is trained with out-of-domain data from voice search logs.
An HMM approach to text-prompted speaker verification
  • C. Che, Q. Lin, D. Yuk
  • Computer Science
  • 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings
  • 1996
A speaker recognition system based on hidden Markov models (HMMs) that uses concatenated phoneme HMMs and works in a text-prompted mode; the study examines the effects of various factors (such as the mixture number and cohort selection) on speaker recognition performance.
Deep neural networks for small footprint text-dependent speaker verification
Experimental results show that the DNN-based speaker verification system achieves good performance compared to a popular i-vector system on a small footprint text-dependent speaker verification task, is more robust to additive noise, and outperforms the i-vector system at low False Rejection operating points.
Efficient Dialogue State Tracking by Selectively Overwriting Memory
The accuracy gaps between the current and the ground-truth-given situations are analyzed, suggesting that improving state operation prediction is a promising direction for boosting DST performance.
An improved i-vector extraction algorithm for speaker verification
A new i-vector extraction algorithm from the total factor matrix, termed component reduction analysis (CRA), is proposed; it contributes to better modelling of session variability in the total factor space.
Conversational AI: An Overview of Methodologies, Applications & Future Scope
This study is intended to shed light on the latest research in Conversational AI architecture development and also to highlight the improvements that these novel innovations have achieved over their traditional counterparts.
Generalized End-to-End Loss for Speaker Verification
A new loss function called generalized end-to-end (GE2E) loss is proposed, which makes the training of speaker verification models more efficient than the previous tuple-based end-to-end (TE2E) loss function.
Speech Segmentation and its Impact on Spoken Document Processing
Progress in both speech and language processing has spurred efforts to support applications that rely on spoken, rather than written, language input. A key challenge in moving from text-based documents…
DIET: Lightweight Language Understanding for Dialogue Systems
Large-scale pre-trained language models have shown impressive results on language understanding benchmarks like GLUE and SuperGLUE, improving considerably over other pre-training methods like…
i-Vectors in speech processing applications: a survey
This survey presents a comprehensive collection of research work related to i-vectors since their inception and discusses some recent trends of using i-vectors in combination with other approaches.