In this study, we explore the propagation of uncertainty in state-of-the-art speaker recognition systems. Specifically, we incorporate the uncertainty associated with observation features into the i-Vector extraction framework. As a proof of concept, both oracle and practically estimated uncertainties are used for evaluation. The oracle uncertainty is…
This paper advances the design of CTC-based all-neural (or end-to-end) speech recognizers. We propose a novel symbol inventory, and a novel iterated-CTC method in which a second system is used to transform a noisy initial output into a cleaner version. We present a number of stabilization and initialization methods we have found useful in training these…
In this study, we describe the systems developed by the Center for Robust Speech Systems (CRSS), Univ. of Texas-Dallas, for the NIST i-vector challenge. Given that the emphasis of this challenge is on utilizing unlabeled development data, our system development focuses on: 1) leveraging the channel variation from unlabeled development data through unsupervised…
NASA's Apollo program stands as one of mankind's greatest achievements in the 20th century. During a span of 4 years (from 1968 to 1972), a total of 9 lunar missions were launched and 12 astronauts walked on the surface of the moon. It was one of the most complex operations ever executed from scientific, technological, and operational perspectives. In this paper, we…
Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two…
In state-of-the-art speaker recognition systems, the universal background model (UBM) plays the role of dividing the acoustic space. Each Gaussian mixture component of the trained UBM represents one distinct acoustic region. The posterior probabilities of features belonging to each region are further used as core components of the Baum-Welch statistics. Therefore, the quality of…
Mask-based objective speech-intelligibility measures have been successfully proposed for evaluating the performance of binary masking algorithms. These objective measures were computed directly by comparing the estimated binary mask against the ground truth ideal binary mask (IdBM). Most of these objective measures, however, assign equal weight to all…
State-of-the-art speaker verification systems model speaker identity by mapping i-Vectors onto a probabilistic linear discriminant analysis (PLDA) space. Compared to other modeling approaches (such as cosine distance scoring), PLDA provides a more efficient mechanism to separate speaker information from other sources of undesired variability and offers…