While vocal tract resonances (VTRs, or formants that are defined as such resonances) are known to play a critical role in human speech perception and in computer speech processing, there has been a lack of standard databases needed for the quantitative evaluation of automatic VTR extraction techniques. We report in this paper on our recent effort to create …
We present a stochastic mapping technique for robust speech recognition that uses stereo data. The idea is based on constructing a Gaussian mixture model for the joint distribution of the clean and noisy features and using this distribution to predict the clean speech during testing. The proposed mapping is called stereo-based stochastic mapping (SSM). Two …
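To make the joint-distribution idea concrete, here is a minimal sketch that predicts a clean feature vector from a noisy one as the conditional mean under a joint Gaussian mixture. The function names, the parameter layout (clean dimensions first, then noisy), and the choice of the conditional mean as the predictor are all assumptions made for illustration; the abstract is truncated before it names its estimators, so this may not match the paper's method.

```python
import numpy as np

def gaussian_logpdf(y, mean, cov):
    """Log density of a multivariate Gaussian evaluated at y."""
    d = y.shape[0]
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ np.linalg.solve(cov, diff))

def predict_clean(y, weights, means, covs, dx):
    """Conditional-mean estimate of clean features x given noisy features y.

    Assumed (hypothetical) layout: each joint mean is [mu_x, mu_y] and each
    joint covariance is [[S_xx, S_xy], [S_yx, S_yy]], with x taking the
    first dx dimensions.
    """
    n_components = len(weights)
    log_post = np.empty(n_components)
    cond_means = []
    for k in range(n_components):
        mu_x, mu_y = means[k][:dx], means[k][dx:]
        s_xy = covs[k][:dx, dx:]
        s_yy = covs[k][dx:, dx:]
        # Unnormalized log posterior of component k given the noisy frame.
        log_post[k] = np.log(weights[k]) + gaussian_logpdf(y, mu_y, s_yy)
        # Per-component conditional mean E[x | y, k].
        cond_means.append(mu_x + s_xy @ np.linalg.solve(s_yy, y - mu_y))
    # Normalize posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    # Posterior-weighted sum of the per-component conditional means.
    return sum(p * m for p, m in zip(post, cond_means))
```

In practice the mixture parameters would be trained on paired (stereo) clean and noisy frames, for example with EM; that training step is omitted here.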
A feature compensation (FC) algorithm based on polynomial regression of utterance signal-to-noise ratio (SNR) for noise robust automatic speech recognition (ASR) is proposed. In this algorithm, the bias between clean and noisy speech features is approximated by a set of polynomials which are estimated from adaptation data from the new environment by the …
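As a rough illustration of modelling the feature bias as a polynomial in utterance SNR, the sketch below fits one polynomial per feature dimension and subtracts the predicted bias from noisy features. It swaps in ordinary least squares on directly observed biases, which is an assumption: the abstract is truncated before it states the estimation criterion. The function names, the polynomial order, and the sign convention (bias = noisy minus clean) are likewise hypothetical.

```python
import numpy as np

def fit_bias_polynomials(snr_db, bias, order=2):
    """Least-squares fit of one polynomial per feature dimension.

    snr_db: (n_utts,) utterance-level SNR values in dB.
    bias:   (n_utts, n_dims) observed bias, assumed here to be noisy - clean.
    Returns coefficients of shape (order + 1, n_dims), highest power first.
    """
    return np.polyfit(snr_db, bias, deg=order)

def compensate(noisy, snr_db, coeffs):
    """Subtract the SNR-dependent predicted bias from noisy features.

    noisy:  (n_frames, n_dims) features from a single utterance.
    snr_db: scalar SNR of that utterance.
    """
    # Powers of the SNR ordered to match np.polyfit (highest power first).
    powers = snr_db ** np.arange(coeffs.shape[0] - 1, -1, -1)
    predicted_bias = powers @ coeffs  # shape (n_dims,)
    return noisy - predicted_bias
```

A usage pattern would be to call `fit_bias_polynomials` once on adaptation data from the new environment, then call `compensate` per test utterance with its estimated SNR.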
Spoken content in languages of emerging importance needs to be searchable to provide access to the underlying information. In this paper, we investigate the problem of extending data fusion methodologies from Information Retrieval for Spoken Term Detection on low-resource languages in the framework of the IARPA Babel program. We describe a number of …
This paper investigates data augmentation for deep neural network acoustic modeling based on label-preserving transformations to deal with data sparsity. Two data augmentation approaches, vocal tract length perturbation (VTLP) and stochastic feature mapping (SFM), are investigated for both deep neural networks (DNNs) and convolutional neural networks …
Periodic stripe patterns are ubiquitous in living organisms, yet the underlying developmental processes are complex and difficult to disentangle. We describe a synthetic genetic circuit that couples cell density and motility. This system enabled programmed Escherichia coli cells to form periodic stripes of high and low cell densities sequentially and …
In this paper we describe the data collection for the TBALL project (Technology Based Assessment of Language and Literacy) and report the results of our efforts. We focus on aspects of our corpus that distinguish it from currently available corpora. The speakers are children (grades K-4), largely non-native speakers of English, and from diverse …
We report systematic optical studies of WS2 and WSe2 monolayers and multilayers. The efficiency of second harmonic generation shows a dramatic even-odd oscillation with the number of layers, consistent with the presence (absence) of inversion symmetry in even-layer (odd-layer). Photoluminescence (PL) measurements show the crossover from an indirect band gap …
Automatic speech recognition is a core component of many applications, including keyword search. In this paper we describe experiments on acoustic modeling, language modeling, and decoding for keyword search on a Cantonese conversational telephony corpus collected as part of the IARPA Babel program. We show that acoustic modeling techniques such as the …
In this paper, an MLLR-like adaptation approach is proposed whereby the transformation of the means is performed deterministically based on linearization of VTLN. Biases and adaptation of the variances are estimated statistically by the EM algorithm. In the discrete frequency domain, we show that under certain approximations, frequency warping with …