Dhananjay Ram

Learn More
Sparse representation has been shown to be a powerful mod-eling framework for classification and detection tasks. In this paper, we propose a new keyword detection algorithm based on sparse representation of the posterior exemplars. The posterior exemplars are phone conditional probabilities obtained from a deep neural network. This method relies on the(More)
We cast the problem of query by example spoken term detection (QbE-STD) as subspace detection where query and background are modeled as a union of low-dimensional subspaces. The speech ex-emplars used for subspace modeling consist of class-conditional posterior probabilities obtained from deep neural network (DNN). The query and background training(More)
We cast the query by example spoken term detection (QbE-STD) problem as subspace detection where query and background subspaces are modeled as union of low-dimensional subspaces. The speech exemplars used for subspace model-ing are class-conditional posterior probabilities estimated using deep neural network (DNN). The query and background training(More)
  • Lic Claudia, C Russo, Lic Hugo, D Ram, Eng Armando De Giusti
  • 1998
In this paper two images compression techniques with loss based on the method used by the compression Standard Joint Photographic Experts Groups (JPEG) and on the Quadtree adaptive partitioning method are presented and compared, and the loss produced studied. The ba seline JPEG algorithm will be ana lyzed, and op timizations emphasizing quantification and(More)
We cast the query by example spoken term detection (QbE-STD) problem as subspace detection where query and background subspaces are modeled as union of low-dimensional subspaces. The speech exemplars used for subspace model-ing are class-conditional posterior probabilities estimated using deep neural network (DNN). The query and background training(More)
State of the art query by example spoken term detection (QbE-STD) systems rely on representation of speech in terms of sequences of class-conditional posterior probabilities estimated by deep neural network (DNN). The posteriors are often used for pattern matching or dynamic time warping (DTW). Exploiting posterior probabilities as speech representation(More)
In this work, a Bayesian approach to speaker normalization is proposed to compensate for the degradation in performance of a speaker independent speech recognition system. The speaker normalization method proposed herein uses the technique of vocal tract length normalization (VTLN). The VTLN parameters are estimated using a novel Bayesian approach which(More)
This work demonstrates an application of different real-time speech technologies, exploited in an online gaming scenario. The game developed for this purpose is inspired by the famous television based quiz-game show, " Who wants to be a millionaire " , in which multiple-choice questions of increasing difficulty are asked to the participant. Text-to-speech(More)
  • 1