Tejas Godambe

Speech is the most prominent and natural form of communication between humans. Human beings have long been motivated to create computers that can understand and talk like humans. When researchers develop a recognition system, they require previously stored data, i.e., a database for that recognition system. There are various speech…
In this paper, we investigate two different recurrent neural network (RNN) architectures, the Elman RNN and the recently proposed clockwork RNN [1], for statistical parametric speech synthesis (SPSS). Of late, deep neural networks have been used for SPSS; they predict every frame independently of the previous predictions and hence require post-processing…
This paper details the experiments conducted to train the best-performing Vietnamese speech recognition system possible using public-domain data only, as part of the Zero Cost task at MediaEval 2016. We explored techniques related to audio preprocessing, the use of speaker pitch information, and data perturbation for building subspace Gaussian mixture…