Sungrack Yun

This paper considers a method for speech emotion recognition by a max-margin framework incorporating a loss function based on Watson and Tellegen's well-known emotion model. Each emotion is modeled by a single-state hidden Markov model (HMM) that is trained by maximizing the minimum separation margin between emotions, and the margin is …
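A minimal sketch of the minimum-separation-margin idea (assumptions: diagonal-covariance Gaussian emissions stand in for the single-state HMMs, and emotion_loss is a caller-supplied label loss such as a distance on the Watson-Tellegen circumplex; this illustrates the objective only, not the paper's exact training algorithm):

    import numpy as np

    def loglik(frames, mean, log_var):
        # Sum of per-frame diagonal-Gaussian log-likelihoods (single-state HMM).
        var = np.exp(log_var)
        return -0.5 * np.sum((frames - mean) ** 2 / var + log_var + np.log(2 * np.pi))

    def min_margin_objective(utterances, labels, means, log_vars, emotion_loss):
        # Average hinge penalty: the margin between the correct emotion and its
        # strongest (loss-augmented) competitor should exceed the label loss.
        total = 0.0
        for frames, y in zip(utterances, labels):
            scores = [loglik(frames, m, v) for m, v in zip(means, log_vars)]
            competitors = [k for k in range(len(means)) if k != y]
            worst = max(competitors, key=lambda k: scores[k] + emotion_loss[y][k])
            total += max(0.0, emotion_loss[y][worst] - (scores[y] - scores[worst]))
        return total / len(utterances)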
This paper considers a large margin discriminative semi-Markov model (LMSMM) for phonetic recognition. The hidden Markov model (HMM) framework that is often used for phonetic recognition assumes only local statistical dependencies between adjacent observations, and it is used to predict a label for each observation without explicit phone segmentation. On …
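A minimal sketch of semi-Markov decoding (assumptions: a caller-supplied segment scorer score(t0, t1, y), e.g. a linear function of segment features; the dynamic program below jointly segments and labels the observation sequence, which is the property contrasted with frame-level HMM decoding, and is not the paper's training procedure):

    import numpy as np

    def semi_markov_decode(score, T, n_labels, max_dur):
        # score(t0, t1, y): score of labeling frames [t0, t1) as phone y.
        best = np.full(T + 1, -np.inf)
        best[0] = 0.0
        back = [None] * (T + 1)
        for t in range(1, T + 1):
            for d in range(1, min(max_dur, t) + 1):
                for y in range(n_labels):
                    s = best[t - d] + score(t - d, t, y)
                    if s > best[t]:
                        best[t], back[t] = s, (t - d, y)
        segments, t = [], T
        while t > 0:
            t0, y = back[t]
            segments.append((t0, t, y))
            t = t0
        return list(reversed(segments)), best[T]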
This paper considers a learning framework for speech emotion classification using a discriminant function based on Gaussian mixture models (GMMs). The GMM parameter set is estimated by margin scaling with a loss function to reduce the risk of predicting emotions with high loss. Here, the loss function is defined as a function of a distance metric using the …
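A minimal sketch of the margin-scaled objective (assumptions: per-emotion GMMs fitted by standard EM stand in for the discriminant function, and label_loss is the caller-supplied distance-based loss; the snippet evaluates the risk term rather than reproducing the paper's parameter estimation):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_emotion_gmms(features_by_emotion, n_components=4):
        # One GMM per emotion, fit on that emotion's frame-level features.
        return [GaussianMixture(n_components=n_components).fit(X) for X in features_by_emotion]

    def margin_scaled_risk(gmms, utterances, labels, label_loss):
        risk = 0.0
        for frames, y in zip(utterances, labels):
            scores = [g.score_samples(frames).sum() for g in gmms]  # utterance log-likelihoods
            penalties = [label_loss[y][k] - (scores[y] - scores[k])
                         for k in range(len(gmms)) if k != y]
            risk += max(0.0, max(penalties))  # margin must exceed the scaled label loss
        return risk / len(utterances)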
For phoneme classification, this paper describes an acoustic model based on the variational Gaussian process dynamical system (VGPDS). The nonlinear and nonparametric acoustic model is adopted to overcome the limitations of classical hidden Markov models (HMMs) in modeling speech. The Gaussian process prior on the dynamics and emission functions …
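A minimal sketch of the generative assumptions behind a GP dynamical system (assumptions: RBF kernels, a latent trajectory drawn from a temporal GP, and observations drawn from a GP over the latent space; this illustrates the nonparametric dynamics/emission priors only, not the variational inference used to train the VGPDS):

    import numpy as np

    def rbf(A, B, ls=1.0, var=1.0):
        # Squared-exponential kernel matrix between two point sets.
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return var * np.exp(-0.5 * d / ls ** 2)

    rng = np.random.default_rng(0)
    T, Q, D = 100, 2, 13                      # frames, latent dims, observed dims (MFCC-like)
    t = np.linspace(0.0, 1.0, T)[:, None]

    # GP prior over time for each latent dimension (the dynamics).
    Kt = rbf(t, t, ls=0.1) + 1e-6 * np.eye(T)
    X = rng.multivariate_normal(np.zeros(T), Kt, size=Q).T

    # GP prior over the latent space for each observed dimension (the emission).
    Kx = rbf(X, X, ls=1.0) + 1e-6 * np.eye(T)
    Y = rng.multivariate_normal(np.zeros(T), Kx, size=D).T + 0.1 * rng.standard_normal((T, D))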
This paper considers a supervised image segmentation algorithm based on joint-kernelized structured prediction. In the proposed algorithm, correlation clustering over a superpixel graph is conducted using a non-linear discriminant function, where the parameters are learned by a kernelized structured support vector machine (SSVM). For an input superpixel …
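A minimal sketch of correlation clustering over a superpixel graph (assumptions: each edge score comes from a learned discriminant, here a plain dot product standing in for the kernelized SSVM scoring, and clustering is approximated by merging superpixels joined by positive-score edges rather than solved exactly):

    import numpy as np

    def cluster_superpixels(n_nodes, edges, edge_features, w):
        # Union-find over superpixels: a positive edge score means "same segment".
        parent = list(range(n_nodes))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for (i, j), phi in zip(edges, edge_features):
            if np.dot(w, phi) > 0:
                parent[find(i)] = find(j)
        # Each node's root indexes its predicted segment.
        return [find(i) for i in range(n_nodes)]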
This paper describes an algorithm to control the expressed emotion of a synthesized song. Based on a database of various melodies sung neutrally with a restricted set of words, hidden semi-Markov models (HSMMs) of notes ranging from E3 to G5 are constructed for synthesizing the singing voice. Three steps are taken in the synthesis: (1) Pitch and duration are …
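A minimal sketch of step (1) only (assumptions: a melody note is given by its name, octave, and beat length, and the helper names note_to_hz / prepare_targets are hypothetical; HSMM parameter generation and waveform synthesis are not shown):

    NOTE_OFFSETS = {'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
                    'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11}

    def note_to_hz(name, octave):
        # Equal-temperament pitch of a named note (A4 = 440 Hz).
        midi = 12 * (octave + 1) + NOTE_OFFSETS[name]
        return 440.0 * 2 ** ((midi - 69) / 12.0)

    def prepare_targets(melody, tempo_bpm=100):
        # melody: list of (note_name, octave, beats); note models are restricted to E3..G5.
        sec_per_beat = 60.0 / tempo_bpm
        return [{'model': f'{name}{octave}', 'f0_hz': note_to_hz(name, octave),
                 'dur_s': beats * sec_per_beat}
                for name, octave, beats in melody]

    targets = prepare_targets([('E', 3, 1.0), ('G', 4, 0.5), ('G', 5, 2.0)])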
This paper considers large-margin training of a semi-Markov model (SMM) for phonetic recognition. The SMM framework is better suited for phonetic recognition than the hidden Markov model (HMM) framework in that it is capable of simultaneously segmenting the uttered speech into phones and labeling the segment-based features. In this paper, the …
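A minimal sketch of one large-margin update (assumptions: phi(x, segmentation) is a joint segment-level feature map, loss_aug_decode performs loss-augmented semi-Markov decoding, and seg_loss measures segmentation error; the snippet shows a structured-hinge subgradient step, not the paper's exact optimizer):

    import numpy as np

    def large_margin_step(w, x, gold_seg, phi, loss_aug_decode, seg_loss, lr=0.1, reg=1e-3):
        # Most-violating segmentation under the current model: argmax_y w.phi(x, y) + loss(gold, y).
        pred_seg = loss_aug_decode(w, x, gold_seg)
        violation = (seg_loss(gold_seg, pred_seg)
                     + np.dot(w, phi(x, pred_seg))
                     - np.dot(w, phi(x, gold_seg)))
        grad = reg * w
        if violation > 0:
            grad = grad + phi(x, pred_seg) - phi(x, gold_seg)  # hinge subgradient
        return w - lr * grad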
This paper studies a method for learning a discriminative visual codebook for various computer vision tasks such as image categorization and object recognition. The performance of such tasks depends on the construction of the codebook, which is a table of visual words (i.e., codewords). This paper proposes a learning criterion for …
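A minimal sketch of the baseline codebook pipeline the abstract builds on (assumptions: a plain k-means codebook and bag-of-visual-words encoding; the paper's contribution is a discriminative learning criterion for the codewords, which is not reproduced here):

    import numpy as np
    from sklearn.cluster import KMeans

    def build_codebook(descriptors, n_codewords=256):
        # descriptors: stacked local features (e.g. SIFT) from the training images.
        return KMeans(n_clusters=n_codewords, n_init=4).fit(descriptors)

    def encode(image_descriptors, codebook):
        # Assign each local descriptor to its nearest codeword and histogram the counts.
        words = codebook.predict(image_descriptors)
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        return hist / hist.sum()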
This paper considers a ν-structured support vector machine (ν-SSVM), which is a structured support vector machine (SSVM) incorporating an intuitive balance parameter ν. In the absence of the parameter ν, cumbersome validation would be required to choose the balance parameter. We theoretically prove that the parameter ν …
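A minimal sketch of a ν-parameterized structured objective (assumptions: by analogy with the ν-SVM trick, a margin variable rho replaces the usual scale parameter, score is a caller-supplied compatibility function, and candidates enumerates competing outputs; this only evaluates such an objective and does not reproduce the paper's formulation or guarantees):

    import numpy as np

    def nu_ssvm_objective(w, rho, data, score, nu):
        # data: list of (x, y_true, candidate_outputs).
        slack_sum = 0.0
        for x, y, candidates in data:
            margin = min(score(w, x, y) - score(w, x, yb) for yb in candidates if yb != y)
            slack_sum += max(0.0, rho - margin)  # slack whenever the margin falls below rho
        return 0.5 * np.dot(w, w) - nu * rho + slack_sum / len(data)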