Yujia Li

When modeling structured outputs such as image segmentations, prediction can be improved by accurately modeling structure present in the labels. A key challenge is developing tractable models that are able to capture complex high-level structure like shape. In this work, we study the learning of a general class of pattern-like high-order potentials, which...
For the generation of highly natural synthetic speech, the control of prosody is of primary importance. The fundamental frequency (F0) is one of the most important components of speech prosody. This research investigates the variation of F0 in continuous Cantonese speech, with the goal of establishing an effective mechanism of prosody control in Cantonese...
Cantonese is a major Chinese dialect with a complicated tone system. This research focuses on quantitative modeling of Cantonese tones. It uses Stem-ML, a language-independent framework for quantitative intonation modeling and generation. A set of F0 prediction models is built and trained on acoustic data. The prediction error is about 11 Hz or 1...
This paper presents a novel approach to tone recognition in continuous Cantonese speech based on overlapped di-tone Gaussian mixture models (ODGMM). The ODGMM is designed with special consideration for the fact that Cantonese tone identification relies more on the relative pitch level than on the pitch contour. A di-tone unit covers a group of two...
Semi-supervised learning, which uses unlabeled data to help learn a discriminative model, is especially important for structured output problems, because labeling their multi-dimensional outputs takes considerably more effort than labeling standard single-output problems. We propose a new max-margin framework for semi-supervised structured output learning that...
The mean field algorithm is a widely used approximate inference algorithm for graphical models whose exact inference is intractable. In each iteration of mean field, the approximate marginals for each variable are updated by getting information from the neighbors. This process can be equivalently converted into a feed-forward network, with each layer...
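The correspondence described in the abstract above can be illustrated with a minimal sketch. It assumes a pairwise binary model with energy terms `b[i] * x[i]` and `W[i][j] * x[i] * x[j]`, and parallel (synchronous) updates; this model, the function names, and the fixed layer count are illustrative assumptions, not the paper's actual setup. Under these assumptions, each mean field iteration has the form of one sigmoid layer whose weights are tied across layers.

```python
import math


def sigmoid(z):
    """Logistic function, the nonlinearity of each unrolled layer."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_field_layer(q, W, b):
    """One parallel mean field update, written as a network layer.

    q[i] is the current approximate marginal P(x_i = 1).
    Update rule (assumed pairwise binary model):
        q_i <- sigmoid(b_i + sum_{j != i} W[i][j] * q_j)
    """
    n = len(q)
    return [
        sigmoid(b[i] + sum(W[i][j] * q[j] for j in range(n) if j != i))
        for i in range(n)
    ]


def unrolled_mean_field(W, b, num_layers=10):
    """Run a fixed number of updates, i.e. a feed-forward pass through
    num_layers identical layers that share W and b."""
    q = [0.5] * len(b)  # uninformative initial marginals
    for _ in range(num_layers):
        q = mean_field_layer(q, W, b)
    return q


if __name__ == "__main__":
    # Hypothetical 3-variable chain with attractive couplings.
    W = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
    b = [0.5, -0.2, 0.1]
    print(unrolled_mean_field(W, b, num_layers=5))
```

In this sketch every layer reuses the same `W` and `b`; giving each layer its own parameters would turn the unrolled procedure into an ordinary trainable feed-forward network.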
Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and...
Recommending celebrities to the public has recently become an interesting problem on social network websites such as Twitter and Tencent Weibo. In this paper, we propose a unified hierarchical Bayesian model to recommend celebrities to general users. Specifically, we propose to leverage both social network and descriptions of celebrities to...