Xiaoxiao Shi

Labeled examples are often expensive and time-consuming to obtain. One practically important problem is: can the labeled data from other related sources help predict the target task, even if they have (a) different feature spaces (e.g., image vs. text data), (b) different data distributions, and (c) different output spaces? This paper proposes a solution...
Collective classification in relational data has become an important and active research topic in the last decade. It exploits the dependencies of instances in a network to improve predictions. Related applications include hyperlinked document classification, social network analysis and collaboration network analysis. Most of the traditional collective...
When labeled examples are not readily available, active learning and transfer learning are separate efforts to obtain labeled examples for inductive learning. Active learning asks domain experts to label a small set of examples, but there is a cost incurred for each answer. While transfer learning could borrow labeled examples from a different domain...
Co-clustering was proposed to simultaneously cluster objects and features to explore inter-correlated patterns. For example, by analyzing blog click-through data, one can find the group of users who are interested in a specific group of blogs and use it for applications such as recommendation. However, it is usually very difficult to achieve good...
Sample selection bias is a common problem in many real-world applications, where training data are obtained under realistic constraints that make them follow a different distribution from the future testing data. For example, in hospital clinical studies, it is common practice to build models from the eligible volunteers as the training...
Most existing transfer learning techniques are limited to problems of knowledge transfer across tasks sharing the same set of class labels. In this paper, however, we relax this constraint and propose a spectral-based solution that aims at unveiling the intrinsic structure of the data and generating a partition of the target data, by transferring the...
In many applications, it is very expensive or time-consuming to obtain a lot of labeled examples. One practically important problem is: can the labeled data from other related sources help predict the target task, even if they have 1) different feature spaces (e.g., image versus text data), 2) different data distributions, and 3) different output spaces?...
Multiple data sources containing different types of features may be available for a given task. For instance, users’ profiles can be used to build recommendation systems. In addition, a model can also use users’ historical behaviors and social networks to infer users’ interests in related products. We argue that it is desirable to collectively use any...
In recent years, compressive sensing has attracted intensive attention in the fields of statistics, automatic control, data mining, and machine learning. It assumes that the dataset is sparse and proposes that the whole dataset can be reconstructed by observing just a small set of samples. One of the important approaches to compressive sensing is trace norm...