We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces…

We propose a general framework for learning from labeled and unlabeled data on a directed graph in which the structure of the graph, including the directionality of the edges, is considered. The time complexity of the algorithm derived from this framework is nearly linear due to recently developed numerical techniques. In the absence of labeled instances,…

We usually endow the investigated objects with pairwise relationships, which can be illustrated as graphs. In many real-world problems, however, relationships among the objects of interest are more complex than pairwise. Naively squeezing the complex relationships into pairwise ones will inevitably lead to loss of information which can be expected…

- Arthur Gretton, Alex Smola, Jiayuan Huang, Marcel Schmittfull, Karsten Borgwardt, Bernhard Schölkopf
- 2008

Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space).…

We propose a technique for identifying latent Web communities based solely on the hyperlink structure of the WWW, via random walks. Although the topology of the Directed Web Graph encodes important information about the content of individual Web pages, it also reveals useful meta-level information about user communities. Random walk models are capable of…

- Jiayuan Huang
- 2005

Discussions about different graph Laplacians, mainly the normalized and unnormalized versions, have been ardent with respect to various methods of clustering and graph-based semi-supervised learning. Previous research on graph Laplacians, from a continuous perspective, investigated the convergence properties of the Laplacian operators…

- Rui Wang, Dong-Qin Chen, Jia-Yuan Huang, Kai Zhang, Bing Feng, Ban-Zhou Pan +3 others
- Oncotarget
- 2014

Chemoresistant tumors usually fail to respond to radiotherapy. However, the mechanisms involved in chemo- and radiotherapy cross resistance are not fully understood. Previously, we have identified microRNA (miR)-451 as a tumor suppressor in lung adenocarcinoma (LAD). However, whether miR-451 plays critical roles in chemo- and radiotherapy cross resistance…

In many applications, relationships among objects of interest are more complex than pairwise. Simply approximating complex relationships as pairwise ones can lead to loss of information. An alternative for these applications is to analyze complex relationships among data directly, without the need to first represent the complex relationships as pairwise…

Images and other high-dimensional data can frequently be characterized by a low-dimensional manifold (e.g., one that corresponds to the degrees of freedom of the camera). Recently, nonlinear manifold learning techniques have been used to map images to points in a lower-dimensional space, capturing some of the dynamics of the camera or the subjects. In general,…