We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting first try to recover the sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces …
We propose a general framework for learning from labeled and unlabeled data on a directed graph, in which the structure of the graph, including the directionality of the edges, is considered. The time complexity of the algorithm derived from this framework is nearly linear due to recently developed numerical techniques. In the absence of labeled instances, …
We usually endow the investigated objects with pairwise relationships, which can be illustrated as graphs. In many real-world problems, however, relationships among the objects of interest are more complex than pairwise. Naively squeezing the complex relationships into pairwise ones will inevitably lead to loss of information which can be expected …
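The information loss described above can be illustrated with a small sketch. It is not taken from the paper: the function name `clique_expansion` and the toy co-authorship data are hypothetical, chosen only to show that two different sets of multi-way relationships can collapse to the same pairwise graph.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Return every unordered pair that co-occurs in at least one hyperedge."""
    pairs = set()
    for edge in hyperedges:
        # Sorting gives each pair a canonical (u, v) order.
        for u, v in combinations(sorted(edge), 2):
            pairs.add((u, v))
    return pairs


# Hypothetical data: one paper written jointly by authors a, b and c ...
three_way = [{"a", "b", "c"}]
# ... versus three separate two-author papers.
pairwise_only = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

# Both situations reduce to the same triangle, so the pairwise graph
# can no longer tell them apart.
same_graph = clique_expansion(three_way) == clique_expansion(pairwise_only)
print(same_graph)
```

Keeping the hyperedges themselves, for instance as sets or as rows of an incidence matrix, preserves the distinction that the pairwise reduction discards.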
Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space). …
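The idea of matching covariate distributions in a reproducing kernel Hilbert space can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it chooses weights so that the weighted kernel mean of the training points approximates the kernel mean of the test points, using a ridge-regularized linear solve followed by clipping, whereas a full formulation would typically solve a constrained quadratic program. The function names, the RBF kernel, and the values of `gamma` and `ridge` are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mean_matching_weights(x_train, x_test, gamma=0.5, ridge=1.0):
    """Weights beta such that (1/n) sum_i beta_i phi(x_i) approximates
    (1/m) sum_j phi(z_j) in the feature space of the kernel.

    Simplification (assumption): ridge-regularized least squares plus
    clipping at zero, instead of a constrained quadratic program.
    """
    n, m = len(x_train), len(x_test)
    K = rbf_kernel(x_train, x_train, gamma)
    kappa = (n / m) * rbf_kernel(x_train, x_test, gamma).sum(axis=1)
    beta = np.linalg.solve(K + ridge * np.eye(n), kappa)
    return np.clip(beta, 0.0, None)


# Synthetic example: the test inputs are shifted to the right of the training inputs.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(200, 1))
x_test = rng.normal(1.5, 1.0, size=(200, 1))

weights = mean_matching_weights(x_train, x_test)
plain_mean = x_train[:, 0].mean()
weighted_mean = np.average(x_train[:, 0], weights=weights)
print(plain_mean, weighted_mean)
```

On this synthetic data, training points that lie where test points are dense receive larger weights, so the weighted training mean moves toward the test mean. The resulting weights could then be passed to any learner that accepts per-sample weights.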
We propose a technique for identifying latent Web communities based solely on the hyperlink structure of the WWW, via random walks. Although the topology of the directed Web graph encodes important information about the content of individual Web pages, it also reveals useful meta-level information about user communities. Random walk models are capable of …
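As a generic illustration of a random walk on a directed link graph (not the community-identification technique itself, whose details are cut off above), the sketch below computes the stationary distribution of a walk with uniform teleportation by power iteration. The function name, the teleportation probability of 0.15, and the four-page toy graph are assumptions made for the example.

```python
import numpy as np


def stationary_distribution(adjacency, teleport=0.15, tol=1e-12, max_iter=1000):
    """Stationary distribution of a random walk on a directed graph.

    adjacency[i][j] = 1 means page i links to page j. With probability
    `teleport` the walker jumps to a uniformly random page, which keeps
    the chain irreducible (an assumption of this sketch).
    """
    a = np.asarray(adjacency, dtype=float)
    n = a.shape[0]
    out_degree = a.sum(axis=1, keepdims=True)
    # Row-normalize; pages without out-links jump uniformly at random.
    safe_degree = np.where(out_degree > 0, out_degree, 1.0)
    p = np.where(out_degree > 0, a / safe_degree, 1.0 / n)
    p = (1.0 - teleport) * p + teleport / n

    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = pi @ p
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


# Toy graph: 0 -> 1 -> 2 -> 0 forms a cycle, and page 3 links to page 0
# but receives no links itself.
links = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
pi = stationary_distribution(links)
print(pi)
```

In this toy graph the walker spends the most time on page 0, which has two incoming links, and the least on page 3, which is reached only by teleportation. Edge direction matters here: reversing the links would change the distribution.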
BACKGROUND: According to the International Multidisciplinary Classification of Lung Adenocarcinoma (LAD) published in 2011 by the International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society (IASLC/ATS/ERS), the diagnosis of LAD is changing from simple morphology into a comprehensive multidisciplinary classification. The …
Hepatocellular carcinoma (HCC) is highly resistant to chemotherapy. Previously, we have shown that Aurora-A mRNA is upregulated in HCC cells or tissues and that silencing of Aurora-A using small interfering RNA (siRNA) decreases growth and enhances apoptosis in HCC cells. However, the clinical significance of Aurora-A protein expression in HCC and association …
Real-world data often involves objects that exhibit multiple relationships; for example, 'papers' and 'authors' exhibit both paper-author interactions and paper-paper citation relationships. A typical learning problem requires one to make inferences about a subclass of objects (e.g. 'papers'), while using the remaining objects and relations to provide …