We address the problem of estimating the ratio of two probability density functions, often referred to as the importance. The importance values can be used for various downstream tasks such as covariate shift adaptation or outlier detection. In this paper, we propose a new importance estimation method that has a closed-form solution; …
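As a rough illustration of the closed-form idea, the sketch below fits the density ratio with a linear-in-parameters Gaussian-kernel model under a squared loss plus an L2 penalty, so the coefficients come from solving a single linear system. The kernel centers, bandwidth `sigma`, and regularization `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # Pairwise Gaussian kernel values: shape (len(x), len(centers)).
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_ratio(x_de, x_nu, centers, sigma=1.0, lam=1e-3):
    """Closed-form least-squares fit of r(x) ~ p_nu(x) / p_de(x).

    The ratio is modeled as r(x) = sum_l alpha_l K(x, c_l).  Minimizing the
    squared error to the true ratio (up to a constant) plus an L2 penalty
    gives alpha = (H + lam I)^{-1} h with
      H = mean over denominator samples of phi(x) phi(x)^T,
      h = mean over numerator samples of phi(x).
    """
    Phi_de = gaussian_kernel(x_de, centers, sigma)
    Phi_nu = gaussian_kernel(x_nu, centers, sigma)
    H = Phi_de.T @ Phi_de / len(x_de)
    h = Phi_nu.mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha

# Toy usage: importance weights for training data under a shifted test distribution.
rng = np.random.default_rng(0)
x_tr = rng.normal(0.0, 1.0, size=(200, 1))   # denominator (training) samples
x_te = rng.normal(0.5, 1.0, size=(200, 1))   # numerator (test) samples
ratio = fit_ratio(x_tr, x_te, centers=x_te[:50])
weights = ratio(x_tr)                        # importance weights on training points
```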
We aim to extend AdaBoost to U-Boost, within the paradigm of building a stronger classification machine from a set of weak learning machines. A geometric understanding of the Bregman divergence defined by a generic convex function U leads to the U-Boost method in the framework of information geometry, extended to the space of finite measures over …
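For orientation, here is a minimal sketch of plain AdaBoost with decision stumps, the exponential-loss special case that the U-Boost family generalizes; it is not the U-Boost algorithm itself, and the stump search and round count are arbitrary illustrative choices.

```python
import numpy as np

def adaboost_stumps(X, y, n_rounds=20):
    """Plain AdaBoost with decision stumps (labels y in {-1, +1}).

    U-Boost generalizes this exponential-loss scheme to losses induced by a
    convex function U; only the familiar AdaBoost special case is shown here.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                  # sample weights
    stumps = []
    for _ in range(n_rounds):
        best = None
        # Exhaustive search over (feature, threshold, sign) stumps.
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.where(X[:, j] > thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        beta = 0.5 * np.log((1 - err) / err)                 # weak-learner weight
        pred = s * np.where(X[:, j] > thr, 1, -1)
        w *= np.exp(-beta * y * pred)                         # reweight samples
        w /= w.sum()
        stumps.append((beta, j, thr, s))

    def predict(Xq):
        score = sum(b * s * np.where(Xq[:, j] > t, 1, -1) for b, j, t, s in stumps)
        return np.sign(score)
    return predict

# Toy usage on a linearly separable problem.
rng = np.random.default_rng(6)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
clf = adaboost_stumps(X, y, n_rounds=10)
accuracy = np.mean(clf(X) == y)
```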
In statistical pattern recognition, it is important to avoid density estimation, since density estimation is often more difficult than pattern recognition itself. Following this idea, known as Vapnik's principle, a statistical data processing framework that employs the ratio of two probability density functions has been developed recently and is gathering …
Divergence estimators based on direct approximation of density ratios, without going through separate approximation of the numerator and denominator densities, have been successfully applied to machine learning tasks that involve distribution comparison, such as outlier detection, transfer learning, and two-sample homogeneity testing. However, since density-ratio …
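As a sketch of how a direct ratio fit turns into a divergence estimate, the snippet below plugs a least-squares ratio estimate into the Pearson divergence PE(P||Q) = (1/2) E_Q[(r(x) - 1)^2]; the kernel model and hyperparameters are illustrative assumptions rather than the estimator studied in the paper.

```python
import numpy as np

def pearson_divergence(x_p, x_q, sigma=1.0, lam=1e-3):
    """Plug-in Pearson divergence estimate PE(P||Q) from samples only.

    The ratio r(x) = p(x)/q(x) is fit by the closed-form least-squares
    estimator sketched earlier and plugged into the standard approximation
      PE ~= mean_P[r(x)] - 0.5 * mean_Q[r(x)^2] - 0.5.
    """
    centers = x_p[:min(100, len(x_p))]
    def phi(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    H = phi(x_q).T @ phi(x_q) / len(x_q)
    h = phi(x_p).mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    r_p, r_q = phi(x_p) @ alpha, phi(x_q) @ alpha
    return r_p.mean() - 0.5 * (r_q ** 2).mean() - 0.5

# Two-sample use: a clearly positive value suggests the distributions differ.
rng = np.random.default_rng(1)
pe = pearson_divergence(rng.normal(1, 1, (300, 1)), rng.normal(0, 1, (300, 1)))
```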
BACKGROUND: Although microarray gene expression analysis has become popular, it remains difficult to interpret the biological changes caused by stimuli or variation of conditions. Clustering genes and associating each group with biological functions is a commonly used approach. However, such methods only detect partial changes within cell processes. Herein, we …
We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on a training set consisting only of inliers. Our key idea is to use the ratio of the training and test data densities as an outlier score. This approach is expected to perform well even in high-dimensional problems …
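The following sketch makes the outlier-score idea concrete: fit the ratio of the training (inlier) density to the test density with a simple kernel least-squares model and rank test points by the fitted ratio, so that small values flag candidate outliers. The particular ratio estimator and its hyperparameters are assumptions for illustration, not necessarily those used in the paper.

```python
import numpy as np

def outlier_scores(x_train, x_test, sigma=1.0, lam=1e-3):
    """Score test points by an estimated ratio p_train(x) / p_test(x).

    Points that are unlikely under the inlier (training) density but present
    in the test set receive a small ratio value, so low scores flag outliers.
    """
    centers = x_test[:min(100, len(x_test))]
    def phi(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    H = phi(x_test).T @ phi(x_test) / len(x_test)   # denominator: test density
    h = phi(x_train).mean(axis=0)                    # numerator: inlier density
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return phi(x_test) @ alpha                       # small value => likely outlier

rng = np.random.default_rng(2)
inliers = rng.normal(0, 1, (300, 2))
test = np.vstack([rng.normal(0, 1, (95, 2)), rng.uniform(4, 6, (5, 2))])  # 5 outliers
scores = outlier_scores(inliers, test)
suspects = np.argsort(scores)[:5]   # indices of the most outlier-like test points
```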
Mutual information is useful in various data processing tasks such as feature selection or independent component analysis. In this paper, we propose a new method of approximating mutual information based on maximum likelihood estimation of a density-ratio function. Our method, called Maximum Likelihood Mutual Information (MLMI), has several attractive …
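The sketch below shows the general recipe of estimating mutual information through the ratio of the joint density to the product of marginals; for brevity it fits that ratio with a closed-form least-squares estimator rather than the maximum-likelihood fit that defines MLMI, so it should be read only as a simplified stand-in.

```python
import numpy as np

def mi_via_ratio(x, y, sigma=1.0, lam=1e-3, seed=0):
    """Rough mutual-information estimate via a density-ratio fit.

    MI(X;Y) = E_{p(x,y)}[log p(x,y) / (p(x)p(y))].  Samples from the product
    of marginals are obtained by permuting y.  The ratio is fit with a
    closed-form least-squares estimator (MLMI itself uses a maximum-likelihood
    fit; this is only a simplified stand-in).
    """
    rng = np.random.default_rng(seed)
    joint = np.column_stack([x, y])
    product = np.column_stack([x, rng.permutation(y)])
    centers = joint[:min(100, len(joint))]
    def phi(z):
        d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    H = phi(product).T @ phi(product) / len(product)
    h = phi(joint).mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    # Guard against non-positive fitted ratio values before taking the log.
    r = np.clip(phi(joint) @ alpha, 1e-12, None)
    return float(np.mean(np.log(r)))

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = x + 0.5 * rng.normal(size=500)   # dependent pair: estimate should be clearly positive
mi = mi_via_ratio(x, y)
```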
We address the problem of estimating the ratio of two probability density functions (a.k.a. the importance). The importance values can be used for various downstream tasks such as non-stationarity adaptation or outlier detection. In this paper, we propose a new importance estimation method that has a closed-form solution; the leave-one-out cross-validation …
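To make the model-selection step concrete, the sketch below chooses the kernel bandwidth and regularization parameter by cross-validating the squared-error criterion of the ratio fit. It uses brute-force K-fold refitting; the point of the paper is that the leave-one-out score can be computed analytically, which this illustrative version does not exploit.

```python
import numpy as np

def fit_alpha(x_de, x_nu, centers, sigma, lam):
    """Closed-form least-squares ratio coefficients (same model as above)."""
    def phi(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    H = phi(x_de).T @ phi(x_de) / len(x_de)
    h = phi(x_nu).mean(axis=0)
    return np.linalg.solve(H + lam * np.eye(len(centers)), h), phi

def select_model(x_de, x_nu, centers, sigmas, lams, n_folds=5):
    """Pick (sigma, lam) by cross-validating the criterion
    J = 0.5 * E_de[r(x)^2] - E_nu[r(x)]  (smaller is better)."""
    folds_de = np.array_split(np.arange(len(x_de)), n_folds)
    folds_nu = np.array_split(np.arange(len(x_nu)), n_folds)
    best = None
    for sigma in sigmas:
        for lam in lams:
            score = 0.0
            for f_de, f_nu in zip(folds_de, folds_nu):
                tr_de = np.delete(np.arange(len(x_de)), f_de)
                tr_nu = np.delete(np.arange(len(x_nu)), f_nu)
                alpha, phi = fit_alpha(x_de[tr_de], x_nu[tr_nu], centers, sigma, lam)
                r_de, r_nu = phi(x_de[f_de]) @ alpha, phi(x_nu[f_nu]) @ alpha
                score += 0.5 * np.mean(r_de ** 2) - np.mean(r_nu)
            if best is None or score < best[0]:
                best = (score, sigma, lam)
    return best[1], best[2]

# Example (hypothetical arrays x_tr, x_te):
# sigma, lam = select_model(x_tr, x_te, centers=x_te[:50],
#                           sigmas=[0.3, 1.0, 3.0], lams=[1e-3, 1e-1])
```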
In this study, the computational properties of a kernel-based least-squares density-ratio estimator are investigated from the viewpoint of condition numbers. The condition number of the Hessian matrix of the loss function is closely related to the convergence rate of optimization and to numerical stability. We use smoothed analysis techniques and …
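Under the kernel least-squares model used in the sketches above, the Hessian of the regularized loss is H + λI with H built from the kernel design matrix, so its condition number can be inspected directly; the snippet below does that for illustrative hyperparameters (it is not the smoothed-analysis machinery of the study).

```python
import numpy as np

def hessian_condition_number(x_de, centers, sigma=1.0, lam=1e-3):
    """Condition number of the Hessian of the regularized least-squares
    density-ratio loss.  For the kernel model above the Hessian is
    H + lam * I with H = Phi_de^T Phi_de / n (positive semidefinite), so the
    condition number is (eig_max(H) + lam) / (eig_min(H) + lam).  Larger
    values mean slower optimization and worse numerical stability."""
    d2 = ((x_de[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2 * sigma ** 2))
    H = Phi.T @ Phi / len(x_de)
    eigs = np.linalg.eigvalsh(H)       # ascending order
    return (eigs[-1] + lam) / (eigs[0] + lam)

# Illustration: a narrow bandwidth with weak regularization tends to inflate it.
rng = np.random.default_rng(4)
x = rng.normal(size=(300, 1))
print(hessian_condition_number(x, x[:50], sigma=0.1, lam=1e-6))
print(hessian_condition_number(x, x[:50], sigma=1.0, lam=1e-2))
```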
The goal of regression analysis is to describe the stochastic relationship between an input vector x and a scalar output y. This can be achieved by estimating the entire conditional density p(y | x). In this letter, we present a new approach to nonparametric conditional density estimation. We develop a piecewise-linear path-following method for …
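For context, a generic nonparametric estimate of p(y | x) can be written as a Gaussian mixture in y with input-dependent weights (a Nadaraya-Watson-style construction); the sketch below implements that baseline with illustrative bandwidths and is not the piecewise-linear path-following method of the letter.

```python
import numpy as np

def conditional_density(x_train, y_train, sx=0.5, sy=0.5):
    """Nadaraya-Watson-style estimate of the conditional density p(y | x):

        p_hat(y | x) = sum_i w_i(x) * N(y; y_i, sy^2),
        w_i(x) proportional to exp(-(x - x_i)^2 / (2 sx^2)),

    i.e. a mixture of Gaussians in y whose weights depend on how close x is
    to each training input.  This is a generic nonparametric baseline.
    """
    def estimate(x, y_grid):
        wx = np.exp(-((x - x_train) ** 2) / (2 * sx ** 2))
        wx = wx / wx.sum()
        ky = np.exp(-((y_grid[:, None] - y_train[None, :]) ** 2) / (2 * sy ** 2))
        ky /= np.sqrt(2 * np.pi) * sy
        return ky @ wx          # density values of p_hat(y | x) on y_grid
    return estimate

rng = np.random.default_rng(5)
x_tr = rng.uniform(-2, 2, 400)
y_tr = np.sin(x_tr) + 0.3 * rng.normal(size=400)   # toy regression data
p = conditional_density(x_tr, y_tr)
y_grid = np.linspace(-2, 2, 200)
dens = p(0.5, y_grid)    # estimated p(y | x = 0.5) on the grid; integrates to ~1
```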