• Publications
Neighborhood preserving embedding
This paper proposes a novel subspace learning algorithm called neighborhood preserving embedding (NPE), which aims at preserving the local neighborhood structure on the data manifold and is less sensitive to outliers than principal component analysis (PCA).
Laplacian Score for Feature Selection
This paper proposes a "filter" method for feature selection that is independent of any learning algorithm, based on the observation that, in many real-world classification problems, data from the same class are often close to each other.
Semi-supervised Discriminant Analysis
  • Deng Cai, X. He, Jiawei Han
  • Mathematics, Computer Science
  • IEEE 11th International Conference on Computer…
  • 26 December 2007
This paper proposes a novel method, called Semi-supervised Discriminant Analysis (SDA), which uses both labeled and unlabeled samples to learn a discriminant function that is as smooth as possible on the data manifold.
Unsupervised feature selection for multi-cluster data
Inspired by recent developments in manifold learning and L1-regularized models for subset selection, a new approach called Multi-Cluster Feature Selection (MCFS) is proposed for unsupervised feature selection. It selects the features that best preserve the multi-cluster structure of the data.
Locality Sensitive Discriminant Analysis
A novel linear algorithm for discriminant analysis, called Locality Sensitive Discriminant Analysis (LSDA), which discovers the local manifold structure and finds a projection that maximizes the margin between data points from different classes in each local area.
Orthogonal Laplacianfaces for Face Recognition
An appearance-based face recognition method, called orthogonal Laplacianface, based on the locality preserving projection (LPP) algorithm, which aims at finding a linear approximation to the eigenfunctions of the Laplace-Beltrami operator on the face manifold.
Self-taught hashing for fast similarity search
This paper proposes a novel Self-Taught Hashing (STH) approach to semantic hashing: it first finds the optimal l-bit binary codes for all documents in the given corpus via unsupervised learning, and then trains l classifiers via supervised learning to predict the l-bit code for any previously unseen query document.
Graph Regularized Sparse Coding for Image Representation
A graph-based algorithm, called graph regularized sparse coding, is proposed to learn sparse representations that explicitly take into account the local manifold structure of the data.
VIPS: a Vision-based Page Segmentation Algorithm
An automatic top-down, tag-tree independent approach to detect web content structure that simulates how a user understands web layout structure based on his visual perception.
Document clustering using locality preserving indexing
A novel document clustering method that clusters documents into semantic classes using locality preserving indexing (LPI). LPI can be viewed as an unsupervised approximation of the supervised linear discriminant analysis (LDA) method, which gives the intuitive motivation for the method.