Nearest neighbor graphs are widely used in data mining and machine learning. A brute-force method to compute the exact kNN graph takes Θ(dn^2) time for n data points in the d-dimensional Euclidean space. We propose two divide-and-conquer methods for computing an approximate kNN graph in Θ(dn^t) time for high-dimensional data (large d). The exponent t ∈ (1, …
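As a point of reference for the Θ(dn^2) baseline mentioned above, here is a minimal sketch of a brute-force exact kNN graph in plain Python. The function name `brute_force_knn_graph` and the sample points are illustrative assumptions, not taken from the paper, and the approximate divide-and-conquer methods themselves are not shown.

```python
import math

def brute_force_knn_graph(points, k):
    """Return, for each point index i, the indices of its k nearest neighbors.

    Hypothetical helper for illustration only. Computing all pairwise
    Euclidean distances costs O(d * n^2); the full sort used here adds an
    O(n^2 log n) term that a selection step (e.g. heapq.nsmallest) would reduce.
    """
    n = len(points)
    graph = []
    for i in range(n):
        # Distance from point i to every other point: O(d * n) work.
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()
        graph.append([j for _, j in dists[:k]])
    return graph

# Small made-up data set in d = 2 dimensions.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = brute_force_knn_graph(points, 2)
print(graph)
```

The double loop over all pairs is what makes the exact method quadratic in n, which is the cost the abstract's approximate methods aim to avoid.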
The identification of genes in biomedical text typically consists of two stages: identifying gene mentions and normalization of gene names. We have created an automated process that takes the output of named entity recognition (NER) systems designed to identify genes and normalizes them to standard referents. The system identifies human gene synonyms from …
Many applications in science and engineering lead to models which require solving large-scale fixed point problems, or equivalently, systems of nonlinear equations. Several successful techniques for handling such problems are based on quasi-Newton methods that implicitly update the approximate Jacobian or inverse Jacobian to satisfy a certain secant …
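To illustrate what it means for an updated approximate Jacobian to satisfy a secant-type requirement, the sketch below applies the classical rank-one Broyden update and checks that the updated matrix maps the step s to the residual change y. Using Broyden's update here is an assumption made for illustration; the abstract is truncated and does not say which update or condition the paper studies. The name `broyden_update` and the numeric values are hypothetical.

```python
def broyden_update(B, s, y):
    """Classical Broyden rank-one update: B + (y - B s) s^T / (s^T s).

    Illustrative assumption, not necessarily the paper's method.
    B is a list of rows; s and y are lists of floats of matching length.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    ss = sum(sj * sj for sj in s)
    return [[B[i][j] + (y[i] - Bs[i]) * s[j] / ss for j in range(n)]
            for i in range(n)]

# Made-up data: current approximation B, step s, and residual change y.
B = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 2.0]
y = [3.0, -1.0]

B_new = broyden_update(B, s, y)
# After the update, B_new applied to s should reproduce y (up to rounding).
B_new_s = [sum(B_new[i][j] * s[j] for j in range(2)) for i in range(2)]
print(B_new_s)
```

The update changes B only along the direction s, which is why it can be applied implicitly without forming a new Jacobian from scratch.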
Given an n × n symmetric possibly indefinite matrix A, a modified Cholesky algorithm computes a factorization of the positive definite matrix A + E, where E is a correction matrix. Since the factorization is often used to compute a Newton-like downhill search direction for an optimization problem, the goals are to compute the modification without much …
When combined with Krylov projection methods, polynomial filtering can provide a powerful method for extracting extreme or interior eigenvalues of large sparse matrices. This general approach can be quite efficient in the situation when a large number of eigenvalues is sought. However, its competitiveness depends critically on a good implementation. This …
In the past decade, a number of nonlinear dimensionality reduction methods using an affinity graph have been developed for manifold learning. This paper explores a multilevel framework with the goal of reducing the cost of unsupervised manifold learning and preserving the embedding quality at the same time. An application to spectral clustering is also …
Two multilevel frameworks for manifold learning algorithms are discussed, both based on an affinity graph whose goal is to sketch the neighborhood of each sample point. One framework is geometric and is suitable for methods aiming to find an isometric or a conformal mapping, such as isometric feature mapping (Isomap) and semidefinite embedding (SDE). …
Dimension reduction techniques have been successfully applied to face recognition and text information retrieval. The process can be time-consuming when the data set is large. This paper presents a multilevel framework to reduce the size of the data set, prior to performing dimension reduction. The algorithm exploits nearest-neighbor graphs. It recursively …
We call a matrix triadic if it has no more than two nonzero off-diagonal elements in any column. A symmetric tridiagonal matrix is a special case. In this paper we consider LXL^T factorizations of symmetric triadic matrices, where L is unit lower triangular and X is diagonal, block diagonal with 1×1 and 2×2 blocks, or the identity with L lower triangular. …
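The definition of a triadic matrix given above can be checked directly. The following is a small sketch in plain Python; the function name `is_triadic` and the example matrices are illustrative assumptions, and the factorizations discussed in the paper are not implemented here.

```python
def is_triadic(matrix):
    """Return True if no column has more than two nonzero off-diagonal entries.

    Hypothetical helper; `matrix` is a square list of rows.
    """
    n = len(matrix)
    for j in range(n):
        off_diag_nonzeros = sum(
            1 for i in range(n) if i != j and matrix[i][j] != 0
        )
        if off_diag_nonzeros > 2:
            return False
    return True

# A symmetric tridiagonal matrix: each column has at most two
# off-diagonal nonzeros, so it is triadic.
tridiagonal = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
]

# A symmetric "arrow" matrix: the first column has three
# off-diagonal nonzeros, so it is not triadic.
arrow = [
    [4, 1, 1, 1],
    [1, 4, 0, 0],
    [1, 0, 4, 0],
    [1, 0, 0, 4],
]

print(is_triadic(tridiagonal), is_triadic(arrow))
```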