Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes.
  • 1,540
  • 114
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning
We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an n x n Gram matrix G such that computations of interest may be performed more rapidly, using O(n) additional space and time, after making two passes over the data from external storage.
  • 848
  • 73
Statistical properties of community structure in large social and information networks
A large body of work has been devoted to identifying community structure in networks.
  • 887
  • 65
Empirical comparison of algorithms for network community detection
We explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify.
  • 928
  • 61
CUR matrix decompositions for improved data analysis
We present an algorithm that preferentially chooses columns and rows that exhibit high statistical leverage on the best low-rank fit of the data matrix.
  • 549
  • 58
Relative-Error CUR Matrix Decompositions
We propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and thereby more amenable to interpretation in terms of the original data.
  • 377
  • 47
A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions
The ability of simple potential functions to reproduce accurately the density of liquid water from −37 to 100 °C at 1 to 10 000 atm has been further explored. The result is the five-site TIP5P model, ...
  • 1,649
  • 46
Randomized Algorithms for Matrices and Data
This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.
  • 661
  • 39
Revisiting the Nyström Method for Improved Large-scale Machine Learning
We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications.
  • 289
  • 39
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
In many applications, the data consist of (or may be naturally formulated as) an $m \times n$ matrix $A$.
  • 481
  • 38