A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes.

We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an n x n Gram matrix G such that computations of interest may be performed more rapidly, using O(n) additional space and time, after making two passes over the data from external storage.

We explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify.

We present an algorithm that preferentially chooses columns and rows that exhibit high statistical leverage on the best low-rank fit of the data matrix.
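
To make "high statistical leverage on the best low-rank fit" concrete, here is a minimal sketch, not the algorithm from the paper itself. It assumes the common definition of rank-k column leverage scores as the squared column norms of the top-k right singular vectors, normalized to sum to 1. The function names `leverage_scores` and `sample_columns`, the parameters `k` and `c`, and the choice to sample without replacement are illustrative assumptions.

```python
import numpy as np


def leverage_scores(A, k):
    """Normalized rank-k leverage scores for the columns of A (sum to 1).

    Assumes k <= rank(A); the scores are squared column norms of the
    top-k right singular vectors, divided by k.
    """
    # Rows of Vt are the right singular vectors, ordered by singular value.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k, :]  # shape (k, n_columns)
    return np.sum(Vk ** 2, axis=0) / k


def sample_columns(A, k, c, rng):
    """Pick c distinct column indices, favoring high-leverage columns.

    Simplification: draws without replacement, which needs at least c
    columns with nonzero leverage.
    """
    p = leverage_scores(A, k)
    idx = rng.choice(A.shape[1], size=c, replace=False, p=p)
    return np.sort(idx)
```

Calling `sample_columns(A, k, c, np.random.default_rng(0))` returns the indices of `c` actual columns of `A`, so a factorization built from them stays interpretable in terms of the original data; rows can be handled the same way by passing `A.T`.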

We propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and thereby more amenable to interpretation in terms of the original data.

The ability of simple potential functions to reproduce accurately the density of liquid water from −37 to 100 °C at 1 to 10 000 atm has been further explored. The result is the five-site TIP5P model,…

This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.

We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications.