Given a network, intuitively two nodes belong to the same role if they have similar structural behavior. Roles should be automatically determined from the data, and could be, for example, "clique-members," "periphery-nodes," etc. Roles enable numerous novel and useful network mining tasks, such as sense-making, searching for similar nodes, and node…
The low-density lipoprotein receptor mediates cholesterol homeostasis through endocytosis of lipoproteins. It discharges its ligand in the endosome at pH < 6. In the crystal structure at pH = 5.3, the ligand-binding domain (modules R2 to R7) folds back as an arc over the epidermal growth factor precursor homology domain (the modules A, B, beta propeller,…
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (such as Infinite Relational Models) have only been applied to small graphs with a couple of hundred nodes. LDA-G (short for Latent Dirichlet Allocation for Graphs)…
Given a large time-evolving graph, how can we model and characterize the temporal behaviors of individual nodes (and network states)? How can we model the behavioral transition patterns of nodes? We propose a temporal behavior model that captures the "roles" of nodes in the graph and how they evolve over time. The proposed dynamic behavioral…
Given a graph, how can we extract good features for the nodes? For example, given two large graphs from the same domain, how can we use information in one to do classification in the other (i.e., perform across-network classification or transfer learning on graphs)? Also, if one of the graphs is anonymized, how can we use information in one to de-anonymize…
We introduce a novel Bayesian framework for hybrid community discovery in graphs. Our framework, HCDF (short for Hybrid Community Discovery Framework), can effectively incorporate hints from a number of other community detection algorithms and produce results that outperform the constituent parts. We describe two HCDF-based approaches which are: (1)…
To understand the structural dynamics of a large-scale social, biological or technological network, it may be useful to discover behavioral roles representing the main connectivity patterns present over time. In this paper, we propose a scalable non-parametric approach to automatically learn the structural dynamics of the network and individual nodes. Roles…
Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security…
The Top-K problem is defined as follows. Given L lists of real numbers, find the top K scoring L-tuples. A tuple is scored by the sum of its components. Rare event modeling and event ranking are often reduced to the Top-K problem. In this paper, we present the application of a fixed-memory heuristic search algorithm (namely, SMA*) and its distributed-memory…
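The Top-K problem statement above can be made concrete with a small sketch. The code below is not the SMA* algorithm the abstract refers to, and it has no memory bound; it is a plain best-first search over a binary heap, included only to illustrate what "top K scoring L-tuples" means. The function name `top_k_tuples` and its signature are hypothetical, chosen for this illustration.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_tuples(
    lists: Sequence[Sequence[float]], k: int
) -> List[Tuple[float, Tuple[float, ...]]]:
    """Return up to k (score, tuple) pairs, best first.

    Each tuple takes one value from every input list; its score is the
    sum of its components. Illustrative best-first search, NOT SMA*.
    """
    if k <= 0 or not lists or any(len(lst) == 0 for lst in lists):
        return []

    # Sort every list in descending order so that index 0 is its best item.
    ranked = [sorted(lst, reverse=True) for lst in lists]

    # A search state is a tuple of positions, one per list.
    start = (0,) * len(ranked)
    start_score = sum(lst[0] for lst in ranked)

    # heapq is a min-heap, so scores are stored negated.
    heap = [(-start_score, start)]
    seen = {start}
    results: List[Tuple[float, Tuple[float, ...]]] = []

    while heap and len(results) < k:
        neg_score, idx = heapq.heappop(heap)
        score = -neg_score
        results.append((score, tuple(ranked[i][j] for i, j in enumerate(idx))))

        # Successors: step one position down in exactly one list.
        for i, j in enumerate(idx):
            if j + 1 < len(ranked[i]):
                nxt = idx[:i] + (j + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    nxt_score = score - ranked[i][j] + ranked[i][j + 1]
                    heapq.heappush(heap, (-nxt_score, nxt))

    return results


if __name__ == "__main__":
    for score, combo in top_k_tuples([[1.0, 5.0, 3.0], [2.0, 4.0]], k=3):
        print(score, combo)
```

Because each list is sorted in descending order, every successor scores no higher than its parent, so the heap yields tuples in non-increasing score order. The `seen` set and the heap can grow with K and L, which is the kind of cost a fixed-memory search is meant to avoid.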