• Publications
Trends in big data analytics
TLDR
We describe commonly used hardware platforms for executing analytics applications, and associated considerations of storage, processing, networking, and energy.
  • 554
  • 23
Pairwise Alignment of Protein Interaction Networks
TLDR
We propose a mathematical model that extends the concepts of match, mismatch, and gap in sequence alignment and evaluates similarity between graph structures through a scoring function that accounts for evolutionary events.
  • 269
  • 22
Isoefficiency: measuring the scalability of parallel algorithms and architectures
TLDR
Isoefficiency analysis helps us determine the best algorithm/architecture combination for a particular problem without explicitly analyzing all possible combinations under all possible conditions.
  • 345
  • 18
The MG-RAST metagenomics database and portal in 2015
TLDR
We present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.
  • 120
  • 14
Redundant reader elimination in RFID systems
TLDR
We prove that finding an optimal solution to the redundant reader problem is NP-hard and propose a randomized, distributed, and localized approximation algorithm, RRE.
  • 152
  • 12
A secure protocol for computing dot-products in clustered and distributed environments
TLDR
We present an extremely efficient and sufficiently secure protocol for computing the dot-product of two vectors using linear algebraic techniques.
  • 139
  • 11
An efficient algorithm for detecting frequent subgraphs in biological networks
TLDR
We propose an efficient algorithm that can extract, within seconds, frequently occurring patterns in metabolic pathways drawn from the KEGG database.
  • 204
  • 10
Scalable Load Balancing Techniques for Parallel Computers
TLDR
We analyze the scalability of a number of load balancing algorithms that can be applied to problems with the following characteristics: the work done by a processor can be partitioned into independent work pieces; the work pieces are of highly variable sizes; and it is not possible (or very difficult) to estimate the size of total work at a given processor.
  • 251
  • 9
Privacy Risks in Recommender Systems
TLDR
Recommender system users who rate items across disjoint domains face a privacy risk analogous to the one that occurs with statistical database queries.
  • 191
  • 9
Search with probabilistic guarantees in unstructured peer-to-peer networks
TLDR
The authors present a simple but highly effective protocol for object location that gives probabilistic guarantees of finding even rare objects, independent of the network topology.
  • 92
  • 9