ProCode: A Proactive Erasure Coding Scheme for Cloud Storage Systems
Common distributed storage systems use data replication to improve system reliability and maintain data availability, but at the cost of disk storage. In order to lower storage costs, data may…
  • 12
  • 3
Selective Term Proximity Scoring Via BP-ANN
When two terms occur together in a document, the probability of a close relationship between them and the document itself is greater if they are in nearby positions. However, ranking functions…
  • 5
  • 3
Being Accurate Is Not Enough: New Metrics for Disk Failure Prediction
Traditionally, prediction accuracy is used to evaluate disk failure prediction models. However, accuracy may not reflect their practical usage (protecting against failures, rather than…
  • 21
  • 2
GPU MrBayes V3.1: MrBayes on Graphics Processing Units for Protein Sequence Data
We present a modified GPU (graphics processing unit) version of MrBayes, called ta(MC)³ (GPU MrBayes V3.1), for Bayesian phylogenetic inference on protein data sets. Our main contributions are 1)…
  • 10
  • 2
Hard drive failure prediction using Decision Trees
This paper proposes two hard drive failure prediction models based on Decision Trees (DTs) and Gradient Boosted Regression Trees (GBRTs) which perform well in prediction performance as well as…
  • 23
  • 1
Leveraging Context-Free Grammar for Efficient Inverted Index Compression
Large-scale search engines need to answer thousands of queries per second over billions of documents, which is typically done by querying a large inverted index. Many highly optimized integer…
  • 11
  • 1
A Latin square autotopism secret sharing scheme
We present a novel secret sharing scheme where the secret is an autotopism (a symmetry) of a Latin square. Previously proposed secret sharing schemes involving Latin squares have many drawbacks: (a)…
  • 12
  • 1
Lazy exact deduplication
During data deduplication, on-disk fingerprint lookups lead to high disk traffic, resulting in a bottleneck. In this paper, we propose a “lazy” data deduplication method which buffers incoming…
  • 11
  • 1
Computing the number of h-edge spanning forests in complete bipartite graphs
  • R. Stones
  • Computer Science, Mathematics
  • Discret. Math. Theor. Comput. Sci.
  • 5 December 2014
Let $f_{m,n,h}$ be the number of spanning forests with $h$ edges in the complete bipartite graph $K_{m,n}$. Kirchhoff's Matrix Tree Theorem implies $f_{m,n,m+n-1} = m^{n-1} n^{m-1}$ when $m \geq 1$ and $n \geq 1$, since…
  • 4
  • 1
Efficient GPU-Based Query Processing with Pruned List Caching in Search Engines
There are two inherent obstacles to effectively using Graphics Processing Units (GPUs) for query processing in search engines: (a) the highly restricted GPU memory space, and (b) the CPU-GPU transfer…
  • 1
  • 1