Learn More
Algorithms for computing similarity joins in MapReduce were offered in [2]. Similarity joins ask to find input pairs that are within a certain distance d according to some distance measure. Here we explore the " anchor-points algorithm " of [2]. We continue looking at Hamming distance, and show that the method of that paper can be improved; in particular,(More)
Many approaches fashionable with technically-oriented practitioners clearly fail to satisfy the need for clarity of requirements. Some even trade short-term acceleration for long-term maintenance & support costs. What is missing? What can be done to ensure that new technologies help rather than hinder? This paper suggests some simple process improvements(More)
  • 1