Jason M. Reed

Cloud-enabled systems have become a crucial component for efficiently processing and analyzing massive amounts of data. One of the key data processing and analysis operations is the Similarity Join, which retrieves all data pairs whose distances are smaller than a predefined threshold ε. Even though multiple algorithms and implementation techniques have been…
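The Similarity Join defined above can be illustrated with a minimal sketch. It assumes a self-join over a small in-memory list of points, Euclidean distance, and a strict "smaller than ε" comparison; the function name `similarity_join` and the sample data are hypothetical. It is a brute-force, single-machine illustration of the operation's definition only, not any of the cloud-based algorithms the abstract refers to.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def similarity_join(points, eps):
    """Return all index pairs (i, j), i < j, whose distance is below eps.

    Brute force: compares every pair, so it runs in O(n^2) time.
    """
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if dist(p, q) < eps
    ]


# Hypothetical sample data: two tight clusters far apart from each other.
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0), (3.2, 4.1)]
pairs = similarity_join(points, eps=1.0)
print(pairs)
```

The quadratic cost of comparing every pair is what makes this operation expensive on massive datasets and motivates distributed implementations.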
An important recent technological development in computer science is the availability of highly distributed and scalable systems to process Big Data, i.e., datasets with high volume, velocity and variety. Given the extensive and effective use of systems incorporating Big Data in many application scenarios, these systems have become a key component in the…
In many machine learning application domains, obtaining labeled data is expensive but obtaining…