Pradap Konda

Learn More
Entity matching (EM) has been a long-standing challenge in data management. Most current EM works focus only on developing matching algorithms. We argue that far more efforts should be devoted to building EM systems. We discuss the limitations of current EM systems, then present as a solution Magellan, a new kind of EM systems. Magellan is novel in four(More)
Enterprise applications are analyzing ever larger amounts of data using advanced analytics techniques. Recent systems from Oracle, IBM, and SAP integrate R with a data processing system to support richer advanced analytics on large data. A key step in advanced analytics applications is feature selection, which is often an iterative process that involves(More)
Entity matching (EM) has been a long-standing challenge in data management. In the past few years we have started two major projects on EM (Magellan and Corleone/Falcon). These projects have raised many human-in-the-loop (HIL) challenges. In this paper we discuss these challenges. In particular, we show how these challenges forced us to revise our solution(More)
In this paper we argue that the data management community should devote far more effort to building data integration (DI) systems, in order to truly advance the field. Toward this goal, we make three contributions. First, we draw on our recent industrial experience to discuss the limitations of current DI systems. Second, we propose an agenda to build a new(More)
Entity matching (EM) has been a long-standing challenge in data management. Most current EM works, however, focus only on developing matching algorithms. We argue that far more efforts should be devoted to building EM systems. We discuss the limitations of current EM systems, then present Magellan, a new kind of EM systems that addresses these limitations.(More)
  • 1