Yongtao Ma

Instance matching and blocking, a preprocessing step used to select candidate matches, require determining the most representative attributes of instances, called keys, on which similarity computations between instances are based. We show that for the problem of learning blocking keys and key values, both generic techniques that do not exploit type …
Text-rich structured data is becoming increasingly ubiquitous on the Web and in enterprise databases, encoding heterogeneous structural information between entities such as people, locations, or organizations together with the associated textual information. For analyzing this type of data, existing topic modeling approaches, which are highly tailored toward …
To improve localization accuracy in multipath environments, this paper presents an effective localization approach that uses reference tags. In this approach, an improved k-nearest neighbor (k-NN) algorithm based on radio-frequency (RF) phases is proposed. The traditional k-NN algorithm only focuses on the weighting factors of the …
Linked Data consists of billions of RDF triples from hundreds of different sources on the Web. The effective construction and maintenance of links between these sources largely depend on data integration solutions that scale to the large volume and heterogeneity of the Linked Data Web. In this context, a promising direction is the pay-as-you-go paradigm …