Decision-Making Bias in Instance Matching Model Selection

Abstract

Instance matching has emerged as an important problem in the Semantic Web, with machine learning methods proving especially effective. To enhance performance, task-specific knowledge is typically used to introduce bias into the model selection problem. Practitioners tend to exploit such biases in a piecemeal fashion. This paper introduces a framework in which the model selection design process is represented as a factor graph. Nodes in this bipartite graphical model represent opportunities for explicitly introducing bias. The graph is first used to unify and visualize common biases in the design of existing instance matchers. As a direct application, we then use the graph to hypothesize about potential unexploited biases. The hypotheses are evaluated by training 1032 neural networks on three instance matching tasks on Microsoft Azure’s cloud-based platform. An analysis of 25 GB of experimental data indicates that the proposed biases can improve efficiency by more than 65% relative to a baseline configuration, with effectiveness improving by a smaller margin. The findings lead to a promising set of four recommendations that can be integrated into existing supervised instance matchers.

DOI: 10.1007/978-3-319-25007-6_23

Cite this paper

@inproceedings{Kejriwal2015DecisionMakingBI,
  title={Decision-Making Bias in Instance Matching Model Selection},
  author={Mayank Kejriwal and Daniel P. Miranker},
  booktitle={International Semantic Web Conference},
  year={2015}
}