Learning First-Order Horn Clauses from Web Text
Even the entire Web corpus does not explicitly answer all questions, yet inference can uncover many implicit answers. But where do inference rules come from? This paper investigates the problem of learning inference rules from Web text in an un-supervised, domain-independent manner. The SHERLOCK system, described herein, is a first-order learner that acquires over 30,000 Horn clauses from Web text. SHERLOCK embodies several innovations, including a novel rule scoring function based on Statistical Relevance (Salmon et al., 1971) which is effective on ambiguous, noisy and incomplete Web extractions. Our experiments show that inference over the learned rules discovers three times as many facts (at precision 0.8) as the TEXTRUNNER system which merely extracts facts explicitly stated in Web text.