Identifying Relations for Open Information Extraction

Abstract

Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-of-the-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce two simple syntactic and lexical constraints on binary relations expressed by verbs. We implemented the constraints in the REVERB Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TEXTRUNNER and WOE^pos. More than 30% of REVERB's extractions are at precision 0.8 or higher, compared to virtually none for earlier systems. The paper concludes with a detailed analysis of REVERB's errors, suggesting directions for future work.
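
The abstract's headline metric, area under the precision-recall curve, can be illustrated with a minimal sketch. It assumes each extraction carries a confidence score and a human correctness label, and it integrates with the trapezoidal rule; the abstract does not state the exact evaluation procedure, so the function names, the input format, and the integration method are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def precision_recall_points(
    scored: List[Tuple[float, bool]], total_correct: int
) -> List[Tuple[float, float]]:
    """Return (recall, precision) after each extraction, ranked by confidence.

    `scored` holds (confidence, is_correct) pairs; `total_correct` is the
    number of correct extractions used as the recall denominator.
    Both are assumptions of this sketch, not details from the paper.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    points = []
    correct_so_far = 0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            correct_so_far += 1
        points.append((correct_so_far / total_correct, correct_so_far / rank))
    return points


def area_under_pr_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under a list of (recall, precision) points."""
    if not points:
        return 0.0
    area = 0.0
    # Start at recall 0 with the precision of the top-ranked extraction.
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2.0
        prev_recall, prev_precision = recall, precision
    return area


# Toy data: four extractions, three of them labeled correct.
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
area = area_under_pr_curve(precision_recall_points(toy, total_correct=3))
```

Under this reading, "more than doubles the area" means the value returned for one system's ranked output is more than twice the value returned for another's on the same labeled test set.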

[Figure: Citations per Year, 2012-2017]

636 Citations

Semantic Scholar estimates that this publication has received between 550 and 740 citations based on the available data.
