Leveraging Giant Text Corpora to Enhance the Coverage of Pattern-based Information Extraction Systems

  • Published 2008


Pattern-based approaches for Information Extraction typically apply a pattern learner to a set of domain-specific documents to generate extraction patterns that comprise the IE system. This limits the coverage of the system to the expressions and language constructs used within the training data. This research exploits the vast quantities of text readily…

