Evaluation of open information extraction methods using Reuters-21578 database

J. M. Rodríguez, H. Merlino, Patricia Pesado, Ramón García-Martínez. ICMLSC '18.
This article reports the precision, recall and F1-measure of three knowledge extraction methods under the Open Information Extraction paradigm: ReVerb, OLLIE and ClausIE. To calculate these three measures, a representative sample of Reuters-21578 was used: 103 newswire texts drawn at random from that database. Analysis of the results revealed a large discrepancy between the expected and the observed precision for ClausIE. In order to…
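The three measures named in the abstract can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the `ExtractionCounts` type, its field names, and the numbers in the usage note are assumptions made for the example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ExtractionCounts:
    """Hypothetical tally for one extractor over an annotated sample."""
    correct: int    # extractions judged correct by an annotator
    extracted: int  # all extractions the system produced
    expected: int   # all extractions present in the gold annotation


def precision_recall_f1(counts: ExtractionCounts) -> tuple[float, float, float]:
    """Return (precision, recall, F1) using the standard definitions."""
    # Precision: share of produced extractions that are correct.
    precision = counts.correct / counts.extracted if counts.extracted else 0.0
    # Recall: share of expected extractions that were found.
    recall = counts.correct / counts.expected if counts.expected else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

As a purely illustrative usage, `precision_recall_f1(ExtractionCounts(correct=30, extracted=40, expected=60))` gives a precision of 0.75, a recall of 0.5 and an F1 of 0.6.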
ATP-OIE: An Autonomous Open Information Extraction Method
This paper describes an innovative Open Information Extraction method known as ATP-OIE. It uses extraction patterns to find semantic relations. These patterns are generated automatically from…
An Approach of Web Scraping on News Website based on Regular Expression
It is found that this approach is a simple and straightforward way to extract news articles, each consisting of a title, publication date, author, article text, and the URL of the article.


Performance Evaluation of Knowledge Extraction Methods
The precision, recall and F-measure are shown for three knowledge extraction methods under the Open Information Extraction paradigm: ReVerb, OLLIE and ClausIE.
Identifying Relations for Open Information Extraction
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOE^pos.
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
ClausIE: clause-based open information extraction
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
Extracting information networks from the blogosphere
A new term-weighting scheme is proposed that significantly improves on the state of the art in the task of relation extraction, both when used in conjunction with the standard tf·idf scheme and when used as a pruning filter.
Open Information Extraction from the Web
Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Unsupervised named-entity extraction from the Web: An experimental study
An overview of KnowItAll's novel architecture and design principles is presented, emphasizing its distinctive ability to extract information without any hand-labeled training examples, and three distinct ways to address this challenge are presented and evaluated.
Open Information Extraction Using Wikipedia
WOE is presented, an open IE system that improves dramatically on TextRunner's precision and recall. It relies on a novel form of self-supervised learning for open extractors, using heuristic matches between Wikipedia infobox attribute values and corresponding sentences to construct training data.
An analysis of open information extraction based on semantic role labeling
This work investigates the use of semantic role labeling techniques for the task of Open IE and compares SRL-based open extractors with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics.
ReNoun: Fact Extraction for Nominal Attributes
ReNoun is described, an open information extraction system that complements previous efforts by focusing on nominal attributes and on the long tail. Experiments show that it is possible to extract facts with high precision, including for attributes that cannot be extracted with verb-based techniques.