Performance Evaluation of Knowledge Extraction Methods

J. M. Rodríguez, H. Merlino, Patricia Pesado, Ramón García-Martínez
This paper reports the precision, recall and F-measure of three knowledge extraction methods that follow the Open Information Extraction paradigm: ReVerb, OLLIE and ClausIE. To obtain these measures, a subset of 55 newswires was taken from the Reuters-21578 text categorization test collection. Relations were extracted by hand from each of these newswires.
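As an illustration of how the three measures relate, the following minimal sketch compares a set of automatically extracted triples against a hand-made reference set. The exact-match criterion, the `normalize` and `evaluate` helpers, and the toy triples are all assumptions made for this example; the abstract does not state how extractions were matched against the manual annotation.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (argument 1, relation phrase, argument 2)


def normalize(triple: Triple) -> Triple:
    """Lower-case and collapse whitespace so trivial differences do not count as errors."""
    return tuple(" ".join(part.lower().split()) for part in triple)


def evaluate(extracted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact matching of normalized triples.

    Exact matching is an assumption of this sketch, not the paper's stated criterion.
    """
    extracted_set: Set[Triple] = {normalize(t) for t in extracted}
    gold_set: Set[Triple] = {normalize(t) for t in gold}
    correct = len(extracted_set & gold_set)
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up toy data, not taken from the paper or from Reuters-21578.
gold = [("Acme Corp", "acquired", "Widget Inc"), ("Acme Corp", "is based in", "Boston")]
extracted = [("acme corp", "acquired", "Widget  Inc"), ("Acme Corp", "said", "on Monday")]

precision, recall, f1 = evaluate(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Here one of the two extracted triples matches the reference set and one of the two reference triples is recovered, so all three measures come out at 0.5. An extractor that emits fewer but more reliable triples would raise precision at the cost of recall, which is why the F-measure is reported alongside both.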
Evaluation of open information extraction methods using Reuters-21578 database
Although the correction improved the precision of ClausIE, ReVerb turned out to be the most precise method; however, ClausIE is the one with the best F1-measure.
DptOIE: a portuguese Open Information Extraction system based on dependency analysis
The DptOIE method defines a new set of hand-crafted rules and explores sentences through dependency analysis using a depth-first search (DFS) approach; it is reported to outperform other methods for extracting facts in OIE for the Portuguese language.
Automatic Characteristics Extraction for Sentiment Analysis Tasks
The results show that ClausIE can be used to extract characteristics in a semi-automatic way, but it requires minimal manual intervention, which is explained in the results section.
A systematic mapping study on open information extraction
A review of the Open IE literature by means of a systematic mapping study, which retrieved 2484 articles about Open IE from the Science Direct, IEEE Xplore, ACM Digital Library, Scopus and Google Scholar databases and identified significant gaps that could be addressed in future work.
Scalable Distributed Semantic Network for knowledge management in cyber physical system
A new scalable model for heterogeneous data representation that can extract more semantic information from different data sources, named Distributed Semantic Network (DSN), is proposed; experimental results show that DSN can better model the semantic information in text.
Automatización de la extracción de características en tareas de análisis de sentimiento
Abstract. This article proposes using a knowledge extraction method for the Web (OIE), in particular ClausIE, to obtain film characteristics in a…
Clasificación de distintos conjuntos de datos utilizados en evaluación de métodos de extracción de conocimiento creados para la web
Abstract. Several articles have used different test texts as input data to measure the performance of semantic relation extraction methods for the Web (OIE): ReVerb…


Identifying Relations for Open Information Extraction
Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOE^pos.
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary…
Open Information Extraction from the Web
Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Open Information Extraction Using Wikipedia
WOE is presented, an open IE system which improves dramatically on TextRunner's precision and recall and uses a novel form of self-supervised learning for open extractors: heuristic matches between Wikipedia infobox attribute values and corresponding sentences are used to construct training data.
Unsupervised named-entity extraction from the Web: An experimental study
An overview of KnowItAll's novel architecture and design principles is presented, emphasizing its distinctive ability to extract information without any hand-labeled training examples, and three distinct ways to address this challenge are presented and evaluated.
Knowledge discovery for knowledge based systems. Some experimental results
This paper addresses some considerations based on the state of the involved technologies for the integration of knowledge discovery systems and knowledge based systems centered in automatic knowledge…
An analysis of open information extraction based on semantic role labeling
This work investigates the use of semantic role labeling techniques for the task of Open IE and compares SRL-based open extractors with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics.
ReNoun: Fact Extraction for Nominal Attributes
ReNoun is described, an open information extraction system that complements previous efforts by focusing on nominal attributes and on the long tail, and experiments are described that show that it is possible to extract facts with high precision and for attributes that cannot be extracted with verb-based techniques.
ClausIE: clause-based open information extraction
ClausIE is a novel, clause-based approach to open information extraction, which extracts relations and their arguments from natural language text using a small set of domain-independent lexica, operates sentence by sentence without any post-processing, and requires no training data.
The Triplex Approach for Recognizing Semantic Relations from Noun Phrases, Appositions, and Adjectives
This work proposes Triplex, an information extractor that complements previous efforts, concentrating on noun-mediated triples related to nouns, adjectives, and appositions, and reports on an automatic evaluation method to examine the output of information extractors both with and without the Triplex approach.