Data Set Used
MOTIVATION Protein-protein interactions (PPIs) play an important role in understanding biological processes. Although recent research in text mining has achieved a significant progress in automatic PPI extraction from literature, performance of existing systems still needs to be improved. RESULTS In this study, we propose a novel algorithm for extracting… (More)
This paper discusses the problem of marrying structural similarity with semantic relat-edness for Information Extraction from text. Aiming at accurate recognition of relations, we introduce local alignment kernels and explore various possibilities of using them for this task. We give a definition of a local alignment (LA) kernel based on the Smith-Waterman… (More)
We introduce a new semantic annotation scheme for the Recognizing Textual Entailment (RTE) dataset as well as a manually annotated dataset that uses this scheme. The scheme addresses three types of modification that license entailment patterns: restrictive, appositive and conjunctive, with a formal semantic specification of these patterns' contribution for… (More)
We discuss four linguistic ontology-mapping techniques and evaluate them on real-life ontologies in the domain of food. Furthermore we propose a method to combine ontology-mapping techniques with high Precision and Recall to reduce the necessary amount of manual labor and computation.
This paper discusses local alignment kernels in the context of the relation extraction task. We define a local alignment kernel based on the Smith-Waterman measure as a sequence similarity metric and proceed with a range of possibilities for computing a similarity between elements of sequences. We propose to use distri-butional similarity measures on… (More)
This paper presents an approach to detection of the semantic types of relation arguments employing the WordNet hierarchy. Using the SemEval-2007 data, we show that the method allows to generalize relation arguments with high precision for such generic relations as Origin-Entity, Content-Container, Instrument-Agency and some other.
This paper revisits a problem of the evaluation of computational grammatical inference (GI) systems and discusses what role complexity measures can play for the assessment of GI. We provide a motivation for using the Rademacher complexity and give an example showing how this complexity measure can be used in practice.
The TREC Genomics 2007 task included recognizing topic-specific entities in the returned passages. To address this task, we have designed and implemented a novel data-driven approach by combining information extraction with language modeling techniques. Instead of using an exhaustive list of all possible instances for an entity type, we look at the language… (More)