Sophia Katrenko

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used, and the results varied, with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all …
MOTIVATION: Protein-protein interactions (PPIs) play an important role in understanding biological processes. Although recent research in text mining has achieved significant progress in automatic PPI extraction from the literature, the performance of existing systems still needs to be improved. RESULTS: In this study, we propose a novel algorithm for extracting …
In this paper, we describe our approach to the Genic Interaction Extraction Challenge. Our solution combines several elements: 1) a domain theory about the interaction between language, semantics and syntax, 2) a biological ontology identifying amongst other things biomolecular entities and directed interaction verbs in the lexicon, 3) the notion of …
We introduce a new semantic annotation scheme for the Recognizing Textual Entailment (RTE) dataset as well as a manually annotated dataset that uses this scheme. The scheme addresses three types of modification that license entailment patterns: restrictive, appositive and conjunctive, with a formal semantic specification of these patterns’ contribution for …
Automatic acquisition of qualia structures is one of the directions in information extraction that has received great attention lately. We consider such information as a possible input for word-space models and investigate its impact on the categorization task. We show that the results of the categorization are mostly influenced by the formal role …
Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web …