Learn More
In this paper, we propose a multi-criteria based active learning approach and effectively apply it to named entity recognition. Active learning targets to minimize the human annotation efforts by selecting examples for labeling. To maximize the contribution of the selected examples, we consider the multiple criteria: informativeness, representativeness and(More)
MOTIVATION With an overwhelming amount of textual information in molecular biology and biomedicine, there is a need for effective and efficient literature mining and knowledge discovery that can help biologists to gather and make use of the knowledge encoded in text documents. In order to make organized and structured information available, automatically(More)
Artemisinins, derived from the wormwood herb Artemisia annua, are the most potent antimalarial drugs currently available. Despite extensive research, the exact mode of action of artemisinins has not been established. Here we use yeast, Saccharamyces cerevisiae, to probe the core working mechanism of this class of antimalarial agents. We demonstrate that(More)
This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a(More)
Shallow semantic parsing, the automatic identification and labeling of sentential constituents, has recently received much attention. Our work examines whether semantic role information is beneficial to question answering. We introduce a general framework for answer extraction which exploits semantic role annotations in the FrameNet paradigm. We view(More)
In this paper, we explore how to adapt a general Hidden Markov Model-based named entity recognizer effectively to biomedical domain. We integrate various features, including simple deterministic features, morphological features, POS features and semantic trigger features, to capture various evidences especially for biomedical named entity and evaluate their(More)
Information retrieval has become more and more important due to the rapid growth of all kinds of information. However, there are few suitable systems available. This paper presents a few approaches that enable large-scale information retrieval for the TELLTALE system. TELLTALE is a dynamic hypertext information retrieval environment. It provides full-text(More)
Endogenous non-coding small RNAs (21-24 nt) play an important role in post-transcriptional gene regulation in plants. Domestication selection is the most important evolutionary force in shaping crop genomes. The extent of polymorphism at small RNA loci in domesticated rice and whether small RNA loci are targets of domestication selection have not yet been(More)
The purpose of this research is to enhance an HMM-based named entity recognizer in the biomedical domain. First, we analyze the characteristics of biomedical named entities. Then, we propose a rich set of features, including orthographic, morphological, part-of-speech, and semantic trigger features. All these features are integrated via a Hidden Markov(More)