Inspired by previous preprocessing approaches to SMT, this paper proposes a novel, probabilistic approach to reordering that combines the merits of syntax and phrase-based SMT. Given a source sentence and its parse tree, our method generates, by tree operations, an n-best list of reordered inputs, which are then fed to a standard phrase-based decoder to...
This paper proposes a method that automatically generates questions from queries for community-based question answering (cQA) services. Our query-to-question generation model is built upon templates induced from search engine query logs. Specifically, we first extract pairs of queries and user-clicked questions from query logs, with which we induce question...
BACKGROUND: Recognition of binding sites in proteins is a direct computational approach to the characterization of proteins in terms of biological and biochemical function. Residue preferences have been widely used in many studies, but the results are often not satisfactory. Although different amino acid compositions among the interaction sites of different...
Ecological psychology has much to contribute as a theory of design for instructional and learning systems. With its roots in the psychology of James Gibson (1986), present-day ecological psychology provides a unique understanding of how students think and learn and, further, how technology can enhance thinking and learning. This paper explores ecological...
Semantic similarity is a fundamental concept that is widely researched and used in the field of natural language processing. However, methodologies for measuring semantic similarity are language-dependent. This paper presents a system similarity based measure of semantic similarity for Chinese words from HowNet, an online bilingual (Chinese-English) common...
Biological named entity recognition is a critical task for automatically mining knowledge from biological literature. In this paper, the task is cast as a sequential labeling problem, and a Conditional Random Fields model is introduced to solve it. Under the framework of the Conditional Random Fields model, rich features including literal, context and semantics...
The imbalanced sentiment distribution of microblogs leads to poor performance of binary classifiers on the minority class. To address this problem, we present a semi-supervised method for sentiment classification of Chinese microblogs. This method is similar to self-training, except that a set of labeled samples is reserved for a confidence scores computing...