For the task of recognizing dialogue acts, we are applying the Transformation-Based Learning (TBL) machine learning algorithm. To circumvent a sparse data problem, we extract values of well-motivated features of utterances, such as speaker direction, punctuation marks, and a new feature, called dialogue act cues, which we find to be more effective than cue...
We present an empirical investigation of various ways to automatically identify phrases in a tagged corpus that are useful for dialogue act tagging. We found that a new method (which measures a phrase’s deviation from an optimally-predictive phrase), enhanced with a lexical filtering mechanism, produces significantly better cues than manually-selected cue...
A key aspect of any data integration endeavor is establishing a transformation that translates instances of one or more source schemata into instances of a target schema. This schema integration task must be tackled regardless of the integration architecture or mapping formalism. In this paper we provide a task model for schema integration. We use this...
Interest in information extraction from the biomedical literature is motivated by the need to speed up the creation of structured databases representing the latest scientific knowledge about specific objects, such as proteins and genes. This paper addresses the issue of a lack of standard definition of the problem of protein name tagging. We describe the...
Nef is a membrane-associated cytoplasmic phosphoprotein that is well conserved among the different human (HIV-1 and HIV-2) and simian immunodeficiency viruses and has important roles in down-regulating the CD4 receptor and modulating T-cell signaling pathways. The ability to modulate T-cell signaling pathways suggests that Nef may physically interact with...
We have determined the complete nucleotide sequence of human cellular c-myc, which is homologous to the transforming gene, v-myc, of myelocytomatosis virus MC29. Analysis of the genetic information and alignment with the known sequence of chicken c-myc and v-myc indicates: (i) An intervening sequence can be identified by consensus splice signals. The unique...
We have examined human T-lymphotropic virus type I (HTLV-I) gene expression in the human T-cell line, C8166-45 (C81), as a model to define the gene products expressed from defective proviruses. C81 cells contain one complete and two different deleted proviral genomes. The internal deletions of the latter encompass most of the gag to env region. All three...
To interpret natural language at the discourse level, it is very useful to accurately recognize dialogue acts, such as SUGGEST, in identifying speaker intentions. Our research explores the utility of a machine learning method called Transformation-Based Learning (TBL) in computing dialogue acts, because TBL has a number of advantages over alternative...
We introduce a significant improvement for a relatively new machine learning method called Transformation-Based Learning. By applying a Monte Carlo strategy to randomly sample from the space of rules, rather than exhaustively analyzing all possible rules, we drastically reduce the memory and time costs of the algorithm, without compromising accuracy on...
Submitted 12 May 2006; revised 24 Sep 2007; accepted 18 October 2007. Abstract. We are researching the interaction between the rule and the ontology layers of the Semantic Web, by comparing two options: 1) using OWL and its rule extension SWRL to develop an integrated ontology/rule language, and 2) layering rules on top of an ontology with RuleML and OWL...