Aynur A. Dayanik

Supervised learning approaches to text classification are in practice often required to work with small and unsystematically collected training sets. The alternative to supervised learning is usually viewed to be building classifiers by hand, using a domain expert's understanding of which features of the text are related to the class of interest. This is …
This report describes DIMACS work on the text categorization task of the TREC 2005 Genomics track. Our approach to this task was similar to the triage subtask studied in the TREC 2004 Genomics track. We applied Bayesian logistic regression and achieved good effectiveness on all categories. The Mouse Genome Informatics (MGI) project of the Jackson …
Consider a supervised learning problem in which examples contain both numerical- and text-valued features. To use traditional feature-vector-based learning methods, one could treat the presence or absence of a word as a Boolean feature and use these binary-valued features together with the numerical features. However, the use of a text-classification system …
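
The Boolean word-presence encoding mentioned in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the whitespace tokenization, and the toy examples are all hypothetical.

```python
from typing import Iterable, List, Sequence


def build_vocabulary(texts: Iterable[str]) -> List[str]:
    """Collect the sorted set of lowercase, whitespace-separated tokens.

    Assumption: simple whitespace tokenization; a real system would
    likely use a proper tokenizer.
    """
    vocab = set()
    for text in texts:
        vocab.update(text.lower().split())
    return sorted(vocab)


def to_feature_vector(
    numeric: Sequence[float], text: str, vocab: Sequence[str]
) -> List[float]:
    """Concatenate numeric features with one 0/1 feature per vocabulary word."""
    present = set(text.lower().split())
    boolean = [1.0 if word in present else 0.0 for word in vocab]
    return list(numeric) + boolean


# Toy examples (hypothetical): two numeric features plus one text field each.
examples = [
    ([3.5, 120.0], "mouse gene expression"),
    ([1.0, 80.0], "gene knockout"),
]

vocab = build_vocabulary(text for _, text in examples)
vectors = [to_feature_vector(nums, text, vocab) for nums, text in examples]
```

Each resulting vector holds the numeric features first, followed by one binary entry per vocabulary word, so it can be passed to any feature-vector-based learner.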
DIMACS participated in the text categorization and ad hoc retrieval tasks of the TREC 2004 Genomics track. For the categorization task, we tackled the triage and annotation hierarchy subtasks. The Mouse Genome Informatics (MGI) project of the Jackson Laboratory provides data on the genetics, genomics, and biology of the laboratory mouse. In particular, …