Credibility is a perceived quality and is evaluated with at least two major components: trustworthiness and expertise. Weblogs (or blogs) are a potentially fruitful genre for exploration of credibility assessment due to public disclosure of information that might reveal trustworthiness and expertise by webloggers (or bloggers) and availability of audience…
In this paper, we describe the most recent implementation and evaluation of the proper noun categorization and standardization module of the DR-LINK document detection system being developed at Syracuse University, under the auspices of ARPA's TIPSTER program. We also discuss the expansion of group common nouns and group proper nouns to enhance retrieval…
This paper describes a machine learning technique called hierarchical text categorization, which is used to solve the problem of finding equivalents among different state and national education standards. The approach is based on a set of manually aligned standards and utilizes the hierarchical structure present in the standards to…
Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took place at a recent workshop. The attendees of the workshop considered information retrieval research in a…
This chapter presents a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is in a proposed categorization model and analytical framework for certainty identification. Certainty is presented as a type of subjective information available in texts. …
We have developed MetaExtract, a system to automatically assign Dublin Core + GEM metadata using extraction techniques from our natural language processing research. MetaExtract comprises three distinct processes: eQuery and HTML-based Extraction modules and a Keyword Generator module. We conducted a Web-based survey to have users evaluate each…
This poster reports on a project in which we are investigating methods for breaking the human metadata-generation bottleneck that plagues digital libraries. The research question is whether metadata elements and values can be automatically generated from the content of educational resources, and correctly assigned to mathematics and science educational…
The text categorization module described here provides a front-end filtering function for the larger DR-LINK text retrieval system [Liddy and Myaeng 1993]. The model evaluates a large incoming stream of documents to determine which documents are sufficiently similar to a profile at the broad subject level to warrant more refined representation and…