This paper gives an overview of the Opinion Analysis Pilot Task from 2006 to 2007 at the Sixth NTCIR Workshop. We created a test collection for 32, 30, and 28 topics (11,907, 15,279, and 8,379 sentences) in Chinese, Japanese, and English, respectively. Using this test collection, we conducted an opinion extraction subtask. The subtask was defined from four perspectives: …
This paper gives an overview of the Multilin[…] Traditional Chinese, and Simplified Chinese. Using this test collection, we conducted five subtasks: (1) mandatory opinionated sentence judgment, and optional subtasks of (2) relevant sentence judgment, (3) polarity judgment, (4) opinion holder extraction, and (5) opinion target extraction. 32 results were …
Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present Columbia's Newsblaster system for online news summarization. …
We present the new multilingual version of the Columbia Newsblaster news summarization system. The system addresses the problem of giving users access to news in multiple languages from multiple sites on the internet. The system automatically collects, organizes, and summarizes news in multiple source languages, allowing the user to browse news topics …
The potential of automatically generated indexes for information access has been recognized for several decades (e.g., Bush 1945 [2], Edmundson and Wyllys 1961 [4]), but the quantity of text and the ambiguity of natural language processing have made progress at this task more difficult than was originally foreseen. Recently, a body of work on development …
Columbia's Newsblaster tracking and summarization system is a robust system that clusters news into events, categorizes events into broad topics, and summarizes multiple articles on each event. Here we outline our most current work on tracking events over days, producing summaries that update a user on new information about an event, outlining the …
We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a …
We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated …