This paper gives an overview of the Opinion Analysis Pilot Task, held from 2006 to 2007 at the Sixth NTCIR Workshop. We created a test collection of 32, 30, and 28 topics (11,907, 15,279, and 8,379 sentences) in Chinese, Japanese, and English, respectively. Using this test collection, we conducted an opinion extraction subtask. The subtask was defined from four perspectives: ...
This paper describes an overview of the Multilin-Traditional Chinese, and Simplified Chinese. Using this test collection, we conducted five subtasks: (1) the mandatory opinionated sentence judgment, and the optional subtasks of (2) relevant sentence judgment, (3) polarity judgment, (4) opinion holder extraction, and (5) opinion target extraction. 32 results were ...
The potential of automatically generated indexes for information access has been recognized for several decades (e.g., Bush 1945 [2], Edmundson and Wyllys 1961 [4]), but the quantity of text and the ambiguity of natural language processing have made progress at this task more difficult than was originally foreseen. Recently, a body of work on development ...
We present the new multilingual version of the Columbia Newsblaster news summarization system. The system addresses the problem of giving users access to news from multiple languages and multiple sites on the internet. The system automatically collects, organizes, and summarizes news in multiple source languages, allowing the user to browse news topics ...
Columbia's Newsblaster tracking and summarization system is a robust system that clusters news into events, categorizes events into broad topics, and summarizes multiple articles on each event. Here we outline our most current work on tracking events over days, producing summaries that update a user on new information about an event, outlining the ...
We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a ...
We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated ...
In this paper I present a system for automatic opinion analysis built in a short time frame using freely available open-source processing tools and lexical resources available from prior research. I use a simple feature set that is largely language independent and a freely available machine-learning framework to model the subtasks as classification problems ...
In this paper we introduce the NTCIR6 Opinion Analysis Pilot Task, information about the Chinese, Japanese, and English data, plans for future opinion analysis tasks at NTCIR, and a brief overview of the evaluation results. This pilot task is a sentence-level opinion identification and polarity detection task run over data from a comparable corpus in three ...