June-Jei Kuo

Generating a summary from multiple documents raises issues beyond those of single-document summarization, including sentence selection, sentence ordering, and sentence reduction. In addition, temporal resolution among the extracted sentences is important. This article considers informative words and event words to deal with multidocument summarization. …
This paper proposes a personal news secretariat that helps on-line readers absorb news information from multiple sources. Such a news secretariat eliminates redundant information in the news and reorganizes the news for readers. This multiple-document summarization employs named entities and other signatures to cluster the news stream; employs …
Measuring the similarity of words, sentences, and documents is one of the major issues in multilingual multi-document summarization. This paper presents five strategies to compute multilingual sentence similarity. The experimental results show that sentence alignment without considering the word position or order in a sentence obtains the best …
Unifying terminology usage in a way that captures more term semantics is useful for event clustering. This paper proposes a metric of normalized chain edit distance to mine controlled vocabulary from cross-document co-reference chains incrementally. A novel threshold model that incorporates a time decay function and a spanning window utilizes the controlled …
Event clustering on streaming news aims to group documents by event automatically. This paper employs co-reference chains to extract the most representative sentences, and then uses them to select the most informative features for clustering. Because events span a long time, a fixed-threshold approach prevents later documents from being clustered and thus …
To reduce both the text size and the information loss during summarization, a multi-document summarization system using informative words is proposed. This paper describes the procedure for extracting informative words from multiple documents and generating summaries. First, a small-scale experiment with 12 events and 60 questions was conducted. The results …