June-Jei Kuo

This article proposes a summarization system for multiple documents. It uses named entities and other signatures to cluster news from different sources, and uses punctuation marks, linking elements, and topic chains to identify meaningful units (MUs). Using nouns and verbs to identify similar MUs, focusing and browsing models …
Measuring the similarity of words, sentences, and documents is one of the major issues in multilingual multi-document summarization. This paper presents five strategies for computing multilingual sentence similarity. The experimental results show that sentence alignment without considering the word position or order in a sentence obtains the best …
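As a rough illustration of what an order-insensitive sentence comparison can look like, the sketch below scores two tokenized sentences by multiset overlap (a Dice coefficient). It is not one of the five strategies from the paper, whose details are not given here; the function name `order_free_similarity` is hypothetical, and the sketch assumes both sentences are already tokenized and mapped into a common language.

```python
from collections import Counter


def order_free_similarity(sent_a, sent_b):
    """Dice overlap of two token lists, ignoring word position and order.

    Assumption: tokens are already normalized and, for a multilingual
    pair, already translated into one shared vocabulary.
    """
    bag_a, bag_b = Counter(sent_a), Counter(sent_b)
    if not bag_a or not bag_b:
        return 0.0
    # Multiset intersection keeps the smaller count of each shared token.
    shared = sum((bag_a & bag_b).values())
    return 2.0 * shared / (sum(bag_a.values()) + sum(bag_b.values()))
```

Because only token counts are compared, any reordering of the same words yields a score of 1.0, which is the property the abstract singles out.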
Event clustering on streaming news aims to group documents by event automatically. This paper employs co-reference chains to extract the most representative sentences, and then uses them to select the most informative features for clustering. Because events span long periods, a fixed-threshold approach prevents later documents from being clustered and thus …
Unifying terminology usage so that it captures more term semantics is useful for event clustering. This paper proposes a metric, normalized chain edit distance, to incrementally mine controlled vocabulary from cross-document co-reference chains. A novel threshold model that incorporates a time decay function and a spanning window utilizes the controlled …
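The abstract does not define the metric or the threshold model, so the following is only a sketch under stated assumptions: a co-reference chain is treated as a list of mention strings, the distance is a standard Levenshtein distance over mentions divided by the longer chain length, and the threshold decays exponentially with document age inside a fixed window. The names `normalized_chain_distance` and `decayed_threshold`, and the parameters `half_life` and `window`, are hypothetical.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_chain_distance(chain_a, chain_b):
    """Edit distance over mentions, scaled to [0, 1] by the longer chain.

    Assumption: this normalization is illustrative, not the paper's.
    """
    longest = max(len(chain_a), len(chain_b))
    if longest == 0:
        return 0.0
    return edit_distance(chain_a, chain_b) / longest


def decayed_threshold(base, days_apart, half_life=7.0, window=30.0):
    """Similarity threshold that relaxes as documents grow further apart.

    Assumption: exponential decay with a hypothetical half-life; outside
    the spanning window nothing can match, so the threshold is infinite.
    """
    if days_apart > window:
        return math.inf
    return base * 0.5 ** (days_apart / half_life)
```

For example, `normalized_chain_distance(["Taipei", "the city"], ["Taipei", "the capital"])` gives 0.5, since one of the two mentions differs.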
Compared with single-document summarization, summary generation for multiple documents poses a number of additional issues, including sentence selection, sentence ordering, and sentence reduction. Temporal resolution among extracted sentences is also important. This article considers informative words and event words to deal with multi-document summarization.
To reduce both text size and information loss during summarization, a multi-document summarization system using informative words is proposed. This paper describes the procedure for extracting informative words from multiple documents and generating summaries. First, a small-scale experiment with 12 events and 60 questions was conducted. The results …