Sentence reduction is the removal of redundant words or phrases from an input sentence, producing a new sentence that preserves the gist of the original meaning. All previous methods required a syntax parser before sentences could be reduced; hence it was difficult to apply them to a language with no reliable parser. In this...
Document clustering, the grouping of documents into several clusters, has been recognized as a means for improving efficiency and effectiveness of information retrieval and text mining. With the growing importance of electronic media for storing and exchanging large textual databases, document clustering becomes more significant. Hierarchical document...
Time series clustering has attracted increasing interest in the last decade, particularly for long time series such as those arising in the bioinformatics and financial domains. The widely known curse of dimensionality problem indicates that high dimensionality not only slows the clustering process, but also degrades it. Many feature extraction techniques...
0957-4174/$ - see front matter © 2009 Elsevier Ltd. doi:10.1016/j.eswa.2009.02.026. E-mail addresses: zhangwen@jaist.ac.jp (W. Zhang), xjtang@amss.ac.cn (X. Tang). One of the deficiencies of mutual information is its poor capacity to measure association of words with unsymmetrical co-occurrence, which has large amounts for...
This paper investigates a novel application of support vector machines (SVMs) for sentence reduction. We also propose a new probabilistic sentence reduction method based on support vector machine learning. Experimental results show that the proposed methods outperform earlier methods in terms of sentence reduction performance.
We study the problem of evaluating the goodness of a kernel matrix for a classification task. As kernel matrix evaluation is usually used in other expensive procedures such as feature and model selection, the goodness measure must be calculated efficiently. Most previous approaches are not efficient, except for Kernel Target Alignment (KTA) that can be...
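For readers unfamiliar with the Kernel Target Alignment mentioned above, the following is a minimal sketch of the standard KTA score for binary labels in {-1, +1}: the Frobenius inner product between the kernel matrix K and the ideal target matrix yy^T, normalized by the Frobenius norms of both. It illustrates only the existing KTA baseline, not the measure proposed in the paper; the function names `frobenius_inner` and `kernel_target_alignment` are hypothetical and chosen for this example.

```python
import math


def frobenius_inner(A, B):
    """Frobenius inner product: the sum of element-wise products of two matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def kernel_target_alignment(K, y):
    """Standard KTA: <K, yy^T>_F / (||K||_F * ||yy^T||_F).

    K is an n x n kernel matrix given as a list of lists.
    y is a list of n labels, assumed here to be -1 or +1.
    """
    # Ideal target matrix: +1 where two labels agree, -1 where they differ.
    target = [[yi * yj for yj in y] for yi in y]
    numerator = frobenius_inner(K, target)
    denominator = math.sqrt(frobenius_inner(K, K) * frobenius_inner(target, target))
    return numerator / denominator


# Toy data (illustrative only): four points, two per class.
labels = [1, 1, -1, -1]
ideal_kernel = [[yi * yj for yj in labels] for yi in labels]
identity_kernel = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

ideal_score = kernel_target_alignment(ideal_kernel, labels)
identity_score = kernel_target_alignment(identity_kernel, labels)
```

A kernel equal to the target matrix attains the maximum alignment of 1, while the identity kernel, which carries no information about class membership, scores lower. The computation needs only a single pass over the n x n entries, which is what makes KTA cheap compared with measures that require training a classifier.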
The high generalization ability of support vector machines (SVMs) has been shown in many practical applications; however, they are considerably slower in the test phase than other learning approaches because of the possibly large number of support vectors in their solution. In this letter, we describe a method to reduce this number of support vectors. The...