With the WEBSOM method, a textual document collection can be organized onto a graphical map display that provides an overview of the collection and facilitates interactive browsing. Interesting documents can be located on the map using a content-directed search. Each document is encoded as a histogram of word categories, which are formed by the…
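As a rough illustration of the encoding step, the sketch below turns one document into a normalized histogram of word categories. The abstract is cut off before it says how the categories are formed, so the `WORD_TO_CATEGORY` table, the function name `encode_document`, and the whitespace tokenization are all assumptions made for this example, not the authors' procedure.

```python
from collections import Counter

def encode_document(text, word_to_category, num_categories):
    """Return a normalized histogram of word-category counts for one document."""
    counts = Counter()
    for word in text.lower().split():
        category = word_to_category.get(word)
        if category is not None:  # words without a known category are skipped
            counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * num_categories
    return [counts[c] / total for c in range(num_categories)]

# Hypothetical word-to-category table; a real system would learn this mapping.
WORD_TO_CATEGORY = {"map": 0, "maps": 0, "document": 1, "documents": 1, "search": 2}

histogram = encode_document("Search the document map for documents", WORD_TO_CATEGORY, 3)
```

Because inflected forms share a category, "document" and "documents" fall into the same histogram bin, which keeps the vector much shorter than a plain word-count vector.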
Powerful methods for interactive exploration and search of collections of free-form textual documents are needed to manage the ever-increasing flood of digital information. In this article we present a method, WEBSOM, for automatic organization of full-text document collections using the self-organizing map (SOM) algorithm. The document collection is…
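For readers unfamiliar with the SOM algorithm, here is a minimal pure-Python sketch of the standard online training loop on a small rectangular grid. It is a generic textbook variant, not the WEBSOM implementation: the linear decay schedules, the Gaussian neighborhood, the grid size, and the names `train_som` and `best_matching_unit` are choices made for this illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols self-organizing map on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # visit the samples in random order
            frac = step / total_steps
            lr = lr0 * (1 - frac)                    # learning rate decays linearly
            radius = max(radius0 * (1 - frac), 0.5)  # neighborhood shrinks over time
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights

# Toy document vectors (made up for this example): two loose topic groups.
docs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.0, 0.1, 0.9], [0.0, 0.2, 0.8]]
som = train_som(docs, rows=3, cols=3)
positions = [best_matching_unit(som, d) for d in docs]
```

Each document is placed at its best-matching node, so documents with similar vectors tend to end up on the same or neighboring nodes, which is what makes the map usable as a browsing display.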
We present Likey, a language-independent keyphrase extraction method based on statistical analysis and the use of a reference corpus. Likey has a very lightweight pre-processing phase and no parameters to be tuned. Thus, it is not restricted to any single language or language family. We test Likey with exactly the same configuration on 11 European…
We explore the use of independent component analysis (ICA) for the automatic extraction of linguistic roles or features of words. The extraction is based on the unsupervised analysis of text corpora. We contrast ICA with singular value decomposition (SVD), which is widely used in statistical text analysis in general and in latent semantic analysis…
Likey is an unsupervised statistical approach for keyphrase extraction. The method is language-independent and the only language-dependent component is the reference corpus with which the documents to be analyzed are compared. In this study, we have also used another language-dependent component: an English-specific Porter stemmer as a pre-processing step.
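The idea of comparing a document against a reference corpus can be sketched with a simple rank ratio: a word that ranks high in the document but low (or not at all) in the reference corpus is a keyword candidate. The abstracts here do not spell out the statistic, so the ratio below, the single-word restriction, the dense ranking of ties, and the names `rank_words` and `score_keywords` are assumptions for illustration only.

```python
from collections import Counter

def rank_words(tokens):
    """Map each word to its frequency rank (1 = most frequent); ties share a rank."""
    counts = Counter(tokens)
    ranks = {}
    rank = 0
    previous = None
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count != previous:
            rank += 1
            previous = count
        ranks[word] = rank
    return ranks

def score_keywords(doc_tokens, ref_tokens, top_n=3):
    """Return the top_n words with the lowest document-rank / reference-rank ratio."""
    doc_ranks = rank_words(doc_tokens)
    ref_ranks = rank_words(ref_tokens)
    # Words missing from the reference corpus get a rank just below its rarest word.
    unseen_rank = max(ref_ranks.values(), default=0) + 1
    scored = [(doc_ranks[w] / ref_ranks.get(w, unseen_rank), w) for w in doc_ranks]
    return [w for _, w in sorted(scored)[:top_n]]

# Toy document and reference corpus, made up for this example.
doc = "the map organizes the documents on the map".split()
ref = "the cat sat on the mat and the dog sat on the rug".split()
keywords = score_keywords(doc, ref, top_n=3)
```

Common words such as "the" score poorly because they rank high in both texts, so no stop-word list is needed; only the reference corpus carries language-specific knowledge.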
On January 19, 1996, we published on the Internet a demo of how to use Self-Organizing Maps (SOMs) to organize large collections of full-text files. Later we added other newsgroups to the demo. It can be found at the address http://websom.hut.fi/websom/. In the present paper we describe the main features of this system, called WEBSOM, as well as…
In time series prediction, one often does not know the properties of the underlying system generating the time series. For example, is the series generated by a closed system, or do external factors influence the system? As a result, one often does not know beforehand whether a time series is stationary or nonstationary, and…
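To make the stationarity question concrete, the sketch below applies a crude heuristic: split the series in half and compare the means and variances of the two halves. This is not the method of the paper, whose abstract is cut off here, and it is not a formal statistical test; the function name `looks_stationary` and both thresholds are arbitrary assumptions.

```python
from statistics import mean, pvariance

def looks_stationary(series, mean_tol=0.5, var_ratio_tol=2.0):
    """Crude check: do both halves of the series have similar mean and variance?"""
    half = len(series) // 2
    first, second = series[:half], series[half:]
    # Express the mean shift in units of the overall standard deviation.
    scale = max(pvariance(series) ** 0.5, 1e-12)
    mean_shift = abs(mean(first) - mean(second)) / scale
    var_first, var_second = pvariance(first), pvariance(second)
    var_ratio = max(var_first, var_second) / max(min(var_first, var_second), 1e-12)
    return mean_shift < mean_tol and var_ratio < var_ratio_tol

periodic = [0, 1, 0, -1] * 10   # repeating pattern, no drift
trending = list(range(40))      # steadily increasing values

periodic_ok = looks_stationary(periodic)
trending_ok = looks_stationary(trending)
```

The periodic series passes the check, while the trending series fails it because its mean shifts between the two halves.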
Serious efforts to develop computerized systems for natural language understanding and machine translation have taken place for more than half a century. Some successful systems that translate texts in limited domains, such as weather forecasts, have been implemented. However, the more general the domain or the more complex the style of the text, the more difficult…