Kimmo Valtonen

We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream to …
In this survey-style paper we demonstrate the usefulness of the probabilistic modelling framework in solving not only the actual positioning problem, but also many related problems involving issues like calibration, active learning, error estimation and tracking with history. We also point out some interesting links between positioning research done in the …
ALVIS researches the design, use and interoperability of topic-specific search engines with the goal of developing an open source prototype of a peer-to-peer, semantic-based search engine. Our approach is not the traditional Semantic Web approach with coded meta-data, but rather an engine that can build on content through semi-automatic analysis. This paper …
Probabilistic modeling techniques offer a unifying theoretical framework for solving the problems encountered when developing location-aware and location-sensitive applications in wireless radio networks. In this paper we demonstrate the usefulness of the probabilistic modelling framework in solving not only the actual location estimation (positioning) …
There has been mixed success in applying semantic component analysis (LSA, PLSA, discrete PCA, etc.) to information retrieval. Here we combine topic-specific link analysis with discrete PCA (a semantic component method) to develop a topic relevancy score for information retrieval that is used in post-filtering documents retrieved via regular Tf.Idf methods. …
In electrofishing it is usually assumed that the abundance of fish at a site is strongly dependent on habitat type. In practice the yearly choices of sites are not perfectly representative of the distribution of habitat types in a river, so a bias is introduced into density estimates based on the observed densities. However, it is assumed that this bias is …
From the management point of view, the production of wild smolts is the most important indicator of the status of a river’s salmon population. We present a methodology allowing the prediction of the number of wild smolts in a river in a consistent and well-defined fashion. Our framework is probabilistic and our approach Bayesian. Our models are Bayesian …
We present a methodology allowing the transfer of knowledge from a wild salmon river to another via a predictive model for the chosen population status indicator. From the management point of view, the production of wild smolts is the most important of such indicators. However, in our real-world data from Finnish and Swedish Gulf of Bothnia rivers we only …
There has been mixed success in applying semantic component analysis (LSA, PLSA, discrete PCA, etc.) to information retrieval. Previous experiments have shown that high-fidelity language models do not imply good quality retrieval. Here we combine link analysis with discrete PCA (a semantic component method) to develop an auxiliary score for information …