Motivated by the need for unification of the field of data mining and the growing demand for formalized representation of outcomes of research, we address the task of constructing an ontology of data mining. The proposed ontology, named OntoDM, is based on a recent proposal of a general framework for data mining, and includes definitions of basic data...
High-quality information on forest resources is important to forest ecosystem management. Traditional ground measurements are labor and resource intensive and at the same time expensive and time consuming. For most of the Slovenian forests, there is extensive ground-based information on forest properties of selected sample locations. However, there...
With the huge amount of information available online, the World Wide Web is a fertile area for data mining research. The Web has become a major vehicle for performing research and education-related activities for researchers and students. Web mining is the use of data mining technologies to automatically interact with and discover information from web documents,...
New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational...
Motivated by the need for unification of the domain of data mining and the demand for formalized representation of outcomes of data mining investigations, we address the task of constructing an ontology of data mining. In this paper we present an updated version of the OntoDM ontology, which is based on a recent proposal of a general framework for data...
Genetically modified (GM) crops have increased their share in EU agriculture, so the adventitious presence of GM varieties in non-GM seeds and crops has become an issue and poses the problem of their co-existence with conventional and organic crops. Therefore, there is a need to propose appropriate measures at the farm and regional levels to minimize...
The purpose of this paper is twofold. First, we give efficient algorithms for answering itemset support queries for collections of itemsets from various representations of the frequency information. As index structures we use itemset tries of transaction databases, frequent itemsets and their condensed representations. Second, we evaluate the usefulness of...
Multi-label classification (MLC) tasks are encountered more and more frequently in machine learning applications. While MLC methods exist for the classical batch setting, only a few methods are available for the streaming setting. In this paper, we propose a new methodology for MLC via multi-target regression in a streaming setting. Moreover, we develop a...