Vedrana Vidulin

A web page is a complex document that can share conventions of several genres, or contain several parts, each belonging to a different genre. To properly address the genre interplay, a recent proposal in automatic web genre identification is multi-label classification. The dominant approach to such classification is to transform one multi-label machine ...
We present initial results from an international and multidisciplinary research collaboration that aims at the construction of a reference corpus of web genres. The primary application scenario for which we plan to build this resource is the automatic identification of web genres. Web genres are rather difficult to capture and to describe in their entirety, ...
Abbreviated title: Impact of High-Level Knowledge on Economy through IDM. This paper describes a novel algorithm for finding the most important relations with the use of data mining. As an example application, the impact of high-level knowledge on economic welfare was analyzed. Our approach, based on interactive data mining, not only helps to discover the ...
Modern search engines aim at classifying web pages not only according to topics, but also according to genres. This paper presents the results of an attempt to train a genre classifier. We present features extracted from a 20-genre corpus used for training the genre classifiers and the results of using different machine learning (ML) algorithms in the ...
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected based on the literature and the observation of the corpus. Thirdly, these features were extracted from the corpus to obtain a data set. Finally, two machine learning algorithms, one ...
Does greater investment in education and in research and development (R&D) have a positive impact on economic welfare? We analyzed this question using the Weka machine learning and data mining systems. We collected data from the statistical databases for the year 2001. The obtained classification trees show that the level of participation in higher levels of ...
Motivation: The number of sequenced genomes rises steadily, but we still lack knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis ...