Mirjana Ivanovic

Different aspects of the curse of dimensionality are known to present serious challenges to various machine-learning methods and tasks. This paper explores a new aspect of the dimensionality curse, referred to as hubness, that affects the distribution of k-occurrences: the number of times a point appears among the k nearest neighbors of other points in a...
Social tagging systems have grown in popularity on the Web in recent years because they make it simple to categorize and retrieve content using open-ended tags. The increasing number of users providing information about themselves through social tagging activities has led to the emergence of tag-based profiling approaches, which assume that users expose...
High-dimensional data arise naturally in many domains, and have regularly presented a great challenge for traditional data mining techniques, both in terms of effectiveness and efficiency. Clustering becomes difficult due to the increasing sparsity of such data, as well as the increasing difficulty in distinguishing distances between data points. In this...
High dimensionality can pose severe difficulties, widely recognized as different aspects of the curse of dimensionality. In this paper we study a new aspect of the curse pertaining to the distribution of k-occurrences, i.e., the number of times a point appears among the k nearest neighbors of other points in a data set. We show that, as...
The vector space model (VSM) is a popular and widely applied model in information retrieval (IR). VSM creates vector spaces whose dimensionality is usually high (e.g., tens of thousands of terms). This may cause various problems, such as susceptibility to noise and difficulty in capturing the underlying semantic structure, which are commonly recognized as...
Outlier detection in high-dimensional data presents various challenges resulting from the “curse of dimensionality.” A prevailing view is that distance concentration, i.e., the tendency of distances in high-dimensional data to become indiscernible, hinders the detection of outliers by making distance-based methods label all points as almost...
With the development of sophisticated e-learning environments, personalization is becoming an important feature in e-learning systems due to the differences in background, goals, capabilities and personalities of the large numbers of learners. Personalization can be achieved using different types of recommendation techniques. This paper presents an overview of...