Georgia Koutrika

Social bookmarking is a recent phenomenon that has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by ...
In recent years, social Web sites have become important components of the Web. With their success, however, has come a growing influx of spam. If left unchecked, spam threatens to undermine resource sharing, interactivity, and openness. This article surveys three categories of potential countermeasures: those based on detection, demotion, and prevention.
Entity Resolution (ER) is the problem of identifying which records in a database refer to the same real-world entity. An exhaustive ER process involves computing the similarities between pairs of records, which can be very expensive for large datasets. Various blocking techniques can be used to enhance the performance of ER by dividing the records into ...
As information becomes available in increasing amounts to a wide spectrum of users, the need for a shift towards a more user-centered information access paradigm arises. We develop a personalization framework for database systems based on user profiles and identify the basic architectural modules required to support it. We define a preference model that ...
Tagging systems allow users to interactively annotate a pool of shared resources using descriptive tags. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. We introduce a framework for modeling ...
Preferences have traditionally been studied in philosophy, psychology, and economics and applied to decision-making problems. Recently, they have attracted the attention of researchers in other fields, such as databases, where they capture soft criteria for queries. Databases bring a whole fresh perspective to the study of preferences, both computational and ...
Recommendation systems have become very popular, but most recommendation methods are 'hard-wired' into the system, making experimentation with and implementation of new recommendation paradigms cumbersome. In this paper, we propose FlexRecs, a framework that decouples the definition of a recommendation process from its execution and supports flexible ...
Tagging systems allow users to interactively annotate a pool of shared resources using descriptive strings called tags. Tags are used to guide users to interesting resources and help them build communities that share their expertise and resources. As tagging systems are gaining in popularity, they become more susceptible to tag spam: ...
We examine the creation of a tag cloud for exploring and understanding a set of objects (e.g., web pages, documents). In the first part of our work, we present a formal system model for reasoning about tag clouds. We then present metrics that capture the structural properties of a tag cloud, and we briefly present a set of tag selection algorithms that are ...