In this paper, we present a novel method of Web query expansion that uses WordNet and TSN. WordNet is an online lexical dictionary that describes word relationships along three dimensions: hypernym, hyponym, and synonym. Each dimension affects expansion differently, and we provide quantitative descriptions of the query expansion impact along each dimension.
Deducing trip-related information from web-scale datasets has received considerable attention recently. Identifying points of interest (POIs) in geo-tagged photos is one such problem. It can be viewed as a standard clustering problem of partitioning two-dimensional objects. In this work, we study spectral clustering, which is the first…
Given a set of users, their friend relationships, and a distance threshold per friend pair, the proximity detection problem is to find each pair of friends such that the Euclidean distance between them is within the given threshold. This problem plays an essential role in friend-locator applications and massively multiplayer online games. Existing proximity…
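The problem statement above can be illustrated with a brute-force check over the friend pairs. This is only a minimal sketch of the definition, not the method the paper proposes; the function name `proximity_pairs` and the dictionary-based input layout are assumptions made for illustration.

```python
from math import hypot

def proximity_pairs(positions, thresholds):
    """Return the friend pairs whose Euclidean distance is within their threshold.

    positions  -- hypothetical layout: dict mapping user -> (x, y)
    thresholds -- hypothetical layout: dict mapping (user_a, user_b) -> distance threshold
    """
    result = []
    for (u, v), eps in thresholds.items():
        (ux, uy), (vx, vy) = positions[u], positions[v]
        # Euclidean distance between the two friends
        if hypot(ux - vx, uy - vy) <= eps:
            result.append((u, v))
    return result

positions = {"ann": (0.0, 0.0), "bob": (3.0, 4.0), "cat": (10.0, 0.0)}
thresholds = {("ann", "bob"): 5.0, ("ann", "cat"): 6.0}
pairs = proximity_pairs(positions, thresholds)
```

The check visits every friend pair once per evaluation, which is the naive baseline that a continuous, location-update-driven setting would presumably need to improve on.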
We propose and study a new ranking problem in versioned databases. Consider a database of versioned objects which have different valid instances along a history (e.g., documents in a web archive). Durable top-k search finds the set of objects that are consistently in the top-k results of a query (e.g., a keyword query) throughout a given time…
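One way to read the durable top-k definition above is sketched below. It assumes, as a simplification, that time is a sequence of integer version numbers and that "consistently" means the object is in the top-k at every version in the interval; the snippet is truncated, so the paper's exact definition may differ. The function name `durable_top_k` and the input layout are hypothetical.

```python
def durable_top_k(scores_by_time, k, start, end):
    """Objects that are in the top-k at every version in [start, end].

    scores_by_time -- hypothetical layout: dict mapping version -> {object: score}
    """
    durable = None
    for t in range(start, end + 1):
        scores = scores_by_time[t]
        # Top-k objects at version t, highest score first
        top = set(sorted(scores, key=scores.get, reverse=True)[:k])
        # Keep only objects that were also in the top-k at all earlier versions
        durable = top if durable is None else durable & top
    return durable or set()

scores_by_time = {
    0: {"a": 0.9, "b": 0.8, "c": 0.1},
    1: {"a": 0.7, "b": 0.2, "c": 0.6},
    2: {"a": 0.95, "b": 0.5, "c": 0.4},
}
durable = durable_top_k(scores_by_time, k=2, start=0, end=2)
```

This recomputes a full ranking per version, so it serves only to make the definition concrete.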
Consider an internship assignment system, where at the end of each academic year, interested university students search and apply for available positions based on their preferences (e.g., nature of the job, salary, office location, etc.). In a variety of facility, task, or position assignment contexts, users have personal preferences expressed by different…
Discovering motifs in sequence databases has received considerable attention from both the database and data mining communities, where a motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications, which may have very long sequences (e.g., a million observations per sequence) or the queries…
Given a point set P of customers (e.g., WiFi receivers) and a point set Q of service providers (e.g., wireless access points), where each q ∈ Q has a capacity q.k, the capacity constrained assignment (CCA) is a matching M ⊆ Q × P such that (i) each point q ∈…
Consider a set of servers and a set of users, where each server has a coverage region (i.e., an area of service) and a capacity (i.e., a maximum number of users it can serve). Our task is to assign every user to one server subject to the coverage and capacity constraints. To offer the highest quality of service, we wish to minimize the average distance…
To index Web images, the entire associated text is partitioned into a sequence of text blocks. The local relevance of a term to the corresponding image is then calculated with respect to both its local occurrence in the block and the distance of the block to the image. Thus, the overall relevance of a term is determined as the sum of all its local…
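The block-based scoring described above can be sketched as follows. The snippet does not give the local relevance formula, so the weighting used here (term count in the block divided by one plus the block's distance to the image) is purely an assumption for illustration, as are the function name `term_relevance` and the whitespace tokenization.

```python
from collections import Counter

def term_relevance(blocks, distances):
    """Sum per-block local relevance scores into an overall score per term.

    blocks    -- list of text blocks associated with one image
    distances -- distance of each block to the image (same order as blocks)
    """
    overall = Counter()
    for block, dist in zip(blocks, distances):
        counts = Counter(block.lower().split())
        for term, tf in counts.items():
            # Assumed local relevance: occurrences damped by block distance
            overall[term] += tf / (1.0 + dist)
    return overall

relevance = term_relevance(["red car red", "car park"], [0, 1])
```

Under this assumed weighting, a term that occurs often in blocks close to the image outranks one that appears only in distant blocks.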