In this paper, we present a novel method of Web query expansion using WordNet and TSN. WordNet is an online lexical dictionary that describes word relationships along three dimensions: hypernym, hyponym, and synonym. Their impacts on expansion differ, and we provide quantitative descriptions of the query expansion impact along each dimension. …
We propose and study a new ranking problem in versioned databases. Consider a database of versioned objects that have different valid instances along a history (e.g., documents in a web archive). Durable top-k search finds the set of objects that are consistently in the top-k results of a query (e.g., a keyword query) throughout a given time…
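The idea of durable top-k search can be illustrated with a brute-force sketch. It is not the method of the paper, whose details are cut off above. It assumes the history is given as a list of per-version score dictionaries, and the names `top_k`, `durable_top_k`, and the `min_ratio` threshold are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the ids of the k highest-scoring objects in one version."""
    # Sort by descending score; break ties by id so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {obj_id for obj_id, _ in ranked[:k]}


def durable_top_k(
    snapshots: List[Dict[str, float]], k: int, min_ratio: float = 1.0
) -> Set[str]:
    """Objects that appear in the top-k of at least min_ratio of the versions.

    min_ratio is an assumed parameter: 1.0 means "in the top-k in every version".
    """
    if not snapshots:
        return set()
    counts: Dict[str, int] = {}
    for scores in snapshots:
        for obj_id in top_k(scores, k):
            counts[obj_id] = counts.get(obj_id, 0) + 1
    needed = min_ratio * len(snapshots)
    return {obj_id for obj_id, count in counts.items() if count >= needed}


# Toy history: three versions, each mapping object id -> query score.
snapshots = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.7, "b": 0.2, "c": 0.6},
    {"a": 0.95, "b": 0.5, "c": 0.4},
]
result = durable_top_k(snapshots, k=2)
```

Here only object `a` stays in the top 2 across all three versions. Recomputing the top-k for every version is the naive baseline that a dedicated index would aim to avoid.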
To index Web images, the associated text is partitioned into a sequence of text blocks, and the local relevance of a term to the corresponding image is calculated with respect to both its local occurrence in the block and the distance of the block to the image. Thus, the overall relevance of a term is determined as the sum of all its local…
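The block-based scoring described above can be sketched as follows. The abstract does not give the actual weighting, so the formula here (occurrence count divided by one plus the block distance) is purely an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple


def local_relevance(term: str, block_tokens: List[str], distance: int) -> float:
    """Assumed local score: term occurrences in the block, damped by distance.

    distance is the number of blocks between this block and the image
    (0 = the block adjacent to the image). The 1 / (1 + distance) decay is
    an illustrative choice, not the weighting used in the paper.
    """
    occurrences = block_tokens.count(term)
    return occurrences / (1.0 + distance)


def overall_relevance(term: str, blocks: List[Tuple[List[str], int]]) -> float:
    """Overall relevance of a term: the sum of its local relevance values."""
    return sum(local_relevance(term, tokens, dist) for tokens, dist in blocks)


# Toy page: each entry is (tokens of the text block, distance to the image).
blocks = [
    ("sunset over the beach".split(), 0),
    ("the beach hotel offers rooms".split(), 1),
    ("contact us about beach tours".split(), 3),
]
score = overall_relevance("beach", blocks)
```

With this decay, an occurrence next to the image counts fully, while the same word three blocks away contributes only a quarter as much.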
A Geo-Social Network (GeoSN) couples a social network with location-based services (LBS), for example to find friends within a range of a point of interest (POI). Foursquare has 30M users and millions of check-ins per day. More and more users access social networks from mobile devices. No industry white papers document how such queries are processed. Proximity…
The online shortest path problem aims at computing the shortest path based on live traffic conditions. This is very important in modern car navigation systems, as it helps drivers make sensible decisions. To the best of our knowledge, there is no efficient system or solution that offers affordable costs on both the client and server sides for online shortest path…
Discovering motifs in sequence databases has received considerable attention from both the database and data mining communities, where the motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications which may have very long sequences (e.g., a million observations per sequence) or the queries…
Given a point set P of customers (e.g., WiFi receivers) and a point set Q of service providers (e.g., wireless access points), where each q ∈ Q has a capacity q.k, the capacity constrained assignment (CCA) is a matching M ⊆ Q × P such that (i) each point q ∈…
Deducing trip-related information from web-scale datasets has recently received considerable attention. Identifying points of interest (POIs) in geo-tagged photos is one of these problems. The problem can be viewed as a standard clustering problem of partitioning two-dimensional objects. In this work, we study spectral clustering, which is the first…
Peer reviewing is a standard process for assessing the quality of submissions at academic conferences and journals. A very important task in this process is the assignment of reviewers to papers. However, achieving an appropriate assignment is not easy, because all reviewers should have a similar load and the subjects of the assigned papers should be…