Monitoring predefined patterns in streaming time series is useful in applications such as trend-related analysis, sensor networks, and video surveillance. Most current studies on such monitoring use Euclidean distance to measure the similarity between given query patterns and subsequences of the streaming time series. Euclidean distance has been shown to …
Supported by technical advances and the commercial success of GPS-enabled mobile devices, geo-tagged photos have drawn considerable attention in the research community. The explosive growth of geo-tagged photos enables many large-scale applications, such as location-based photo browsing and landmark recognition. Meanwhile, as the number of geo-tagged photos …
Virtually everyone, from big Web companies to traditional enterprises to physical science researchers to social scientists, is either already experiencing or anticipating unprecedented growth in the amount of data available in their world, along with new opportunities and great untapped value. This paper reviews big data …
The volume of RDF data has increased rapidly over the last five years; for example, the Linked Open Data cloud has grown from 2 billion to 50 billion RDF triples. With its strong scalability, a cloud computing platform like Hadoop is a good choice for processing queries over large data sets. Previous work on evaluating SPARQL queries with Hadoop mainly focuses on …
Many entity-attribute tables on the Web can be used to enrich the entities of knowledge bases (KBs). This requires schema mapping (matching) between the Web tables and the huge KBs. Existing schema mapping solutions are inadequate for mapping a Web table to a KB for many reasons, such as (1) there are many duplicates of …
The string similarity join is a basic operation in many applications that need to find all similar string pairs in a collection, given a similarity function and a user-specified threshold. Recently, there has been considerable interest in designing new algorithms that use an inverted index to support efficient string similarity joins. These algorithms …
Given a set of client locations, a set of facility locations where each facility has a service capacity, and the assumptions that (i) a client seeks service from its nearest facility and (ii) a facility provides service to clients in order of their proximity, we study the problem of selecting all possible locations such that setting up a new facility with …
The moving range query over RFID data streams is one of the most important spatio-temporal queries for supporting valuable information analysis. However, location uncertainty complicates the query strategy. In this paper, we propose a probability evaluation model for RFID-enabled monitoring environments and discuss query optimization techniques under the …
Tagging has been widely applied in existing Web 2.0 systems, where users label resources with tags for effective classification and efficient retrieval. Location-aware geographical tags (geo-tags) are required if users want to mark location-sensitive resources on digital maps. Large volumes of different kinds of user-created tags …