Given a large collection of objects, finding all pairs of similar objects, known as a similarity join, is widely used to solve problems in many application domains. The computation time of a similarity join is a critical issue, since a similarity join requires computing similarity values for all possible pairs of objects. Several existing algorithms adopt prefix...
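To make the all-pairs cost described above concrete, here is a minimal sketch of a naive similarity join. It is not the method of the paper, whose details are cut off in this listing. The choice of Jaccard similarity over sets, the threshold value, and the names `jaccard`, `naive_similarity_join`, and `records` are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def naive_similarity_join(objects, threshold):
    """Return index pairs (i, j), i < j, whose similarity is >= threshold.

    Every pair is compared, so n objects cost n * (n - 1) / 2 similarity
    computations. This quadratic cost is what makes running time critical.
    """
    result = []
    for (i, a), (j, b) in combinations(enumerate(objects), 2):
        if jaccard(a, b) >= threshold:
            result.append((i, j))
    return result


# Hypothetical toy data: each object is a set of tokens.
records = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"x", "y"},
    {"a", "b", "c", "d"},
]
pairs = naive_similarity_join(records, 0.5)
print(pairs)  # [(0, 1), (0, 3), (1, 3)]
```

With four objects the sketch performs six comparisons; doubling the collection roughly quadruples that count, which is why avoiding unnecessary pair comparisons matters for large collections.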
An increasing number of Web applications allow users to play a more active role in enriching the source content. The enriched data can be used for various applications such as text summarization, opinion mining, and ontology creation. In this paper, we propose a novel Web content summarization method that creates a text summary by exploiting user...
In this paper, we propose a novel web content summarization method that creates a text summary by exploiting user feedback (comments, annotations, etc.) in a social bookmarking service. We first analyze the feasibility of utilizing user feedback in summarization, and then demonstrate how the social summary which best represents the topics of the web...
Keyword search gives users an easy way to query large and complex databases without any knowledge of structured query languages or the underlying database schema. Most existing studies have focused on generating candidate structured queries relevant to the keywords. Because the number of generated queries is large, the execution costs may be prohibitive...
This paper presents a case study of designing and implementing a data ingestion system for manufacturers. In our implementation, a clustered server architecture for high-throughput data ingestion is proposed with regard to the following factors: receiving stream data, i.e., machine logs, from a set of milling machines, storing them in a centralized...
This paper introduces a novel user model, the social interaction propensity model, for computing the similarity of mobile phone users. Traditional studies exploit usage history to represent users by their behavioral patterns. This representation incurs prohibitive costs when dealing with the high-dimensional space that contains the usage patterns...
This work presents a novel ranking scheme for structured data. We show how to apply the notion of typicality analysis from cognitive science and how to use it to formulate the problem of ranking data with categorical attributes. First, we formalize the typicality query model for relational databases. We adopt the Pearson correlation coefficient to...