Query optimization is an integral part of relational database management systems. One important task in query optimization is selectivity estimation, that is, given a query <i>P</i>, we need to estimate the fraction of records in the database that satisfy <i>P</i>. Many commercial database systems maintain histograms to approximate the(More)
Computing multidimensional aggregates in high dimensions is a performance bottleneck for many OLAP applications. Obtaining the exact answer to an aggregation query can be prohibitively expensive in terms of time and/or storage space in a data warehouse environment. It is advantageous to have fast, approximate answers to OLAP aggregation queries. In this(More)
Integrating the extracted facts with an existing knowledge base has raised an urgent need to address the problem of entity linking. Specifically, entity linking is the task to link the entity mention in text with the corresponding real world entity in the existing knowledge base. However, this task is challenging due to name ambiguity, textual(More)
Multi-way Theta-join queries are powerful in describing complex relations and therefore widely employed in real practices. However, existing solutions from traditional distributed and parallel databases for multi-way Theta-join queries cannot be easily extended to fit a shared-nothing distributed computing paradigm, which is proven to be able to support(More)
Twitter has become an increasingly important source of information, with more than 400 million tweets posted per day. The task to link the named entity mentions detected from tweets with the corresponding real world entities in the knowledge base is called tweet entity linking. This task is of practical importance and can facilitate many different tasks,(More)
Prediction of popular items in online content sharing systems has recently attracted a lot of attention due to the tremendous need of users and its commercial values. Different from previous works that make prediction by fitting a popularity growth model, we tackle this problem by exploiting the latent <i>conforming</i> and <i>maverick</i> personalities of(More)
There has recently been an explosion of interest in the analysis of data in data warehouses in the field of On-Line Analytical Processing (OLAP). Data warehouses can be extremely large, yet obtaining quick answers to queries is important. In many situations, obtaining the exact answer to an OLAP query is prohibitively expensive in terms of time and/or storage(More)
Global positioning system velocities from 553 control points within the Tibetan Plateau and on its margins show that the present-day tectonics in the plateau is best described as deformation of a continuous medium, at least when averaged over distances of >~100 km. Deformation occurs throughout the plateau interior by ESE-WNW extension and slightly slower(More)