TopCells: Keyword-based search of top-k aggregated documents in text cube

@article{Ding2010TopCellsKS,
  title={TopCells: Keyword-based search of top-k aggregated documents in text cube},
  author={Bolin Ding and Bo Zhao and Cindy Xide Lin and Jiawei Han and ChengXiang Zhai},
  journal={2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)},
  year={2010},
  pages={381--384}
}
Previous studies on supporting keyword queries in RDBMSs provide users with a ranked list of relevant linked structures (e.g. joined tuples) or individual tuples. In this paper, we aim to support keyword search in a data cube with text-rich dimension(s) (so-called text cube). Each document is associated with structural dimensions. A cell in the text cube aggregates a set of documents with matching dimension values on a subset of dimensions. Given a keyword query, our goal is to find the top-k… 
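The cell definition above can be made concrete with a small brute-force sketch. It is not the paper's algorithm or relevance model: the toy corpus, the helper names (`cells_of`, `build_cube`, `doc_score`, `top_k_cells`), and the scoring rule (average keyword count over a cell's documents) are all assumptions chosen only to illustrate how a cell aggregates documents and how cells might be ranked for a keyword query.

```python
from collections import defaultdict
from itertools import combinations
import heapq

# Hypothetical toy corpus: each document carries structural dimensions plus text.
DOCS = [
    {"dims": {"brand": "Acer", "model": "A1"}, "text": "long battery life light laptop"},
    {"dims": {"brand": "Acer", "model": "A2"}, "text": "heavy laptop powerful gpu"},
    {"dims": {"brand": "Dell", "model": "D1"}, "text": "light laptop long battery"},
]

def cells_of(doc_dims):
    """Yield every cell that contains a document: one cell per subset of its
    (dimension, value) pairs. The empty tuple is the all-documents cell."""
    items = sorted(doc_dims.items())
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield subset

def build_cube(docs):
    """Map each cell to the indices of the documents it aggregates."""
    cube = defaultdict(list)
    for idx, doc in enumerate(docs):
        for cell in cells_of(doc["dims"]):
            cube[cell].append(idx)
    return cube

def doc_score(text, keywords):
    """Assumed placeholder relevance: number of query-keyword occurrences."""
    words = text.split()
    return sum(words.count(kw) for kw in keywords)

def top_k_cells(docs, keywords, k):
    """Score each cell by the average score of its documents (an assumption)
    and return the k highest-scoring (score, cell) pairs."""
    cube = build_cube(docs)
    scored = []
    for cell, doc_ids in cube.items():
        total = sum(doc_score(docs[i]["text"], keywords) for i in doc_ids)
        scored.append((total / len(doc_ids), cell))
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])

if __name__ == "__main__":
    for score, cell in top_k_cells(DOCS, ["battery", "light"], 3):
        print(score, dict(cell))
```

Enumerating every cell this way grows exponentially with the number of dimensions, so the sketch is only practical for toy data.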
Efficient Keyword-Based Search for Top-K Cells in Text Cube
TLDR
This paper defines a keyword-based query language and an IR-style relevance model for scoring and ranking cells in the text cube, and proposes four approaches to keyword search in a data cube with text-rich dimension(s) (a so-called text cube): inverted-index one-scan, document sorted-scan, bottom-up dynamic programming, and search-space ordering.
Keyword search in text cube: Finding top-k relevant cells
TLDR
This work defines a keyword-based query language, applies an IR-style relevance model for scoring and ranking cell documents in the text cube, and proposes two efficient approaches to find the top-k answers.
TEXplorer: keyword-based object search and exploration in multidimensional text databases
TLDR
This work proposes a keyword-based interactive exploration framework that offers flexible OLAP navigational guides and helps users identify the levels and objects they are interested in, along with efficient algorithms and materialization strategies for ranking top-k dimensions and cells.
Multi-Dimensional, Phrase-Based Summarization in Text Cubes
TLDR
A cube-based analytical platform is developed that implements an efficient solution by materializing a deliberately selected part of statistics, and using these statistics to perform online query processing within a constant latency constraint, and demonstrates the efficiency in both query processing time and storage cost.
Keyword-based search and exploration on databases
  • Yi Chen, Wei Wang, Ziyang Liu
  • Computer Science
    2011 IEEE 27th International Conference on Data Engineering
  • 2011
TLDR
This tutorial gives an overview of the state-of-the-art techniques for supporting keyword-based search and exploration on databases and identifies the challenges and opportunities for future research to advance the field.
Efficient and Effective Aggregate Keyword Search on Relational Databases
TLDR
The authors propose a general ranking model and an efficient ranking algorithm that is efficient in both size and construction time, and report a systematic performance evaluation using real data sets.
A First Framework for Top-K Cubes Queries
TLDR
This paper proposes a first framework for Top-K cube queries in which queries are expressed in natural language, so that decision-makers without IT skills can use them easily, and provides an implementation in a ROLAP architecture.
Doc2Cube: Automated Document Allocation to Text Cube via Dimension-Aware Joint Embedding
Data cube is a cornerstone architecture in multidimensional analysis of structured datasets. It is highly desirable to conduct multidimensional analysis on text corpora with cube structures for
A Multi-dimensional Analysis and Data Cube for Unstructured Text and Social Media
TLDR
This paper extends the existing text cube model to incorporate TF-IDF (Term Frequency-Inverse Document Frequency) and LM (Language Model) as measures, and reports that the proposed text cube outperforms the existing one in both performance and effectiveness.
Doc2Cube: Allocating Documents to Text Cube Without Labeled Data
TLDR
Doc2Cube is proposed, a method that constructs a text cube from a given text corpus in an unsupervised way and alleviates label sparsity by propagating the information from label names to other terms and enriching the labeled term set.

References

SPARK2: Top-k Keyword Query in Relational Databases
TLDR
This paper proposes a new ranking formula by adapting existing IR techniques based on a natural notion of virtual document and proposes several efficient query processing methods for the new ranking method.
Answering Top-k Keyword Queries on Relational Databases
TLDR
This work presents a new ranking method based on virtual documents that answers top-k keyword queries by ranking the results, and also proposes a Top-k CTT algorithm that uses a frequency threshold value.
Answering aggregate keyword queries on relational databases using minimal group-bys
TLDR
This paper motivates a novel problem of aggregate keyword search: finding minimal group-bys covering a set of query keywords well, which is useful in many applications and develops two interesting approaches to tackle the problem.
Answering top-k queries with multi-dimensional selections: the ranking cube approach
TLDR
A new computational model, called the ranking cube, is proposed for efficiently answering top-k queries with multi-dimensional selections, and a rank-aware measure is defined for the cube, capturing the goal of responding to multi-dimensional ranking analysis.
Effective keyword search in relational databases
TLDR
This paper proposes a novel IR ranking strategy for effective keyword search and is the first that conducts comprehensive experiments on search effectiveness using a real world database and a set of keyword queries collected by a major search company.
BLINKS: ranked keyword searches on graphs
TLDR
BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a bi-level index for pruning and accelerating the search, and offers orders-of-magnitude performance improvement over existing approaches.
Efficient IR-Style Keyword Search over Relational Databases
TLDR
This paper adapts IR-style document-relevance ranking strategies to the problem of processing free-form keyword queries over RDBMSs, and develops query-processing strategies that build on a crucial characteristic of IR-style keyword search: only the few most relevant matches are generally of interest.
EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data
TLDR
An extended inverted index is proposed to facilitate keyword-based search, and a novel ranking mechanism for enhancing search effectiveness is presented, which achieves both high search efficiency and high accuracy.
Text Cube: Computing IR Measures for Multidimensional Text Database Analysis
TLDR
This paper proposes a text-cube model on a multidimensional text database, conducts systematic studies on efficient text-cube implementation, OLAP execution, and query processing, and shows the high promise of the methods.
Keyword searching and browsing in databases using BANKS
TLDR
BANKS is described, a system which enables keyword-based search on relational databases, together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.