Efficient Keyword-Based Search for Top-K Cells in Text Cube

@article{Ding2011EfficientKS,
  title={Efficient Keyword-Based Search for Top-K Cells in Text Cube},
  author={Bolin Ding and Bo Zhao and Cindy Xide Lin and Jiawei Han and ChengXiang Zhai and Ashok N. Srivastava and Nikunj C. Oza},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2011},
  volume={23},
  pages={1795--1810}
}
  • Bolin Ding, Bo Zhao, Cindy Xide Lin, Jiawei Han, ChengXiang Zhai, Ashok N. Srivastava, Nikunj C. Oza
  • Published 1 December 2011
  • Computer Science
  • IEEE Transactions on Knowledge and Data Engineering
Previous studies on supporting free-form keyword queries over RDBMSs provide users with linked structures (e.g., a set of joined tuples) that are relevant to a given keyword query. Most of them focus on ranking individual tuples from one table or joins of multiple tables containing a set of keywords. In this paper, we study the problem of keyword search in a data cube with text-rich dimension(s) (so-called text cube). The text cube is built on a multidimensional text database, where each row is… 
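The idea of ranking cells rather than individual tuples can be illustrated with a small brute-force sketch. The abstract above is truncated and does not give the paper's scoring model or algorithms, so everything here is an assumption made for illustration: a cell's relevance is taken to be the average keyword term frequency of its documents, and the names `doc_score`, `cells_containing`, `top_k_cells`, `min_support`, and the toy `ROWS` table are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def doc_score(text, keywords):
    """Assumed relevance: number of query-keyword occurrences in the text."""
    tokens = text.lower().split()
    return sum(tokens.count(kw.lower()) for kw in keywords)


def cells_containing(row, dims):
    """Yield every cell that covers the row; '*' marks an aggregated dimension."""
    for r in range(len(dims) + 1):
        for fixed in combinations(dims, r):
            yield tuple(row[d] if d in fixed else "*" for d in dims)


def top_k_cells(rows, dims, keywords, k, min_support=1):
    """Brute-force ranking: score each cell by the mean score of its documents."""
    scores = defaultdict(list)
    for row in rows:
        s = doc_score(row["text"], keywords)
        for cell in cells_containing(row, dims):
            scores[cell].append(s)
    ranked = [
        (sum(v) / len(v), cell)
        for cell, v in scores.items()
        if len(v) >= min_support  # skip cells with too few documents
    ]
    # Highest average first; ties broken by the cell tuple for determinism.
    ranked.sort(key=lambda t: (-t[0], t[1]))
    return ranked[:k]


# Hypothetical toy database: two structured dimensions plus a text column.
ROWS = [
    {"brand": "Acer", "os": "Linux", "text": "light laptop long battery life"},
    {"brand": "Acer", "os": "Windows", "text": "heavy laptop short battery"},
    {"brand": "Dell", "os": "Linux", "text": "light powerful laptop"},
    {"brand": "Dell", "os": "Windows", "text": "bright screen loud fan"},
]
DIMS = ("brand", "os")

result = top_k_cells(ROWS, DIMS, ["light", "battery"], k=3, min_support=2)
for score, cell in result:
    print(cell, score)
```

This enumerates all 2^d cells that cover each row, which is only workable for a handful of dimensions; an efficient method would need to prune that search space.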
Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers
TLDR
Two query-processing algorithms are proposed: one is fast in practice for small queries (with small numbers of patterns as answers) by utilizing the indexes; and the other one is better in theory, with running time linear in the sizes of indexes and answers, which can handle large queries better.
EFFICIENT KEYWORD SEARCH IN RELATIONAL DATABASES
TLDR
Indexing makes answers easy to retrieve, and with its help the authors measure CPU performance, execution time, and disk memory consumed.
Accelerating Topic Exploration of Multi-Dimensional Documents
  • H. Wen-Jing, Lu You, Lee Zhuo Qi
  • Computer Science
    2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
  • 2017
TLDR
The strategy is to create indexes for identifying documents and to pre-compute topic models for certain subsets of documents that fall within the user-specified range; it is guaranteed to find the exact set of documents in O(lg^d n) time in the worst case.
Accelerating Topic Exploration of Multi-Dimensional Documents
TLDR
This paper aims to accelerate the computation of topic models for documents that satisfy range queries: it creates indexes for identifying documents and combines the pre-computed models associated with the canonical set of documents.
EventCube: multi-dimensional search and mining of structured and text data
TLDR
This proposed EventCube demo will show the power of the system not only on the originally designed ASRS (Aviation Safety Report System) data sets, but also on news datasets collected from multiple news agencies, and academic datasets constructed from the DBLP and web data.
Structured query reformulations in commerce search
TLDR
This study proposes reformulating queries involving terms such as designer to queries that specify precise product attributes, and learns to rewrite the modifiers to attribute values by analyzing user behavior and leveraging structured data sources such as the product catalog that serves the queries.
A Scalable Document-Based Architecture for Text Analysis
TLDR
This paper proposes a new generic text analysis architecture, where document structure is flexible, many preprocessing techniques are integrated and textual datasets are indexed for efficient access.
D-Hive: Data Bees Pollinating RDF, Text, and Time
TLDR
This paper puts forward D-Hive, a system for analytics over RDF-style (SPO) triples augmented with text and (validity/transaction) time, capable of addressing functionality and scalability requirements that current solutions cannot meet.
A NOVEL APPROACH ON SPATIAL OBJECTS FOR OPTIMAL ROUTE SEARCH USING BEST KEYWORD COVER QUERY
Optimal route search using spatial keyword queries focuses on keyword searching with the best keyword cover query, which is a form of spatial keyword query. It operates on spatial objects stored in spatial…
Cloud-Based Phrase Mining and Analysis of User-Defined Phrase-Category Association in Biomedical Publications.
TLDR
A protocol for a cloud-based environment supporting the end-to-end phrase-mining and analyses platform CaseOLAP, which successfully quantifies user-defined phrase-category relationships through the analysis of textual data.

References

TopCells: Keyword-based search of top-k aggregated documents in text cube
TLDR
This paper aims to support keyword search in a data cube with text-rich dimension(s) (so-called text cube) by proposing a relevance scoring model and efficient ranking algorithms.
Effective keyword search in relational databases
TLDR
This paper proposes a novel IR ranking strategy for effective keyword search and is the first that conducts comprehensive experiments on search effectiveness using a real world database and a set of keyword queries collected by a major search company.
Querying Communities in Relational Databases
TLDR
This paper proposes new efficient algorithms to find all/top-k communities which consume small memory, for an l-keyword query, and conducts extensive performance studies using two large real datasets to confirm the efficiency of the algorithms.
Answering top-k queries with multi-dimensional selections: the ranking cube approach
TLDR
A new computational model, called ranking cube, is proposed for efficiently answering top-k queries with multi-dimensional selections, and a rank-aware measure is defined for the cube, capturing the goal of responding to multi-dimensional ranking analysis.
BLINKS: ranked keyword searches on graphs
TLDR
BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a bi-level index for pruning and accelerating the search, and offers orders-of-magnitude performance improvement over existing approaches.
Efficient IR-Style Keyword Search over Relational Databases
TLDR
This paper adapts IR-style document-relevance ranking strategies to the problem of processing free-form keyword queries over RDBMSs, and develops query-processing strategies that build on a crucial characteristic of IR-style keyword search: only the few most relevant matches are generally of interest.
Answering aggregate keyword queries on relational databases using minimal group-bys
TLDR
This paper motivates a novel problem of aggregate keyword search, namely finding minimal group-bys that cover a set of query keywords well, which is useful in many applications, and develops two interesting approaches to tackle the problem.
Finding Top-k Min-Cost Connected Trees in Databases
TLDR
This paper proposes a novel parameterized solution, with l as a parameter, to find the optimal GST-1, in time complexity O(3^l n + 2^l((l + log n)n + m)), where n and m are the numbers of nodes and edges in graph G, which can handle graphs with a large number of nodes.
Keyword searching and browsing in databases using BANKS
TLDR
BANKS is described, a system which enables keyword-based search on relational databases, together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.