From X100 to Vectorwise: opportunities, challenges and things most researchers do not think about

@inproceedings{Zukowski2012FromXT,
  title={From x100 to vectorwise: opportunities, challenges and things most researchers do not think about},
  author={Marcin Zukowski and Peter A. Boncz},
  booktitle={Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data},
  year={2012}
}
  • M. Zukowski, P. Boncz
  • Published 2012
  • Biology, Computer Science
  • Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
In 2008 a group of researchers behind the X100 database kernel created Vectorwise: a spin-off which together with the Actian corporation (previously Ingres) worked on bringing this technology to the market. Today, Vectorwise is a popular product and one of the examples of conversion of a research prototype into successful commercial software. We describe here some of the interesting aspects of the work performed by the Vectorwise development team in the process, and discuss the opportunities…
Slingshot: A modular framework for designing data processing systems
TLDR
Introduces Slingshot, a new data processing engine in which modularity and implementation flexibility are the top priority, and shows that Slingshot outperforms the RDBMS in most cases while performing comparably in others.
A study of PosDB Performance in a Distributed Environment
PosDB is a new disk-based distributed column-store relational engine intended for research purposes. It uses the Volcano pull-based model and late materialization for query processing, and join indexes…
A Novel Index Method for Write Optimization on Out-of-Core Column-Store Databases
TLDR
This thesis extends previous research on write optimization in out-of-core column-store databases by examining a new storage model, the Timestamped Binary Association Table (TBAT), a new update designed to leverage the TBAT, and a new type of B-Tree, the Offset B-Tree (OB-tree).
Multi-level Parallel Query Execution Framework for CPU and GPU
TLDR
This work uses just-in-time compilation to execute whole OLAP queries on the GPU, minimizing the overhead of transfer and synchronization, and describes several patterns that can be used to build efficient execution plans and achieve the necessary parallelism.
Columnar Storage and List-based Processing for Graph Database Management Systems
TLDR
This work revisits column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs) and proposes novel ones that are optimized for GDBMSs, including a novel list-based query processor, a new data structure the authors call single-indexed edge property pages, and an accompanying edge ID scheme.
Examining database persistence of ISO/EN 13606 standardized electronic health record extracts: relational vs. NoSQL approaches
TLDR
Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when the database is extremely large (secondary use, research applications).
Management of Flexible Schema Data in RDBMSs - Opportunities and Limitations for NoSQL -
TLDR
Describes the engineering principles and practices for managing flexible schema data (FSD) in RDBMSs to meet FSD’s unique requirements and challenges, along with the limitations and issues of current practices.
Integrating Column-Oriented Storage and Query Processing Techniques Into Graph Database Management Systems
TLDR
This work revisits column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs) and proposes novel ones that are optimized for GDBMSs, including a novel list-based query processor.
Calculating Fourier Transforms in SQL
TLDR
Implementing the Discrete Fourier Transform (DFT) in SQL itself has several benefits, including direct execution of the DFT in the database system, where it can be reused for several different content processing steps, from feature extraction to query transformation and evaluation.
Concurrency in Main-Memory Database Systems
TLDR
This thesis determines the optimal mechanism for creating snapshots in main-memory systems that can be used to execute OLAP queries, and introduces tentative execution, a mechanism that allows long-running transactions to be executed efficiently.

References

MonetDB/X100: Hyper-Pipelining Query Execution
TLDR
An in-depth investigation into the reasons why database systems tend to achieve only low IPC on modern CPUs in compute-intensive application areas, and a new set of guidelines for designing a query processor, applied to the MonetDB system.
Balancing vectorized query execution with bandwidth-optimized storage
TLDR
A new database system architecture is presented, realized in the MonetDB/X100 prototype, that combines a coherent set of new architecture-conscious techniques that are designed to work well together and achieves in-memory performance often one or two orders of magnitude higher than the existing approaches.
Integration of vectorwise with ingres
TLDR
The integration of the VectorWise technology with Ingres, some of the design decisions made as part of the integration project, and the problems that had to be solved in the process are described.
The INGRES Papers: Anatomy of a Relational Database System
Positional update handling in column stores
TLDR
Describes a new data structure for maintaining positional updates to columnar databases, called the Positional Delta Tree (PDT), along with detailed algorithms for PDT/column merging, updating PDTs, and using PDTs in transaction management.
Super-Scalar RAM-CPU Cache Compression
TLDR
This work proposes three new versatile compression schemes (PDICT, PFOR, and PFOR-DELTA) that are specifically designed to extract maximum IPC from modern CPUs and compares these algorithms with compression techniques used in (commercial) database and information retrieval systems.
Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS
TLDR
This paper analyzes the performance of concurrent (index) scan operations in both record (NSM/PAX) and column (DSM) disk storage models and proposes the Cooperative Scans framework that enhances performance in such scenarios by improving data-sharing between concurrent scans.
Integration of VectorWise with Ingres. SIGMOD Record
  • 2011
The INGRES Papers: Anatomy of a Relational Database System
  • 1986