DB2 with BLU Acceleration: So Much More than Just a Column Store
@article{Raman2013DB2WB,
  title={DB2 with BLU Acceleration: So Much More than Just a Column Store},
  author={Vijayshankar Raman and Gopi K. Attaluri and Ronald Barber and Naresh Chainani and David Kalmuk and Vincent KulandaiSamy and Jens Leenstra and Sam Lightstone and Shaorong Liu and Guy M. Lohman and Timothy Malkemus and Ren{\'e} M{\"u}ller and Ippokratis Pandis and Berni Schiefer and David Sharpe and Richard Sidle and Adam J. Storm and Liping Zhang},
  journal={Proc. VLDB Endow.},
  year={2013},
  volume={6},
  pages={1080--1091}
}
DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing…
245 Citations
In-memory BLU acceleration in IBM's DB2 and dashDB: Optimized for modern workloads and hardware architectures
- Computer Science, 2015 IEEE 31st International Conference on Data Engineering
- 2015
In-memory BLU Acceleration used in IBM's DB2 for Linux, UNIX, and Windows, and now also the dashDB cloud offering, is presented, which was designed and implemented from the ground up to exploit main memory but is not limited to what fits in memory and does not require manual management of what to retain in memory.
Vectorized Bloom filters for advanced SIMD processors
- Computer Science, DaMoN '14
- 2014
This work introduces a vectorized implementation for probing Bloom filters based on gathers that eliminates conditional control flow and is independent of the SIMD length, indicating a significant performance improvement over scalar code that can exceed 3X when the Bloom filter is cache-resident.
CORES: Towards Scan-Optimized Columnar Storage for Nested Records
- Computer Science, ACM Trans. Storage
- 2019
This work presents CORES (Column-Oriented Regeneration Embedding Scheme), a design to push highly selective filters down into column-based storage engines, where each filter consists of several filtering conditions on a field, and applies this design to the nested relational model.
VIP: A SIMD vectorized analytical query engine
- Computer Science, The VLDB Journal
- 2020
VIP, an analytical query engine designed and built bottom-up from pre-compiled, column-oriented, data-parallel sub-operators and implemented entirely in SIMD, is introduced, and it is shown that VIP outperforms hand-optimized query-specific code without incurring the runtime compilation overhead.
Towards a Hybrid Design for Fast Query Processing in DB2 with BLU Acceleration Using Graphical Processing Units: A Technology Demonstration
- Computer Science, SIGMOD Conference
- 2016
This paper shows how to use Nvidia GPUs and host CPU cores for faster query processing in a DB2 database using BLU Acceleration (DB2's column store technology), and uses a dynamic design that can make use of optimizer metadata to intelligently choose a GPU kernel to run.
Rethinking SIMD Vectorization for In-Memory Databases
- Computer Science, SIGMOD Conference
- 2015
This paper presents novel vectorized designs and implementations of database operators, based on advanced SIMD operations, such as gathers and scatters, and highlights the impact of efficient vectorization on the algorithmic design of in-memory database operators, as well as the architectural design and power efficiency of hardware.
Optimistically Compressed Hash Tables & Strings in the USSR
- Computer Science, SIGMOD Rec.
- 2021
This work proposes three complementary techniques to improve this representation of hash tables, including Domain-Guided Prefix Suppression, which bit-packs keys and values tightly to reduce hash table record width, and Optimistic Splitting, which decomposes values into frequently- and infrequently-accessed value slices.
Upscaledb: Efficient integer-key compression in a key-value store using SIMD instructions
- Computer Science, Inf. Syst.
- 2017
Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation
- Computer Science, SIGMOD Conference
- 2016
Experimental evaluation of HyPer, the full-fledged hybrid OLTP & OLAP database system, shows that Data Blocks accelerate performance on a variety of query workloads while retaining high transaction throughput.
24 References
Blink: Not Your Father's Database!
- Computer Science, BIRTE
- 2011
The Blink project’s ambitious goals are to answer all Business Intelligence (BI) queries in mere seconds, regardless of the database size, with an extremely low total cost of ownership. The next generation of Blink, called BLink Ultra, or BLU, will significantly expand the “sweet spot” of Blink technology to much larger, disk-based warehouses and allow BLU to “own” the data, rather than copies of it.
Business Analytics in (a) Blink
- Computer Science, IEEE Data Eng. Bull.
- 2012
The Blink project is working on the next generation of Blink, which will expand the “sweet spot” of the Blink technology to much larger, disk-based warehouses and allow Blink to “own” the data, rather than copies of it.
How to barter bits for chronons: compression and bandwidth trade offs for database scans
- Computer Science, SIGMOD '07
- 2007
A study of how to make table scans faster by the use of a scan code generator that produces code tuned to the database schema, the compression dictionaries, the queries being evaluated and the target CPU architecture is presented.
SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units
- Computer Science, Proc. VLDB Endow.
- 2009
This paper shows that utilizing the embedded Vector Processing Units (VPUs) found in standard superscalar processors can speed up main-memory full table scans by factors, without changing the hardware architecture and thereby without additional power consumption.
Row-wise parallel predicate evaluation
- Computer Science, Proc. VLDB Endow.
- 2008
This paper proposes a new layout and processing technique for efficient one-pass predicate evaluation, using a novel evaluation strategy that evaluates column level equality, range tests, IN-list predicates, and conjuncts of these predicates simultaneously on multiple columns within a bank, and on multiple rows within a machine register.
Breaking the memory wall in MonetDB
- Computer Science, CACM
- 2008
This paper reports how research around the MonetDB database system has led to a redesign of database architecture in order to take advantage of modern hardware, and in particular to avoid hitting the memory wall.
C-Store: A Column-oriented DBMS
- Computer Science, VLDB
- 2005
Preliminary performance data on a subset of TPC-H is presented and it is shown that the system the team is building, C-Store, is substantially faster than popular commercial products.
Weaving Relations for Cache Performance
- Computer Science, VLDB
- 2001
This paper proposes a new data organization model called PAX (Partition Attributes Across), that significantly improves cache performance by grouping together all values of each attribute within each page, and demonstrates that in-page data placement is the key to high cache performance.
Constant-Time Query Processing
- Computer Science, 2008 IEEE 24th International Conference on Data Engineering
- 2008
Blink is presented, the first attempt at this goal, that runs every query as a table scan over a fully denormalized database, with hash group-by done along the way, and a scheme for evaluating a conjunction of range and equality predicates in SIMD fashion over compressed tuples.
Materialization Strategies in a Column-Oriented DBMS
- Computer Science, 2007 IEEE 23rd International Conference on Data Engineering
- 2007
This paper describes a variety of strategies for tuple construction and intermediate result representations and provides a systematic evaluation of these strategies.