These are my first notes on a non-industry (academic) paper.

I will be investigating query execution.

Paper: Thomas Neumann, "Efficiently Compiling Efficient Query Plans for Modern Hardware" (PVLDB 2011): https://www.vldb.org/pvldb/vol4/p539-neumann.pdf

1 Introduction

  • Databases typically turn a query into an algebra (plan tree). Then the plan is executed using an iterator model (Volcano-style).

  • This interface is not as CPU-efficient as it could be. The next() function of the iterator is called millions of times, function calls are virtual calls that break branch prediction, and the code has poor code locality.

  • Better CPU efficiency can be achieved with block-oriented processing, but this eliminates pipelining and increases memory bandwidth usage.

  • Proposed query compilation strategy:

    • Processing is focused on data, on keeping data in CPU registers.
    • Data is pushed, not pulled.
    • Queries are compiled using LLVM.

2 Related Work

  • Iterator model and the Volcano system
  • MonetDB: materialize intermediate results, pass around vectors of data.
  • Compilation of queries into Java Bytecode.
  • Techniques to reduce branching impact.
  • Using SIMD instructions
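
To make the iterator-model overhead concrete, here is a minimal sketch (my own illustration, not from the paper; all names are illustrative) of Volcano-style next() operators:

```python
# Illustrative sketch of the Volcano-style iterator model: every
# operator exposes next(), called once per tuple. In a real engine
# each of these is a virtual call, which is the CPU overhead
# (dispatch cost, broken branch prediction) described above.
class Scan:
    def __init__(self, rows):
        self.rows = iter(rows)

    def next(self):
        return next(self.rows, None)  # one call per tuple

class Select:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def next(self):
        # may call child.next() many times to deliver one tuple
        t = self.child.next()
        while t is not None and not self.pred(t):
            t = self.child.next()
        return t

def run(root):
    out, t = [], root.next()
    while t is not None:
        out.append(t)
        t = root.next()
    return out

result = run(Select(Scan([1, 2, 3, 4, 5, 6]), lambda x: x % 2 == 0))
# result == [2, 4, 6]
```

Every tuple that flows to the top costs at least one next() call per operator, which is exactly what the push-based compiled approach below removes.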

3 The Query Compiler

  • Architecture

    • Term definition: pipeline breaker = algebraic operator that takes an incoming tuple out of the CPU registers.

    • Full pipeline breaker = materializes all incoming tuples.

    • Push tuples towards consumer operators instead of pulling.

      • How does this work? Not via separate CPUs; the "push" happens at code-generation time, with the consumer's processing code inlined into the producer's loop.
    • Example:

      • No .next() type function calls per tuple.
      • Selections are simple if statements in loops (no function calls).
      • Good instruction locality.
      • Data is not always materialized. When it is, the data is stored in main memory.
  • Compiling Algebraic Expressions

    • Operators can be merged.

    • Operator interface:

      • produce()
      • consume(attributes, source)
    • Produce/consume interface is used only during code generation (i.e., at query compile time); it does not appear in the generated code.

    • Queries are not turned into plans, but into imperative programs (LLVM IR -> assembly).
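
A hedged sketch of how the produce/consume interface can drive code generation (illustrative Python standing in for the paper's LLVM IR; the operator wiring and names are my own):

```python
import textwrap

# Illustrative produce/consume code generator: produce() asks an
# operator to emit the code that obtains tuples; consume() asks it
# to emit the code that handles one tuple pushed up from its child.
class Scan:
    def __init__(self, table_name):
        self.table_name, self.parent = table_name, None

    def produce(self):
        body = self.parent.consume("t", self)
        return f"for t in {self.table_name}:\n" + textwrap.indent(body, "    ")

class Select:
    def __init__(self, pred_src):
        self.pred_src, self.parent = pred_src, None

    def consume(self, attrs, source):
        # a selection compiles to a plain `if`, not a function call
        body = self.parent.consume(attrs, self)
        return f"if {self.pred_src}:\n" + textwrap.indent(body, "    ")

class Output:
    def __init__(self):
        self.parent = None

    def consume(self, attrs, source):
        return f"out.append({attrs})\n"

# Wire up Output <- Select <- Scan, then ask the pipeline to produce.
scan, sel, sink = Scan("rows"), Select("t % 2 == 0"), Output()
scan.parent, sel.parent = sel, sink

code = scan.produce()
# The generated "program" is one tight loop with an inline if:
#   for t in rows:
#       if t % 2 == 0:
#           out.append(t)

out = []
exec(code, {"rows": [1, 2, 3, 4, 5, 6], "out": out})
# out == [2, 4, 6]
```

Note that Scan, Select, and Output exist only while generating code; the executed program is a single loop with no operator boundaries left in it.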

4 Code Generation

  • Generating Machine Code

    • C++ unsuitable b/c compilation takes seconds and offers no total control over the generated code.

    • Use LLVM.

    • LLVM better than assembly:

      • Strongly typed
      • Unbounded number of registers
    • Write some parts in C++ and other parts in LLVM (tuple access and processing).

    • Hot path is in LLVM.

  • Complex Operators

    • For complex operators like joins and sorts, it is better to generate multiple LLVM functions rather than one huge one.
    • Materialization is avoided where possible (because it leads to complicated code)
  • Performance tuning

    • Hashing and branching become bottlenecks.
    • A careful branch structure improves hash table lookups by >20%
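
One way to picture the multiple-functions point: a hash join naturally splits into a build function (which materializes its left input, making the join a full pipeline breaker on that side) and a probe function. A hypothetical Python sketch, not taken from the paper:

```python
from collections import defaultdict

# Hypothetical hash-join sketch, split into two functions the way
# the paper splits complex operators into multiple LLVM functions:
# build() materializes the left input, probe() streams the right
# input through without materializing it.
def build(left_rows, key):
    table = defaultdict(list)
    for t in left_rows:
        table[key(t)].append(t)   # left pipeline ends here
    return table

def probe(table, right_rows, key):
    for t in right_rows:          # right pipeline flows through
        for match in table.get(key(t), []):
            yield (match, t)

ht = build([("a", 1), ("b", 2)], key=lambda t: t[0])
joined = list(probe(ht, [("a", 10), ("c", 30)], key=lambda t: t[0]))
# joined == [(("a", 1), ("a", 10))]
```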

5 Advanced Parallelization Techniques

  • Processing multiple tuples at once: SIMD, delayed branching
  • Multi-core processing: usually solved by partitioning the input and merging the results
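
A small illustrative sketch of the partition-and-merge pattern (my own example, not from the paper):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative partition-and-merge parallelism: split the input,
# run the query pipeline on each chunk independently, then merge
# the partial results.
def pipeline(chunk):
    # stand-in for a compiled scan -> filter -> aggregate pipeline
    return sum(t for t in chunk if t % 2 == 0)

def parallel_query(rows, workers=4):
    size = max(1, len(rows) // workers)
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(pipeline, chunks))
    return sum(partials)          # merge step

total = parallel_query(list(range(1, 101)))
# total == 2550 (sum of the even numbers 2..100)
```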

6 Evaluation

  • Quad-Core CPU w/ 64GB of main memory.

  • Systems Comparison

    • Use TPC-CH

    • Table 1:

      • Query compilation w/ C++ instead of LLVM takes seconds instead of ms.
      • Query execution is slightly faster than VectorWise.
  • Code quality

    • Evaluating branch and cache effects.

    • Use valgrind (specifically callgrind).

    • Far fewer level 1 instruction cache misses than MonetDB (~1000x fewer).

    • Similar number of branch mispredicts (within 0.1-10x)

    • At least 5x fewer cache misses at level 1 and level 2.