Sliding Window Join

Window joins and window functions are implemented by Flink, Spark, and other big data tools. Below, I give a use case involving two streams, “metrics” and “events”. In an ideal situation, machine-generated data would automatically alert the right set of people when a problem occurs. To do so, we need various “aggregations” and “joins” to connect a metric to the relevant employee. In my experience, window computations can make big data pipelines CPU-bound or memory-bound. Here I do a quick investigation of why that is the case. ...
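As a rough illustration of the idea, here is a minimal in-memory sketch of a time-window join between a “metrics” stream and an “events” stream. The struct names, the fields, and the window_join function are all hypothetical, and this is not the Flink or Spark API. It buffers every event by host and then scans the buffer for each metric, which hints at why such joins can become memory-bound (the buffer) or CPU-bound (the scan).

```rust
use std::collections::HashMap;

/// Hypothetical metric record: a measurement taken on a host at time `ts`.
#[derive(Debug, Clone, PartialEq)]
pub struct Metric {
    pub host: String,
    pub ts: u64,
    pub value: f64,
}

/// Hypothetical event record: links a host to an owner at time `ts`.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub host: String,
    pub ts: u64,
    pub owner: String,
}

/// Joins metrics to events on `host` when their timestamps are at most
/// `window` apart. All events are buffered in memory, grouped by host.
pub fn window_join(metrics: &[Metric], events: &[Event], window: u64) -> Vec<(Metric, Event)> {
    // Build the per-host buffer of events (the memory cost of the join).
    let mut by_host: HashMap<&str, Vec<&Event>> = HashMap::new();
    for e in events {
        by_host.entry(e.host.as_str()).or_default().push(e);
    }

    // Probe the buffer for every metric (the CPU cost of the join).
    let mut out = Vec::new();
    for m in metrics {
        if let Some(candidates) = by_host.get(m.host.as_str()) {
            for &e in candidates {
                if m.ts.abs_diff(e.ts) <= window {
                    out.push((m.clone(), e.clone()));
                }
            }
        }
    }
    out
}
```

Calling window_join(&metrics, &events, 10) returns every (metric, event) pair on the same host whose timestamps differ by at most 10 time units. A real streaming engine would also evict events that have fallen out of the window, which this sketch does not do.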

August 21, 2024

JSON in Columnar Storage

Here is a ~100 LOC implementation of support for storing JSON as columns. This is one of the small projects I did while learning Rust. The implementation is a simpler, unoptimized version of the approach described in the Snowflake Data Warehouse paper. In certain real-world cases, both insert performance and query performance on JSON data can be similar to inserting and querying int64/float64. The C++ simdjson package supports >1GB/s/core JSON parsing. The Rust package used here should have similar performance. So, in cloud environments where we persist data to block or object storage, network bandwidth may be the bottleneck for writes, not CPU. ...

August 20, 2024

Dao De Jing 18

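大道废,有仁义, 智慧出,有大伪, 六亲不和,有孝慈, 国家混乱,有忠臣。
da dao fei, you ren yi. zhi hui chu, you da wei. liu qin bu he, you xiao ci. guo jia hun luan, you zhong chen.
だいどうはい、ゆうじんぎ、 ちえしゅつ、ゆうだいぎ、 ろくしんふわ、ゆうこうじ、 こっかこんらん、ゆうちゅうしん。
Đại đạo phế, hữu nhân nghĩa. Trí huệ xuất, hữu đại ngụy. Lục thân bất hoà, hữu hiếu từ. Quốc gia hỗn loạn, hữu trung thần.
If the great Dao were not abandoned, there would be no benevolence and righteousness. Is there a way to prevent the abandonment of the great Dao? If cleverness did not emerge, there would be no great hypocrisy. If the six relations were in harmony, there would be no filial piety and parental love. Is there a way to prevent discord among the six relations? If the state were not in chaos, there would be no loyal ministers, only ministers, and perhaps not even ministers. Perhaps this can be simplified: with two people and one thing, there is competition. Reducing competition is useful, but it also has side effects.
Benevolence and righteousness, wisdom, filial piety and parental love, the state, loyal ministers. These are all basic concepts from ancient times. If the basic problems were solved, would we still need to know these basic concepts?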

August 19, 2024

Expectation while Learning Rust

Writing (code) in an unfamiliar language feels a bit weird. By adjusting one’s mindset and expectations, one can reduce the weirdness. Here are some expectations I use to have a good time while coding:
Sometimes you can code with self-control; other times you can only code by instinct.
Human-generated code will always have defects. Compilers, tests, code reviews, and users can catch those defects. The “users” of code are anyone who interacts with it. The most common “users” are customers (of the software product), developers, QA, and SREs.
Typing code is not always the bottleneck; thinking often is. In some cases, waiting on compilation or test cases is the bottleneck.
Code quality is not straightforward. When writing personal code, “quality” is highly related to the rate at which I learn. Even if the code does not work, if it helps me learn more quickly, then it is useful.
Rust provides some “safe defaults”. You need to write let mut in order to experiment with mutability. You need to include the keyword unsafe to do pointer arithmetic. ...
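To make the “safe defaults” concrete, here is a small sketch showing both keywords. The function names are made up for illustration. Without mut on the accumulator the first function would not compile, and the pointer offset and dereference in the second function only compile inside an unsafe block.

```rust
/// Sums a slice with a mutable accumulator.
/// Writing `let total = 0;` (no `mut`) would make `total += *x` a compile error.
pub fn sum_with_mut(values: &[i64]) -> i64 {
    let mut total = 0;
    for x in values {
        total += *x;
    }
    total
}

/// Reads the element at `index` using raw pointer arithmetic.
/// Returns None when the index is out of bounds.
pub fn read_via_pointer(values: &[i64], index: usize) -> Option<i64> {
    if index >= values.len() {
        return None;
    }
    let base: *const i64 = values.as_ptr();
    // SAFETY: `index` was checked against the slice length above, so the
    // offset pointer stays inside the slice and points at a valid i64.
    let value = unsafe { *base.add(index) };
    Some(value)
}
```

In everyday code, values.get(index) does the same job as read_via_pointer without any unsafe; the raw pointer version is here only to show where the keyword becomes mandatory.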

August 17, 2024

Query Compiler Paper Notes

These are my first notes on a non-industry paper. I will be investigating query execution.
https://www.vldb.org/pvldb/vol4/p539-neumann.pdf
1 Introduction
Databases typically turn a query into an algebraic plan (a plan tree). The plan is then executed using an iterator model (Volcano-style). This interface is not as CPU-efficient as it could be: the next() function of the iterator is called millions of times, the calls are usually virtual calls that hurt branch prediction, and the code has poor locality. ...
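For readers who have not seen the iterator model, here is a minimal sketch of a Volcano-style pipeline. The Operator trait, the Scan and Filter operators, and collect_all are hypothetical names chosen for illustration, not code from the paper. Each operator pulls one row at a time from its child through a Box<dyn Operator>, so every row costs at least one dynamically dispatched next() call per operator.

```rust
/// A row is just a vector of integers in this sketch.
pub type Row = Vec<i64>;

/// Volcano-style interface: each call to `next` yields at most one row.
pub trait Operator {
    fn next(&mut self) -> Option<Row>;
}

/// Leaf operator that returns rows from an in-memory table.
pub struct Scan {
    rows: Vec<Row>,
    pos: usize,
}

impl Scan {
    pub fn new(rows: Vec<Row>) -> Self {
        Scan { rows, pos: 0 }
    }
}

impl Operator for Scan {
    fn next(&mut self) -> Option<Row> {
        let row = self.rows.get(self.pos).cloned();
        if row.is_some() {
            self.pos += 1;
        }
        row
    }
}

/// Operator that keeps only the child rows matching a predicate.
pub struct Filter {
    child: Box<dyn Operator>,
    predicate: fn(&Row) -> bool,
}

impl Filter {
    pub fn new(child: Box<dyn Operator>, predicate: fn(&Row) -> bool) -> Self {
        Filter { child, predicate }
    }
}

impl Operator for Filter {
    fn next(&mut self) -> Option<Row> {
        // One dynamically dispatched call into the child per input row.
        while let Some(row) = self.child.next() {
            if (self.predicate)(&row) {
                return Some(row);
            }
        }
        None
    }
}

/// Drives the plan by pulling rows from the root until it is exhausted.
pub fn collect_all(mut root: Box<dyn Operator>) -> Vec<Row> {
    let mut out = Vec::new();
    while let Some(row) = root.next() {
        out.push(row);
    }
    out
}
```

A plan is built by nesting operators, for example a Filter wrapping a Scan, and then handing the root to collect_all. The per-row next() calls through trait objects are the overhead that the excerpt above describes.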

August 17, 2024

Dao De Jing 1

道可道,非常道, 名可名,非常名。 无名,万物之始, 有名,万物之母。 故常无欲,以观其妙, 常有欲,以观其徼。 两者同出,异名同谓, 玄之又玄,众眇之门。
dao ke dao, fei chang dao, ming ke ming, fei chang ming. wu ming, wan wu zhi shi. you ming, wan wu zhi mu. gu chang wu yu, yi guan qi miao. chang you yu, yi guan qi jiao. liang zhe tong chu, yi ming tong wei, xuan zhi you xuan, zhong miao zhi men.
Đạo khả đạo, phi thường đạo, danh khả danh, phi thường danh. ...

August 9, 2024

Pure Storage Paper Notes

https://crss.us/media/pubs/1f0c405f9fa2cc9de23a45710fa85b9e7330a958.pdf
Pure Storage is another company with an industry paper. Pure Storage has a market cap of ~$20B USD.
Abstract
The paper describes the foundation of an all-flash enterprise storage system (Purity). Purity uses SSDs, then adds deduplication and data compression on top of them.
1 Introduction
Flash (SSDs) is becoming cheaper. However, enterprise users have higher resiliency, availability, and cost requirements. Purity is an enterprise storage system. Data is stored in “mediums”, a virtual/logical mapping abstraction. The system is designed around fast random reads and sequential writes. Note that this is different from HDDs, which prefer both sequential reads and sequential writes. The authors claim they have cost parity with disk. Purity uses log-structured indexes and data layouts. They also use striping and Reed-Solomon erasure coding. Also noted: controller high availability; 5.4x data reduction.
2 Background
SSDs: ...

August 8, 2024

Snowflake Paper Notes

My first few paper summaries will be on industry papers. The first one is on a product that I used at work, built by a company worth ~$40B.
https://event.cwi.nl/lsde/papers/p215-dageville-snowflake.pdf
Bolded terms are ones I am unfamiliar with and may review later on.
1 Introduction
Snowflake is a data warehouse. As discussed later, the user interface is SQL, used through a database driver or through the UI. If the reader is unfamiliar with data warehousing, they may find the related work section relevant. Snowflake’s competition at the time included Redshift, BigQuery, and Hadoop. ...

August 5, 2024

TiDB Paper Notes

The TiDB paper is an industry paper with authors from a Chinese background. Unlike Snowflake, the company behind TiDB has not gone public.
https://www.vldb.org/pvldb/vol13/p3072-huang.pdf
1 Introduction
TiDB is an HTAP database, so it can handle both transactional queries and analytical queries. TiDB resembles a SQL RDBMS. Users interact with it through SQL (using a SQL driver, writing SQL DML, etc.). The authors mention the following contributions: a Raft-based HTAP database that provides high availability, consistency, scalability, data freshness, and isolation (in other words, the system performs well on availability, consistency, and so on); a Raft learner role; a multi-Raft storage system; a new SQL engine that can choose between a column-store table scan and a row-store scan or index; and benchmarks using CH-benCHmark.
2 Raft-Based HTAP
Data is stored in multiple Raft groups. ...

August 4, 2024