Posts

Diagram of a multi‑level LSM tree with compaction arrows.

Optimizing Log-Structured Merge Trees for Write-Intensive Distributed Databases

A deep dive into LSM tree internals for write‑heavy clusters, with real‑world patterns from RocksDB, Cassandra, and ScyllaDB.

RocksDB compaction diagram on a server rack

Optimizing LSM-Tree Compaction in RocksDB: A Deep Dive into Write Amplification and Performance Tuning

A practical guide to reducing RocksDB write amplification through compaction tuning, with concrete configuration patterns and real‑world performance data.

Illustration of memory arenas and thread caches in a multi‑core server.

Deep Dive into jemalloc Arenas and Thread Caches: Architecture, Scalability, and Memory Management Patterns

A technical walkthrough of jemalloc’s arena and thread‑cache subsystems, showing how they achieve low contention and high throughput in real‑world services.

Diagram of tri‑color marking stages overlaid on a memory heap.

Implementing Concurrent Garbage Collection: Deep Dive into Tri-Color Marking for Low-Latency Memory Management

A practical guide to building a concurrent garbage collector using tri‑color marking, covering core invariants, integration with the JVM and Go runtimes, and real‑world performance tuning.

Implementing WebGPU-Accelerated Quantization for Local Llama Inference: A Deep Dive into Browser-Based Performance

A step‑by‑step guide that shows engineers how to combine WebGPU with weight quantization to run Llama locally, complete with code snippets and production‑grade patterns.