From 21 to 25 FPS: Profiling and Optimizing a Headless Unity Simulation Pipeline

A detailed technical walkthrough of profiling and optimizing a headless Unity simulation from 21 to 25 FPS — covering NVENC GPU encoding, batch camera rendering, GPU instancing, static batching, and the failed URP migration. Every measurement, every dead end, every lesson.

April 4, 2026 · 8 min · Pavel Guzenfeld

Fixing O(N²) Entity Addition in ROS 2's CallbackGroup

How a simple erase-remove in every add_timer() call turned entity registration into a quadratic bottleneck — and the 71x speedup from moving cleanup to the right place.

March 23, 2026 · 4 min · Pavel Guzenfeld

Upgrading Eigen's Householder Right-Side Application from BLAS-2 to BLAS-3

Eigen’s blocked Householder path only existed for left-side application. I added the right-side equivalent, upgrading M*Q from O(n) rank-1 updates to cache-friendly blocked matrix multiplies.

March 23, 2026 · 4 min · Pavel Guzenfeld

How GCC's std::fill_n Silently Regressed Eigen's AutoDiffScalar Performance

A performance optimization in Eigen’s fill path assumed all scalar types are equal. GCC’s libstdc++ disagreed — and AutoDiffScalar paid the price.

March 20, 2026 · 4 min · Pavel Guzenfeld