[Image: Cross-process zero-copy NVMM IPC on Jetson: dma-buf fd passing, NvBufSurfaceImport, lock-free pool]

Cross-Process Zero-Copy on Jetson: dma-buf fds, NvBufSurfaceImport, and a Cache-Line-Padded Pool

Two processes on a Jetson, one camera frame in NVMM (GPU memory), no copies. The kernel does the heavy lifting via dma-buf fds; SCM_RIGHTS carries the fd across the process boundary; NvBufSurfaceImport reconstructs the surface on the consumer side; a cache-line-padded ring of atomic ref-counts keeps fan-out coherent without locks. With benchmark numbers and a Godbolt-runnable demo of the SCM_RIGHTS pattern.

April 25, 2026 · 22 min · Pavel Guzenfeld
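
The fd-passing step named in the teaser above can be illustrated with a minimal sketch. It is not the post's own demo: a temporary file stands in for the dma-buf fd, the NvBufSurfaceImport step is not shown, and `send_fd`, `recv_fd`, and `run_demo` are hypothetical helper names. It assumes Linux and Python 3.9 or newer, where `socket.send_fds` and `socket.recv_fds` are available.

```python
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int) -> None:
    # At least one byte of ordinary data must accompany the ancillary
    # SCM_RIGHTS message that carries the descriptor.
    socket.send_fds(sock, [b"\0"], [fd])


def recv_fd(sock: socket.socket) -> int:
    # Read the 1-byte marker and accept at most one descriptor.
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if len(fds) != 1:
        raise RuntimeError("expected exactly one file descriptor")
    return fds[0]


def run_demo(payload: bytes = b"frame-0") -> bytes:
    producer_sock, consumer_sock = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    pid = os.fork()
    if pid == 0:
        # Consumer process: it never opened the buffer itself, so the only
        # way it can reach the data is through the descriptor it receives.
        status = 1
        try:
            producer_sock.close()
            fd = recv_fd(consumer_sock)
            data = os.pread(fd, len(payload), 0)  # read from offset 0
            consumer_sock.sendall(data)           # echo back as proof
            os.close(fd)
            status = 0
        finally:
            os._exit(status)

    # Producer process: create the buffer after the fork, so the consumer
    # cannot have inherited it, then hand the descriptor over the socket.
    consumer_sock.close()
    with tempfile.TemporaryFile() as buf:  # stand-in for a dma-buf (assumption)
        buf.write(payload)
        buf.flush()
        send_fd(producer_sock, buf.fileno())
        echoed = producer_sock.recv(len(payload))
    os.waitpid(pid, 0)
    producer_sock.close()
    return echoed


if __name__ == "__main__":
    print(run_demo())
```

The descriptor number the consumer receives generally differs from the producer's, but both refer to the same open file description in the kernel, which is why no payload bytes travel over the socket apart from the echo used here as a check.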
[Image: O3DE multi-camera rendering performance analysis]

Chasing 18 Milliseconds: A Performance Deep Dive into O3DE's Render Readback Pipeline

We spent a full session systematically profiling O3DE’s multi-camera streaming pipeline and testing eight different optimization approaches, and we pinpointed the exact bottleneck: 18 ms of fixed overhead in the AttachmentReadback scope system. Here’s what we tried, what we measured, and what it means for the engine.

April 17, 2026 · 7 min · Pavel Guzenfeld
[Image: Three live Godot camera streams over RTP/UDP rendered by GStreamer clients]

From Unity to Godot: Multi-Camera Streaming at 50 FPS with Async GPU Readback

After O3DE’s 18 ms frame-graph readback made 30 FPS streaming impossible, we tried Godot. It got us there — eventually. This is the full path from 105 FPS on nothing to 50 FPS per camera with three live RTP streams, including every wrong turn and every underdocumented Godot behavior we hit on the way.

April 17, 2026 · 12 min · Pavel Guzenfeld
[Image: O3DE rendering a ground plane from a camera spawned programmatically inside a headless Docker container]

From Unity to O3DE: Multi-Camera Streaming at 1080p in a Headless Docker Container

Exploring whether O3DE can replace Unity as the render engine for a drone simulation that streams multiple 1080p camera feeds via GStreamer. From first scaffold to three live RenderToTexture pipelines in a single session.

April 16, 2026 · 6 min · Pavel Guzenfeld
[Image: FPS optimization pipeline for headless Unity simulation]

From 21 to 25 FPS: Profiling and Optimizing a Headless Unity Simulation Pipeline

A detailed technical walkthrough of profiling and optimizing a headless Unity simulation from 21 to 25 FPS — covering NVENC GPU encoding, batch camera rendering, GPU instancing, static batching, and the failed URP migration. Every measurement, every dead end, every lesson.

April 4, 2026 · 8 min · Pavel Guzenfeld