We built a CUDA emulator that profiles GPU code with zero hardware

(rightnowai.co)

5 points | by rightnow_ai 12 hours ago

3 comments

rightnow_ai 12 hours ago
Quick context on what this actually does.
This is not static analysis. It runs your CUDA kernel in a CPU-backed simulator and predicts how it will behave on real GPUs.
Basically, it uses a tile model tied to L2 size and SM limits.
Right now it covers 80+ NVIDIA architectures, and the mean error on execution time is around 1–2% on the test kernels we wrote (more info in the blog).
It still struggles with dynamic parallelism, but I will figure that out soon.
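For anyone unfamiliar with the idea, here is a rough Python sketch of what a tile model tied to L2 size and SM limits could look like, plus how a mean error on execution time is typically computed. This is only an illustration, not the emulator's actual model: `GpuSpec`, its fields, and every function name are hypothetical, and the numbers are made up.

```python
import math
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Hypothetical GPU description; not the emulator's real data model."""
    num_sms: int            # number of streaming multiprocessors
    max_blocks_per_sm: int  # resident thread blocks allowed per SM
    l2_bytes: int           # L2 cache capacity in bytes


def blocks_per_wave(gpu: GpuSpec) -> int:
    # A "wave" (tile of blocks) is everything that can be resident at once.
    return gpu.num_sms * gpu.max_blocks_per_sm


def count_waves(total_blocks: int, gpu: GpuSpec) -> int:
    # The grid runs as a sequence of waves; the last one may be partial.
    return math.ceil(total_blocks / blocks_per_wave(gpu))


def wave_fits_in_l2(bytes_per_block: int, gpu: GpuSpec) -> bool:
    # Assumption: a wave is cache-friendly if its combined working set
    # fits in L2. Real hardware behavior is more complicated than this.
    return blocks_per_wave(gpu) * bytes_per_block <= gpu.l2_bytes


def mean_relative_error(predicted, measured):
    # Average of |predicted - measured| / measured over all kernels.
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


# Made-up toy GPU: 4 SMs, 2 blocks per SM, 1 KiB of L2.
toy = GpuSpec(num_sms=4, max_blocks_per_sm=2, l2_bytes=1024)
waves = count_waves(20, toy)           # 20 blocks at 8 per wave
fits = wave_fits_in_l2(128, toy)       # 8 blocks * 128 bytes vs 1024 bytes
err = mean_relative_error([101.0, 98.0], [100.0, 100.0])
```

The point of the sketch is only that SM limits bound how many blocks run concurrently, and L2 size bounds how much of that concurrent working set stays cached. Any timing prediction built on top of that is outside what this example covers.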
12 hours ago
[deleted]
12 hours ago
[deleted]