1 points | by dev-on-bike 13 hours ago ago
2 comments
For me, the interesting part here isn't GPT-2, it's the memory discipline. I feel like most inference runtimes slowly leak allocations everywhere as features pile up.
[dead]
For me, the interesting part here isn't GPT-2, it's the memory discipline. I feel like most inference runtimes slowly leak allocations everywhere as features pile up.
[dead]