Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking

(arxiv.org)

2 points | by sva_ 2 hours ago

No comments yet.