Is One Layer Enough? A Single Transformer Layer Matches Full-Parameter RL Train

(arxiv.org)

99 points | by tcp_handshaker 6 hours ago

23 comments