Exploration Hacking: Can LLMs Learn to Resist RL Training?

(alignmentforum.org)

2 points | by Prof_Sigmund 7 hours ago
