OpenAI: Investigating the consequences of accidentally grading CoT during RL

(alignment.openai.com)

2 points | by pretext 11 hours ago

No comments yet.