Super HN - Super Hacker News

Does RL Incentivize Reasoning in LLMs Beyond the Base Model? (limit-of-rlvr.github.io) 1 point by leodriesch 0 minutes ago