Super HN
Reinforcement Learning from Human Feedback
(arxiv.org)
7 points by onurkanbkrc 46 minutes ago