b
Discover
Models
Search
About
Sequential Preference Ranking for Efficient Reinforcement Learning from Human Feedback
2023
·
NeurIPS