Group Robust Preference Optimization in Reward-free RLHF
1 week ago · NeurIPS