Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
2023 · NeurIPS