Theoretical guarantees on the best-of-n alignment policy
11 months ago · arXiv