Theoretical guarantees on the best-of-n alignment policy
11 months ago · arXiv