Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
2 weeks ago · NeurIPS