Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
1 week ago · NeurIPS