Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach
7 months ago · arXiv