Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training