Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning | Read Paper on Bytez