SWAT: Scalable and Efficient Window Attention-based Transformers Acceleration on FPGAs
