A Theoretical Study of (Hyper) Self-Attention through the Lens of Interactions: Representation, Training, Generalization