SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling