MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers