Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers