Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention