Transformer Key-Value Memories Are Nearly as Interpretable as Sparse Autoencoders

