Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models