How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model