Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming