LTD-Bench: Evaluating Large Language Models by Letting Them Draw