Benchmarking Large Language Models with Integer Sequence Generation Tasks