Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage