Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP