Should Under-parameterized Student Networks Copy or Average Teacher Weights?