Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale