Learning curves theory for hierarchically compositional data with power-law distributed features