Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition