Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
2019 · arXiv