Q8BERT, a Quantized 8bit Version of BERT-Base – Intel AI

This work presents a method for quantizing BERT-base to 8 bits that achieves the best-in-class compression-accuracy ratio. We have open-sourced the quantization method and the code for reproducing the 8-bit quantized models; both are available in NLP Architect release 0.5.

Source: www.intel.ai/q8bert/