Quantization

Techniques for converting floating-point weights to lower-precision formats such as INT8.

  • INT8 Quantization
  • Mixed Precision
  • Post-Training Quantization (PTQ)
  • Dynamic vs Static Quantization
  • Quantization-Aware Training (QAT)
  • Impact on Inference Speed
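
As a rough illustration of what converting floating-point weights to lower precision can look like, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The function names `quantize_int8` and `dequantize_int8` are hypothetical, and the symmetric scheme with a single scale per tensor is an assumption made for this example; the topics listed above may use different schemes (for instance, asymmetric ranges with a zero point, or per-channel scales).

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the range [-127, 127].

    Hypothetical helper for illustration only.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the INT8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

print(codes)     # integer codes stored in place of the floats
print(restored)  # approximate reconstruction of the original weights
```

Each restored value differs from the original by at most about half the scale, which is the rounding error this kind of scheme accepts in exchange for storing one byte per weight instead of four.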