Quantization
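Quantization converts floating-point weights (and often activations) to lower-precision formats such as INT8, trading a small amount of accuracy for a smaller model and cheaper arithmetic.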
Key Concepts
- INT8 Quantization
- Mixed Precision
- Post-Training Quantization (PTQ)
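
As a concrete illustration of INT8 post-training quantization, here is a minimal sketch in plain Python. It assumes symmetric, per-tensor quantization with a single scale factor, which is only one of several possible schemes. The function names `quantize_int8` and `dequantize` are hypothetical and not taken from any library.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (an assumed scheme for this sketch).

    Maps floats to integers in [-127, 127] using one shared scale.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(q, scale, max_error)
```

Because the weights are already trained, no gradient updates are involved here: the only inputs are the weight values themselves, which is what distinguishes PTQ from training-time approaches.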
To Explore
- Dynamic vs Static Quantization
- Quantization-Aware Training (QAT)
- Impact on inference speed
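
To make the dynamic vs static distinction concrete, the sketch below contrasts how the activation scale could be chosen in each case. It assumes symmetric INT8 quantization and a toy calibration set. All names (`calibrate_static_scale`, `dynamic_scale`, and so on) are hypothetical helpers for illustration, not a real library API.

```python
def scale_from_range(max_abs):
    """Scale that maps [-max_abs, max_abs] onto [-127, 127]."""
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Round to the nearest integer code and clip to the INT8 range used here."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def calibrate_static_scale(calibration_batches):
    """Static: fix the scale ahead of time from representative data."""
    max_abs = max(abs(v) for batch in calibration_batches for v in batch)
    return scale_from_range(max_abs)


def dynamic_scale(activations):
    """Dynamic: derive the scale from the current input at inference time."""
    return scale_from_range(max(abs(v) for v in activations))


calibration = [[0.2, -0.8], [1.0, 0.4]]   # toy calibration data (assumed)
static_scale = calibrate_static_scale(calibration)

incoming = [2.0, -0.4]                    # exceeds the calibrated range
q_static = quantize(incoming, static_scale)
q_dynamic = quantize(incoming, dynamic_scale(incoming))

# The static scale saturates on the out-of-range value, so it decodes too low.
static_decoded = q_static[0] * static_scale
dynamic_decoded = q_dynamic[0] * dynamic_scale(incoming)
print(q_static, q_dynamic, static_decoded, dynamic_decoded)
```

The trade-off this is meant to show: a static scale costs nothing at inference time but clips inputs outside the calibrated range, while a dynamic scale adapts to each input at the price of an extra pass over the activations.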