Media Summary: SmoothQuant - Accurate and Efficient Post-Training Quantization for Large Language Models. Large language models (LLMs) show excellent performance but are compute- and memory-intensive. Quantization can reduce ... In this video, we look into SmoothQ Algorithm and Paper: Paper: Pseudocode Open Source ...
SmoothQuant - Detailed Analysis & Overview
Seminar date: 2024.07.05. Seminar contents: Paper Review Seminar. Paper title: Xiao, Guangxuan, et al. "
00:00 Introduction to LLM Quantization
02:15 What is Quantization?
04:45 Post-Training Quantization (PTQ) vs. QAT
07:30 GPTQ ...
Pseudo-lab EfficientLLM study. Presenter: 김승우. Date: 2025/09/30. Paper:
Quantization is an excellent technique for compressing large language models (LLMs) and accelerating their inference. In this video ...