VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference
Conference
Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Chen Jin, Jingwen Leng.
1 March 2025
HPCA