VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference

Conference
Authors

Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Chen Jin, Jingwen Leng.

Published

1 March 2025

Publication details

HPCA
