Aodos: Affinity-aware Orchestration and Deterministic Operator Overlap for Simultaneous DNN Services in the GPU Cluster

Authors

Weihao Cui*, Chunyu Xue*, Han Zhao, Quan Chen, Minyi Guo

Published

1 January 2024

Publication details

Working paper
