Publications
Papers
- Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo. (SoCC 2023)
- Improving Cluster Utilization through Adaptive Resource Management for DNN and CPU Jobs Co-location. (TC 2023)
- Optimizing Dynamic Neural Networks with Brainstorm. (OSDI 2023)
- AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs. (CF 2023)
- ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-grained Resource Management. (TC 2022)
- DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs. (ATC 2022)
- PAME: Precision-Aware Multi-Exit DNN Serving for Reducing Latencies of Batched Inferences. (ICS 2022)
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. (HPCA 2022)
- Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction. (SC 2021)
- Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks. (ICCD 2021)
- Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs. (TC 2021)
- E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. (TPDS 2020)
- CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs. (ICDCS 2020)
- Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. (ICCD 2019)
- Laius: Towards Latency Awareness and Improved Utilization of Spatial Multitasking Accelerators in Datacenters. (ICS 2019)