Professor
School of Computer Science and Engineering
Sino-German Joint Software Institute (JSI)
Beihang University
I am a Professor in the School of Computer Science and Engineering at Beihang University. I received my B.S. and Ph.D. degrees under the supervision of Prof. Depei Qian. I was also a postdoctoral researcher in the Department of Computer Science and Engineering at the University of Michigan. My research interests include high performance computing, performance analysis and optimization, deep learning systems and compilation, and parallel and distributed computing. My recent research investigates a holistic approach to cross-stack optimization for high performance, high scalability, and high portability, with special interest in large-scale elastic training systems, deep learning compilation and auto-tuning techniques, sparse tensor optimization, exascale performance analysis tools, and high performance linear algebra for emerging processors. I have authored over 80 scientific publications in leading international journals and conferences. I received the Excellence Teaching Award from Beihang University in 2016.
I have served as a committee member of the CCF Doctoral Dissertation Incentive Program and of the Youth Editorial Board of CCF Transactions on High Performance Computing (CCF THPC). I was the architecture area program co-chair of the 23rd IEEE International Conference on Cluster Computing (CLUSTER), 2021. I currently serve as a reviewer for premier journals including TPDS, TC, PARCO, JPDC, FGCS, and FCS. I am also the supervisor of the Beihang Supercomputing Team, which has won the Silver Prize of ASC’17, the Bronze Prize of ISC’17, and the Highest Linpack Award, Application Innovation Award, and First Class Award of ASC competitions.
🔥 News
- 🔥 August 2024: One paper (RecServ) is accepted to TC. Congratulations to Xin You.
- 🔥 June 2024: Two papers (Moirae and GVARP) are accepted to SC. Congratulations to Xiaoyan Liu and Xin You.
- 🔥 June 2024: Two papers (PRoof and Jigsaw) are accepted to ICPP. Congratulations to Siyu Wu and Kaige Zhang.
- 🔥 March 2024: One paper (AtRec) is accepted to TPDS. Congratulations to Siqi Wang and Tianyu Feng.
- 🔥 March 2024: Our paper Tetris is selected as a best paper candidate (three papers in total) at PPoPP 2024. Congratulations to Xiaoyan Liu.
- 🔥 November 2023: One paper (Tetris) is accepted to PPoPP (best paper candidate). Congratulations to Xiaoyan Liu.
- 🔥 October 2023: One paper (GSTuner) is accepted to TPDS. Congratulations to Qingxiao Sun.
- 🔥 June 2023: Two papers (EasyScale and TrivialSpy) are accepted to SC. Congratulations to Mingzhen Li and Xin You.
- 🔥 June 2023: One paper (FamilySeer) is accepted to ICPP. Congratulations to Mingzhen Li.
- 🔥 May 2023: One TC paper is selected as IEEE Computer’s “Spotlight on Transactions”. Congratulations to Qingxiao Sun.
- 🔥 April 2023: One paper (BiRFIA) is accepted to ICS. Congratulations to Kelun Lei.
- 🔥 March 2023: One paper (swLego) is accepted to SCIS. Congratulations to Mingzhen Li.
- 🔥 December 2022: One paper (Mimose) is accepted to IPDPS’23. Congratulations to Jianjin Liao and Mingzhen Li.
- 🔥 December 2022: Invited to give a talk at High Performance Sparse Matrix Computing Architecture, Algorithm and Application Symposium of CCF HPCChina on December 15th.
- 🔥 December 2022: Invited to give a talk at High Performance Computer Programming Model and Optimization Symposium of CCF HPCChina on December 15th.
- 🔥 December 2022: Invited to give a talk at High Performance Computing Middleware Symposium of CCF HPCChina on December 12th.
- 🔥 November 2022: One paper (swTVM) is accepted to FCS. Congratulations to Mingzhen Li.
- 🔥 September 2022: One paper (VClinic) is accepted to ASPLOS’23. Congratulations to Xin You.
- 🔥 August 2022: Invited to give a talk at Sensetime Beijing on August 25th.
- 🔥 August 2022: Invited to give a talk at CAE on August 17th.
- 🔥 July 2022: The student supercomputing team won the second prize of SolverChallenge22 on July 21st.
- 🔥 July 2022: Invited to give a talk at HPC Symposium of Open Scientific Computing Consortium on July 9th.
- 🔥 July 2022: One paper is accepted to PARCO. Congratulations to Qingxiao Sun.
- 🔥 June 2022: One paper (CoGNN) is accepted to SC’22. Congratulations to Qingxiao Sun.
- 🔥 June 2022: Two papers (NNLQP and DynVec) are accepted to ICPP’22. Congratulations to Xin You and Sensetime co-authors.
- 🔥 April 2022: One paper (TCstencil) is accepted to ICS’22. Congratulations to Xiaoyan Liu.
- November 2021: Two papers (StencilMART and PowerSpector) are accepted to IPDPS’22. Congratulations to Qingxiao Sun and Xin You.
- October 2021: Invited to give a talk at High Performance Computer Programming Model and Optimization Symposium of CCF HPCChina on October 23rd.
- October 2021: Invited to give a talk at High Performance Environment Maintenance and Optimization Symposium of CCF HPCChina on October 22nd.
- September 2021: One paper is accepted to TC. Congratulations to Qingxiao Sun.
- July 2021: One paper (csTuner) is accepted to CLUSTER’21. Congratulations to Qingxiao Sun.
- June 2021: One paper (MSC) is accepted to ICPP’21. Congratulations to Mingzhen Li.
- April 2021: One paper (cuTC) is accepted to ICS’21. Congratulations to Ming Dun.
- February 2021: Invited to give a keynote speech at the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores on February 27th.
- October 2020: Invited to give a talk at Huawei Computing Technique Symposium of CCF CNCC on October 22nd.
- September 2020: Invited to give a talk at High Performance Sparse Computation Architecture, Algorithm and Application Symposium of CCF HPCChina on September 9th.
- June 2020: Two papers (SpTFS and ZeroSpy) are accepted to SC’20. Congratulations to Xin You and Qingxiao Sun.
📝 Selected Publications
- 🔥 Exploiting Structured Feature and Runtime Isolation for High-Performant Recommendation Serving (TC) 2024.
- 🔥 GVARP: Detecting Performance Variance on Large-Scale Heterogeneous System (SC) 2024.
- 🔥 Moirae: Generating High-Performance Composite Stencil Programs with Global Optimizations (SC) 2024.
- 🔥 PRoof: A Comprehensive Hierarchical Profiling Framework for Deep Neural Networks with Roofline Analysis (ICPP) 2024.
- 🔥 Jigsaw: Accelerating SpMM with Vector Sparsity on Sparse Tensor Core (ICPP) 2024.
- 🔥 AtRec: Accelerating Recommendation Model Training on CPUs (TPDS) 2024.
- 🔥 Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU (PPoPP, best paper candidate) 2024.
- 🔥 Adaptive Auto-tuning Framework for Global Exploration of Stencil Optimization on GPUs (TPDS) 2023.
- 🔥 EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs (SC) 2023.
- 🔥 TrivialSpy: Identifying Software Triviality via Fine-grained and Dataflow-based Value Profiling (SC) 2023.
- 🔥 Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs (ICPP) 2023.
- 🔥 BiRFIA: Selective Binary Rewriting for Function Interception on ARM (ICS) 2023.
- 🔥 Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU (IPDPS) 2023.
- 🔥 VClinic: A Portable and Efficient Framework for Fine-grained Value Profilers (ASPLOS) 2023.
- 🔥 Building a Domain-Specific Compiler for Emerging Processors with a Reusable Approach (SCIS) 2023.
- 🔥 Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor (FCS) 2022.
- 🔥 CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs (SC) 2022.
- 🔥 Vectorizing SpMV by Exploiting Dynamic Regular Patterns (ICPP) 2022.
- 🔥 NNLQP: A Multi-Platform Neural Network Latency Query and Prediction System with An Evolving Database (ICPP) 2022.
- 🔥 Toward accelerated stencil computation by adapting tensor core unit on GPU (ICS) 2022.
- 🔥 StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs (IPDPS) 2022.
- 🔥 PowerSpector: Towards Energy Efficiency with Calling-Context-Aware Profiling (IPDPS) 2022.
- Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP (TC) 2021.
- The Deep Learning Compiler: A Comprehensive Survey (TPDS) 2021.
- Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee (TOCS) 2021.
- SpTFS: Sparse Tensor Format Selection for MTTKRP via Deep Learning (SC) 2020.
- ZeroSpy: Exploring Software Inefficiency with Redundant Zeros (SC) 2020.
- SympleGraph: Distributed Graph Processing with Precise Loop-Carried Dependency Guarantee (PLDI) 2020.
- Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture (TPDS) 2020.
- Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer (TPDS) 2020.
- Temperature-Aware DRAM Cache Management - Relaxing Thermal Constraints in 3-D Systems (TCAD) 2020.
- Redundant Loads: A Software Inefficiency Indicator (ICSE) 2019.
- LWPTool: A Lightweight Profiler to Guide Data Layout Optimization (TPDS) 2018.
- SMGuard: A Flexible and Fine-Grained Resource Management Framework for GPUs (TPDS) 2018.
- PowerChief: Intelligent Power Allocation for Multi-Stage Applications to Improve Responsiveness on Power Constrained CMP (ISCA) 2017.
- Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers (ASPLOS) 2017.
- Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers (ASPLOS) 2016.
- Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers (ISCA) 2013.
🎖 Honors and Awards
- CCF HPC Talent Young Scientist Award, 2024.
- Best Paper Award Nomination, ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), 2024. (4 nominations out of 153 submissions)
- Beihang University May 4th Medal Nomination Award, 2023.
- CCF HPCChina Workshop Distinguished Speaker, 2021.
- Beihang Distinguished Young Scholar Award, 2021.
- Best Paper Award Nomination, IEEE International Conference on Cluster Computing (CLUSTER), 2021. (2 nominations out of 168 submissions)
- CCF CNCC Workshop Distinguished Speaker, 2020.
- CCF HPCChina Workshop Distinguished Speaker, 2020.
- Best Paper Award, BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench), 2020.
- Beihang University Excellence Teaching Award, 2016.
💬 Teaching
- Methodology of Computer Science Research (Undergraduate Student)
- Parallel Programming (International Student)
- Computer Architecture (International Student)