Thursday, December 7, 2017



Dr. Ang Li is a chief computer scientist in the Physical and Computational Sciences Directorate (PCSD) of Pacific Northwest National Laboratory (PNNL) and, through a dual appointment, an Associate Professor in the ECE department of the University of Washington (UW). He received his bachelor's degree from Zhejiang University, China, in 2010, and two PhD degrees, from the ECE department of the National University of Singapore (NUS) and the EE department of Eindhoven University of Technology (TU/e), Netherlands, in 2016. He joined PNNL in November 2016 and has held a dual appointment with UW since October 2023. His research focuses on software-hardware co-design for HPC, AI/ML, and quantum computing. He has published in major HPC conferences and journals including SC, ICS, PPoPP, IPDPS, HPDC, ASPLOS, MICRO, HPCA, ICPP, CGO, IISWC, EuroPar, TPDS, TC, and ICPE. His lead-author work was nominated for the best paper award at SC-15, SC-17, IISWC-18, and SC-20. He received the European HiPEAC paper award and PNNL's PCSD Outstanding Performance award. He has served as an organizing committee or review committee member for major HPC conferences including PPoPP, SC, ASPLOS, PACT, ISCA, and IPDPS. He previously worked in industry as an HPC application developer, where he led the evaluation, development, and optimization of several industrial HPC applications. He also worked as a research intern at the INRIA-Lab at Paris-Sud University, France, and at the Chinese University of Hong Kong. His research interests include quantum computing; software-hardware co-design for HPC accelerators, particularly GPUs, and domain-specific accelerators; performance modeling and evaluation for HPC architectures and applications; and binarized neural networks.

Service (PC/ERC)

  • Conference: ISCA (25/24/23/22/21/20), MICRO (25/24/22/21), HPCA (25/24), ASPLOS (22), PLDI (25), SC (24/23/22/21), PPoPP (22/21/20/19), IPDPS (25/24/23/21), ICS (25/24), ICLR (25), CC (24/23/22), QCE (25/24/23), MLSys (23), ICPP (23), PACT (19), SBAC-PAD (23/22/21/20), Cluster (21), HiPC (22/21), HPCC (21/20), NPC (19/18)
  • Editing: TPDS (Associate Editor)
  • Journal Review: TPDS, TC, TOPC, CSUR, DB, JPDC, JSA, CAL, TACO, TNNLS, JCSC, TCAS-I/II, MICPRO, FGCS, TODAES, COSE, CVIU, Nature-NPJQI, SPE, TECS, IEEE Micro, Nature-ScientificReport, TQC, TQE, SNCS, Quantum, PRL

Team:

  • Current: 
    • Samuel Stein (staff scientist, quantum computing), Chenxu Liu (staff scientist, quantum physics), Muqing Zheng (postdoc, math & quantum), Sean Garner (PhD student, UW ECE), Meng Wang (PhD intern, UBC ECE), Nathan Myers (postdoc, physics, U Maryland), Chunshu Wu (postdoc, architecture & physics), Aaron Hoyt (PhD student, UW Physics)
  • Alumni:
    • Shifan Xu (PhD intern, 2025, Yale CS), Xiang Fang (PhD intern, 2025, UCSD CS), Ming Wang (PhD intern, 2025, NCSU CS), Chuan Liu (PhD intern, 2025, U Rochester), Anastashia Jebraeilli (PhD intern, 2025/24, GSU Physics), Xinyi Li (Postdoc & PhD intern, U Utah CS, 2025/24/23/22/21, @TogetherAI), Peiyi Li (PhD intern, NCSU ECE, 2024), Matthew Burns (PhD intern, U Rochester ECE, 2024), Nidhi Munikote (undergraduate, USC CS, 2024), Yue Shi (PhD intern, UW Physics, 2024/23, @Thermo Fisher Scientific), Zirui Mao (postdoc, Math, 2024/23, @PNNL), Habeebat Elwalily (undergraduate intern, Middlesex County College CS, 2023), Andrew Gui (high school intern, 2023, Undergraduate@UW), Fei Hua (PhD intern, Rutgers U CS, 2023/22/21, @Google), Muqing Zheng (PhD intern, Lehigh U Math, 2024/23/22/21, Postdoc@PNNL), Anbang Wu (PhD intern, UCSB CS, 2022, AP@Shanghai Jiaotong University), Zheng Wang (PhD intern, UCSB CS, 2022), Anqi Guo (PhD intern, Boston U ECE, 2022, @Microsoft), Chunshu Wu (PhD intern, Boston U ECE, 2022, Postdoc@PNNL), Kendric Hood (PhD intern, Kent State U, 2022), Hongwu Peng (PhD intern, UConn ECE, 2022, @Adobe), Yuke Wang (PhD intern, UCSB CS, 2022, AP@Rice University), Mao Lin (PhD intern, UC Merced CS, 2022), Bo Fang (postdoc 2022, AP@UTA), Luke Zhang (PhD intern, Rice U, 2021, @Meta), Cheng Tan (postdoc 2021, @Google), Chenhao Xie (postdoc 2021, AP@Beihang University), Komail Dharsee (PhD intern, U Rochester, 2020), Lele Ma (PhD intern, W&M CS, 2020), Jou-An Chen (PhD intern, NCSU CS, 2020, @Qualcomm), Linghao Song (PhD intern, Duke U, 2019, AP@Yale University), Erika Leal (PhD intern, UT Arlington, 2019, AP@Baylor University), Tong Geng (postdoc 2021 and PhD intern 2019/18, Boston U ECE, AP@Rice)

Publications on Quantum:
  • [arXiv-25] Meng Wang, Chenxu Liu, Sean Garner, Samuel Stein, Yufei Ding, Prashant Nair, and Ang Li. “Tableau-Based Framework for Efficient Logical Quantum Compilation”. arXiv:2509.02721. 2025. [arXiv][Sch]
  • [arXiv-25] Peiyi Li, Chenxu Liu, Ji Liu, Huiyang Zhou, and Ang Li. “QNPU: Quantum Network Processor Unit for Quantum Supercomputers”. arXiv:2509.02827. 2025. [arXiv][Sch]
  • [HPEC-25] Aaron Hoyt, Jonathan Bersson, Sean Garner, Chenxu Liu, and Ang Li. “Implementation of Tensor Network Simulation TN-Sim under NWQ-Sim”, IEEE High Performance Extreme Computing Conference. IEEE, 2025.
  • [arXiv-25] Many authors. “The Role of Quantum Computing in Advancing Scientific High-Performance Computing: A Perspective from the ADAC Institute”. arXiv:2508.11765. 2025. [arXiv][Sch]
  • [Algorithms-25] Muqing Zheng, Chenxu Liu, Samuel Stein, Xiangyu Li, Johannes Mulmenstädt, Yousu Chen, and Ang Li. “An Early Investigation of the HHL Quantum Linear Solver for Scientific Applications”, Algorithms. MDPI, 2025. DOI:10.3390/a18020079 [arXiv][Link][Sch]
  • [MICRO-25] Hezi Zhang, Jixuan Ruan, Dean Tullsen, Yufei Ding, Ang Li, and Travis Humble. “Resource-adaptive Compilation of Photonic One-Way Quantum Computing”, IEEE/ACM International Symposium on Microarchitecture. [Sch][arXiv]
  • [QCE-25] Ethan Decker, Evan McKinney, Erik Gustafson, Lucas Goetz, Alex Jones, Ang Li, Alexander Schuckert, Samuel Stein, Gushu Li, and Eleanor Crane. “Symbolic Hamiltonian Compiler for Hybrid Qubit-Boson Processors”, IEEE International Conference on Quantum Computing and Engineering. IEEE, 2025. [Sch][arXiv]
  • [arXiv-25] Samuel Stein, Chenxu Liu, Shuwen Kan, Eleanor Crane, Yufei Ding, Ying Mao, Alexander Schuckert, and Ang Li. “Multi-Target Rydberg Gates via Spatial Blockade Engineering”. arXiv:2504.15282. 2025. [arXiv][Sch]
  • [arXiv-25] Nicholas Bauman, Muqing Zheng, Chenxu Liu, Nathan Myers, Ajay Panyala, Bo Peng, Ang Li, and Karol Kowalski. “Coupled Cluster Downfolding Theory in Simulations of Chemical Systems on Quantum Hardware”. arXiv:2507.01199. 2025. [arXiv][Sch]
  • [arXiv-25] Ming Wang, Ang Li, Frank Mueller. “Fully Parallelized BP Decoding for Quantum LDPC Codes can Outperform BP-OSD”. arXiv:2507.00254. 2025. [arXiv][Sch]
  • [arXiv-25] Many authors. “A Perspective on Quantum Computing Applications in Quantum Chemistry using 25-100 Logical Qubits”. arXiv:2506.19337. 2025. [arXiv][Sch]
  • [arXiv-25] Avinash Kumar, Meng Wang, Chenxu Liu, Ang Li, Prashant J. Nair, Poulami Das. “Context Switching for Secure Multi-programming of Near-Term Quantum Computers”. arXiv:2504.07048. 2025. [arXiv][Sch]
  • [arXiv-25] Sean Garner, Chenxu Liu, Meng Wang, Samuel Stein, and Ang Li. “STABSim: A Parallelized Clifford Simulator with Features Beyond Simulation”. arXiv:2507.03092. 2025. [arXiv][Sch]
  • [arXiv-25] Ethan Decker, Lucas Goetz, Evan McKinney, Erik Gustafson, Junyu Zhou, Yuhao Liu, Alex Jones, Ang Li, Alexander Schuckert, Samuel Stein, Eleanor Crane, Gushu Li. “Kernpiler: Compiler Optimization for Quantum Hamiltonian Simulation with Partial Trotterization”. arXiv:2504.07214. 2025. [arXiv][Sch]
  • [ASPLOS-25] Jixuan Ruan, Xiang Fang, Hezi Zhang, Ang Li, Travis Humble, and Yufei Ding. “PowerMove: Optimizing Compilation for Neutral Atom Quantum Computers with Zoned Architecture”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2025. [Link][Sch]
  • [ISCA-25] Xiang Fang, Keyi Yin, Yuchen Zhu, Jixuan Ruan, Dean Tullsen, Zhiding Liang, Andrew Sornborger, Ang Li, Travis Humble, Yufei Ding, Yunong Shi. “CaliQEC: In-situ Qubit Calibration for Surface Code Quantum Error Correction”, International Symposium on Computer Architecture. ACM, 2025. DOI:10.1145/3695053.3731042 [Sch][Link]
  • [arXiv-25] Anastashia Jebraeilli, Chenxu Liu, Keyi Yin, Erik Lentz, Yufei Ding, and Ang Li. “STQS: A Unified System Architecture for Spatial Temporal Quantum Sensing”. arXiv:2502.17778. 2025. [arXiv][Sch]
  • [ASPLOS-25] Samuel Stein, Shifan Xu, Andrew Cross, Theodore Yoder, Ali Javadi-Abhari, Chenxu Liu, Kun Liu, Victor Zhou, Charles Guinn, Yufei Ding, Yongshan Ding, and Ang Li. “HetEC: Architectures for Heterogeneous Quantum Error Correction Codes”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2025. DOI:10.1145/3676641.3716001 [arXiv][Sch][Link]
  • [ASPLOS-25] Keyi Yin, Hezi Zhang, Xiang Fang, Yunong Shi, Travis Humble, Ang Li, and Yufei Ding. “QECC-Synth: A Layout Synthesizer for Quantum Error Correction Codes on Sparse Architecture”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2025. DOI:10.1145/3669940.3707236 [Sch][Link][arXiv]
  • [arXiv-24] Meng Wang, Chenxu Liu, Samuel Stein, Yufei Ding, Poulami Das, Prashant Nair, and Ang Li. “Optimizing FTQC Programs through QEC Transpiler and Architecture Codesign”. arXiv:2412.15434. 2024. [arXiv][Sch]
  • [QST-24] Matthew Burns, Chenxu Liu, Samuel Stein, Bo Peng, Karol Kowalski, and Ang Li. “GALIC: Hybrid Multi-Qubitwise Pauli Grouping for Quantum Computing Measurement”, Quantum Science and Technology. IOP, 2024. DOI:10.1088/2058-9565/ad9d74 [arXiv][Link][Sch]
  • [arXiv-24] Keyi Yin, Xiang Fang, Jixuan Ruan, Hezi Zhang, Dean Tullsen, Andrew Sornborger, Chenxu Liu, Ang Li, Travis Humble, Yufei Ding. “SymBreak: Mitigating Quantum Degeneracy Issues in QLDPC Code Decoders by Breaking Symmetry”. arXiv:2412.02885. 2024. [arXiv][Sch]
  • [NPJQI-24] Muqing Zheng, Bo Peng, Ang Li, Xiu Yang, Karol Kowalski. “Unleashed from Constrained Optimization: Quantum Computing for Quantum Chemistry Employing Generator Coordinate Method”, npj Quantum Information. DOI:10.1038/s41534-024-00916-8 [Link][arXiv][Sch]
  • [MICRO-24] Keyi Yin, Xiang Fang, Travis Humble, Ang Li, Yunong Shi, and Yufei Ding. “Surf-Deformer: Mitigating Dynamic Defects on Surface Code via Adaptive Deformation”, IEEE/ACM International Symposium on Microarchitecture, IEEE/ACM, 2024. [arXiv][Sch][Link]
  • [QCE-24] Shuwen Kan, Miguel Palma, Zefan Du, Samuel Stein, Chenxu Liu, Juntao Chen, Ang Li, and Ying Mao. “Benchmarking Optimizers for Qumode State Preparation with Variational Algorithms”, IEEE International Conference on Quantum Computing and Engineering, IEEE, 2024. [Sch][Link]
  • [QCE-24] Priyabrata Senapati, Samuel Yen-Chi Chen, Bo Fang, Tushar Athawale, Ang Li, Weiwen Jiang, Cheng Chang Lu, and Qiang Guan. “PQML: Enabling the Predictive Reproducibility on NISQ Machines for Quantum ML Applications”, IEEE International Conference on Quantum Computing and Engineering, IEEE, 2024. [Sch][Link]
  • [QCE-24] Shuwen Kan, Zefan Du, Miguel Palma, Samuel Stein, Chenxu Liu, Wenqi Wei, Juntao Chen, Ang Li, and Ying Mao. “Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System”, IEEE International Conference on Quantum Computing and Engineering, IEEE, 2024. [Sch][Link]
  • [TQC-24] James Ang, Gabriella Carini, Yanzhu Chen, Isaac Chuang, Michael Austin DeMarco, Sophia E. Economou, Alec Eickbusch, Andrei Faraon, Kai-Mei Fu, Steven M. Girvin, Andrew Houck, Paul Hilaire, Kevin Krsulich, Ang Li, Chenxu Liu, Yuan Liu, Margaret Martonosi, David C. McKay, James Misewich, Mark Ritter, Robert J. Schoelkopf, Samuel A. Stein, Sara Sussman, Hong X. Tang, Wei Tang, Teague Tomesh, Norm M. Tubman, Chen Wang, Nathan Wiebe, Yong-Xin Yao, Dillon C. Yost, and Yiyu Zhou. “ARQUIN: Architectures for Multinode Superconducting Quantum Computers”, ACM Transactions on Quantum Computing, ACM, 2024. DOI:10.1145/3674151 [Link][arXiv][Sch]
  • [FGCS-24] Many authors (incl. Ang Li). “Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions”, Future Generation Computer Systems, Elsevier, 2024. DOI:10.1016/j.future.2024.04.060 [Link][arXiv][Sch]
  • [arXiv-24] Yue Shi, Chenxu Liu, Samuel Stein, Meng Wang, Muqing Zheng, and Ang Li. “Design of an Entanglement Purification Protocol Selection Module”. arXiv:2405.02555. 2024. [arXiv][Sch]
  • [arXiv-24] Ang Li, Chenxu Liu, Samuel Stein, In-Saeng Suh, Muqing Zheng, Meng Wang, Yue Shi, Bo Fang, Martin Roetteler, and Travis Humble. “TANQ-Sim: Tensorcore Accelerated Noisy Quantum System Simulation via QIR on Perlmutter HPC”. arXiv:2404.13184. 2024. [arXiv][Sch]
  • [EPJA-24] Ang Li, Alessandro Baroni, Ionel Stetcu, and Travis S. Humble. “Deep Quantum Circuit Simulations of Low-energy Nuclear States”, The European Physical Journal A, 2024. DOI:10.1140/epja/s10050-024-01286-7 [arXiv][Link][Sch]
  • [arXiv-24] Chenxu Liu, Samuel Stein, Muqing Zheng, James Ang, and Ang Li. “AQM: A Refresh of the Abstract Qubit Model for Quantum Codesign”. arXiv:2403.11329. 2024. [arXiv][Sch]
  • [PES-24] Muqing Zheng, Yousu Chen, Xiu Yang, and Ang Li. “Early Exploration of a Flexible Framework for Efficient Quantum Linear Solvers in Power Systems”, IEEE Power and Energy Society General Meeting, IEEE, 2024. [arXiv][Sch][Link]
  • [TQE-24] Ryan L'Abbate, Anthony D'Onofrio Jr., Samuel Stein, Samuel Yen-Chi Chen, Ang Li, Pin-Yu Chen, Juntao Chen, and Ying Mao. “A Quantum-Classical Collaborative Training Architecture Based on Quantum State Fidelity”, IEEE Transactions on Quantum Engineering, IEEE, 2024. DOI:10.1109/TQE.2024.3367234 [Link][Sch]
  • [ICASSP-24] Jinyang Li, Ang Li, and Weiwen Jiang. “Quapprox: A Framework for Benchmarking the Approximability of Variational Quantum Circuit”, IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2024. DOI:10.1109/ICASSP48485.2024.10447919 [Link][Sch]
  • [ASPLOS-24] Meng Wang, Bo Fang, Ang Li, and Prashant Nair. “Red-QAOA: Efficient Variational Optimization through Circuit Reduction”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024. DOI:10.1145/3620665.3640363 [Link][Sch]
  • [arXiv-23] Charles Guinn, Samuel Stein, Esin Tureci, Guus Avis, Chenxu Liu, Stefan Krastanov, Andrew A. Houck, Ang Li. “Co-Designed Superconducting Architecture for Lattice Surgery of Surface Codes with Quantum Interface Routing Card”, arXiv:2312.01246. 2023. [Sch][arXiv]
  • [arXiv-23] Samuel Stein, Fei Hua, Chenxu Liu, Charles Guinn, James Ang, Eddy Zhang, Srivatsan Chakram, Yufei Ding, Ang Li. “Multi-mode Cavity Centric Architectures for Quantum Simulation”, arXiv:2309.15994. 2023. [Sch][arXiv]
  • [arXiv-23] Chenxu Liu, Meng Wang, Samuel Stein, Yufei Ding, Ang Li. “Quantum Memory: A Missing Piece in Quantum Computing Units”, arXiv:2309.14432. 2023. [arXiv][Sch]
  • [arXiv-23] Fei Hua, Meng Wang, Gushu Li, Bo Peng, Chenxu Liu, Muqing Zheng, Samuel Stein, Yufei Ding, Eddy Z. Zhang, Travis S. Humble, Ang Li. “QASMTrans: A QASM based Quantum Transpiler Framework for NISQ Devices”, Proceedings of the SC'23 Workshops of the International Conference on High Performance Computing, Networking, Storage, and Analysis. ACM, 2023. [Sch][arXiv][Link]
  • [MICRO-23] Samuel Stein, Sara Sussman, Teague Tomesh, Charles Guinn, Esin Tureci, Sophia Fuhui Lin, Wei Tang, James Ang, Srivatsan Chakram, Ang Li, Margaret Martonosi, Fred T. Chong, Andrew A. Houck, Isaac L. Chuang, Michael Austin DeMarco. “Microarchitectures for Heterogeneous Superconducting Quantum Computers”, ACM, 2023. DOI:10.1145/3613424.3614300 [Link][Sch][arXiv]
  • [MICRO-23] Anbang Wu, Yufei Ding, Ang Li. “QuComm: Optimizing Collective Communication for Distributed Quantum Computing”, ACM, 2023. DOI:10.1145/3613424.3614253 [Link][arXiv][Sch]
  • [BigData-23] Anthony D’Onofrio Jr., Amir Hossain, Lesther Santana, Naseem Machlovi, Samuel Stein, Jinwei Liu, Ang Li, Ying Mao. “Distributed Quantum Learning with co-Management in a Multi-tenant Quantum System”, IEEE, 2023. DOI:10.1109/BigData59044.2023.10386676 [Link][arXiv][Sch]
  • [QCE-23] Jinyang Li, Zhepeng Wang, Zhirui Hu, Prasanna Date, Ang Li, Weiwen Jiang. “A Novel Spatial-Temporal Variational Quantum Circuit to Enable Deep Learning on NISQ Devices”, IEEE, 2023. DOI:10.1109/QCE57702.2023.00038 [Link][Sch]
  • [arXiv-23] Anbang Wu, Keyi Yin, Andrew Cross, Ang Li, Yufei Ding. “Enabling Full-Stack Quantum Computing with Changeable Error-Corrected Qubits”, arXiv:2305.07072. 2023. [Sch][arXiv]
  • [PRR-23] Muqing Zheng, Bo Peng, Nathan Wiebe, Ang Li, Xiu Yang, Karol Kowalski. “Quantum Algorithms for Generator Coordinate Methods”, American Physical Society, 2023. DOI:10.1103/PhysRevResearch.5.023200 [Link][Sch][arXiv]
  • [ISCA-23] Samuel Stein, Nathan Wiebe, Yufei Ding, James Ang, Ang Li. “Q-BEEP: Quantum Bayesian Error Mitigation Employing Poisson Modeling over the Hamming Spectrum for Quantum Error Mitigation”, ACM, 2023. DOI:10.1145/3579371.3589043 [Link][Sch][arXiv]
  • [HPCA-23] Yanhao Chen, Yuwei Jin, Fei Hua, Ari Hayes, Ang Li, Yunong Shi, Eddy Z. Zhang. “A Pulse Generation Framework with Augmented Program-aware Basis Gates and Criticality Analysis”, IEEE, 2023. DOI:10.1109/HPCA56546.2023.10070990 [Link][Sch]
  • [SEC-22] Samuel Stein, Ying Mao, James Ang, and Ang Li. “QuCNN: A Quantum Convolutional Neural Network with Entanglement based Backpropagation”, IEEE/ACM Symposium on Edge Computing. IEEE, 2022. DOI:10.1109/SEC54971.2022.00054 [arXiv] [Link] [Sch]
  • [TQC-22] Muqing Zheng, Ang Li, Tamás Terlaky, Xiu Yang. “A Bayesian Approach for Characterizing and Mitigating Gate and Measurement Errors”, ACM Transactions on Quantum Computing. ACM, 2022. DOI:10.1145/3563397 [Link] [arXiv] [Sch]
  • [arXiv-22] Fei Hua, Yuwei Jin, Ang Li, Yanhao Chen, Chi Zhang, Ari Hayes, Hang Gao, Eddy Z. Zhang. “A Synergistic Compilation Workflow for Tracking Crosstalk in Quantum Machines”. arXiv:2207.05751. 2022. [arXiv] [Sch]
  • [TQC-22] Ang Li, Samuel Stein, Sriram Krishnamoorthy, James Ang. “QASMBench: A Low-level QASM Benchmark Suite for NISQ Evaluation and Simulation”, ACM Transactions on Quantum Computing. ACM, 2022. DOI:10.1145/3550488 [Link] [arXiv] [GitHub] [Sch]
  • [Cluster-22] Bo Fang, M. Yusuf Ozkaya, Ang Li, Umit Catalyurek, Sriram Krishnamoorthy. “Efficient Hierarchical State Vector Simulation of Quantum Circuits via Acyclic Graph Partitioning”, IEEE Cluster. IEEE, 2022. DOI:10.1109/CLUSTER51413.2022.00041 [arXiv] [Link] [Sch] Best Paper Award!
  • [ISCA-22] Samuel Stein, Nathan Wiebe, Yufei Ding, Bo Peng, Karol Kowalski, Nathan Baker, James Ang, and Ang Li. “EQC: Ensembled Quantum Computing for Variational Quantum Algorithms”, International Symposium on Computer Architecture. ACM, 2022. DOI:10.1145/3470496.3527434 [arXiv] [Link] [Sch] Nominated for Best Paper Award!
  • [MLSys-22] Samuel Stein, Betis Baheri, Daniel Chen, Ying Mao, Qiang Guan, Shuai Xu, Caiwen Ding, and Ang Li. “QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity”, Fifth Conference on Machine Learning and Systems. 2022. [Link][arXiv][Sch]
  • [QCE-21] Samuel A. Stein, Betis Baheri, Ray Marie Tischio, Ying Mao, Qiang Guan, Ang Li, Bo Fang, and Shuai Xu. “QuGAN: A Generative Adversarial Network Through Quantum States”, IEEE International Conference on Quantum Computing and Engineering, IEEE, 2021. DOI:10.1109/QCE52317.2021.00023 [Link][arXiv][Sch]
  • [SC-21] Ang Li, Bo Fang, Christopher Granade, Guen Prawiroatmodjo, Bettina Heim, Martin Roetteler, and Sriram Krishnamoorthy. “SV-Sim: Scalable PGAS-based State Vector Simulation of Quantum Circuits”, International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2021. DOI:10.1145/3458817.3476169 [Link][GitHub][Sch]
  • [SC-20] Ang Li, Omer Subasi, Xiu Yang, and Sriram Krishnamoorthy. “Density Matrix Quantum Circuit Simulation via the BSP Machine on Modern GPU Clusters”, International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE, 2020. DOI:10.1109/SC41405.2020.00017 [Link][GitHub][Sch] Nominated for Best Paper Award!

Dynamical System:
  • [MICRO-25] Chuan Liu, Chunshu Wu, Ruibing Song, Ying Nian Wu, Yousu Chen, Ang Li, and Tong Geng. “DS-TIDE: Harnessing Dynamical Systems for Efficient Time-Independent Differential Equation Solving”, IEEE/ACM International Symposium on Microarchitecture. IEEE, 2025. [Sch]
  • [ICML-25] Chuan Liu, Chunshu Wu, Ruibing Song, Ang Li, Ying Nian Wu, and Tong Geng. “An Expressive and Self-Adaptive Dynamical System for Efficient Equation Learning”, International Conference on Machine Learning. 2025. [Sch][Link]
  • [ISCA-25] Chunshu Wu, Ruibing Song, Chuan Liu, Pouya Haghi, Ang Li, Michael Huang, and Tong Geng. “DS-TPU: Dynamical System for on-Device Lifelong Graph Learning with Nonlinear Node Interaction”, International Symposium on Computer Architecture. ACM, 2025. DOI:10.1145/3695053.3731091 [Sch][Link]
  • [ICLR-25] Ruibing Song, Chuan Liu, Chunshu Wu, Ang Li, Dongfang Liu, Ying Nian Wu, Tong Geng. “DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models”, International Conference on Learning Representations. 2025. [Sch][Link]
  • [ASP-DAC-25] Chuan Liu, Chunshu Wu, Ruibing Song, Yousu Chen, Ang Li, Michael Huang, and Tong Geng. “Nature-GL: A Revolutionary Learning Paradigm Unleashing Nature's Power in Real-World Spatial-Temporal Graph Learning”, Asia and South Pacific Design Automation Conference. ACM, 2025. DOI:10.1145/3658617.3703142 [Sch][Link]
  • [LOG-24] Chunshu Wu, Ruibing Song, Chuan Liu, Yuqing Wang, Yousu Chen, Ang Li, Dongfang Liu, Ying Nian Wu, Michael Huang, Tong Geng. “NP-NDS: A Nature-Powered Nonlinear Dynamical System for Power Grid Forecasting”, Learning on Graphs Conference, 2024. [Link][Sch]
  • [ISCA-24] Ruibing Song, Chunshu Wu, Chuan Liu, Ang Li, Michael Huang, and Tong Geng. “DS-GL: Advancing Graph Learning via Harnessing the Power of Nature within Dynamic Systems”, IEEE/ACM International Symposium on Computer Architecture, IEEE, 2024. DOI:10.1109/ISCA59077.2024.00014 [Link][Sch]
  • [ICLR-24] Chunshu Wu, Ruibing Song, Chuan Liu, Yunan Yang, Ang Li, Michael Huang, and Tong Geng. “NP-GL: Extending Power of Nature from Binary to Real-World Graph Learning”, International Conference on Learning Representations, ICLR, 2024. [Link][Sch]
  • [DAC-23] Zhuo Liu, Yunan Yang, Zhenyu Pan, Anshujit Sharma, Amit Hasan, Caiwen Ding, Ang Li, Michael Huang, Tong Geng. “Ising-CF: A Pathbreaking Collaborative Filtering Method Through Efficient Ising Machine Learning”, IEEE, 2023. DOI:10.1109/DAC56929.2023.10247860 [Link][Sch]
  • [DAC-23] Yixuan Luo, Cheng Tan, Nicolas Bohm Agostini, Ang Li, Antonino Tumeo, Nirav Dave, Tong Geng. “ML-CGRA: An Integrated Compilation Framework to Enable Efficient Machine Learning Acceleration on CGRAs”, IEEE, 2023. DOI:10.1109/DAC56929.2023.10247873 [Link][Sch]
  • [AAAI-23] Zhenyu Pan, Anshujit Sharma, Jerry Yao-Chieh Hu, Zhuo Liu, Ang Li, Han Liu, Michael Huang, Tong Geng. “Ising-Traffic: An Ising-based Framework for Traffic Congestion Prediction with Uncertainty”, AAAI Press, 2023. DOI:10.1609/aaai.v37i8.26121 [Link][Sch]

AI/ML:
  • [DAC-25] Pouya Haghi, Ali Falahati, Zahra Azad, Chunshu Wu, Ruibing Song, Chuan Liu, Ang Li, and Tong Geng. “DM-Tune: Quantizing Diffusion Models with Mixture-of-Gaussian Guided Noise Tuning”, Design Automation Conference. IEEE, 2025. [Sch]
  • [ICLR-25] Chuan Liu, Chunshu Wu, Shihui Cao, Mingkai Chen, James Chenhao Liang, Ang Li, Michael Huang, Chuang Ren, Ying Nian Wu, Dongfang Liu, Tong Geng. “Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models”, International Conference on Learning Representations. 2025. [arXiv][Sch][Link]
  • [PPoPP-25] Yuhang Liang, Bo Fang, Xinyi Li, Jie Ren, Ang Li, and Jieyang Chen. “ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training”, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 2025. DOI:10.1145/3710848.3710870 [Sch][Link]
  • [EPSR-24] Qiuhua Huang, Renke Huang, Tianzhixi Yin, Sohom Datta, Xueqing Sun, Jason Hou, Jie Tan, Wenhao Yu, Yuan Liu, Xinya Li, Bruce Palmer, Ang Li, Xinda Ke, Marianna Vaiman, Song Wang, Yousu Chen. “Towards Intelligent Emergency Control for Large-scale Power Systems: Convergence of Learning, Physics, Computing and Control”, Electric Power Systems Research, 235 (2024). DOI:10.1016/j.epsr.2024.110648 [Link][Sch]
  • [arXiv-24] Mingkai Chen, Taowen Wang, James Chenhao Liang, Chuan Liu, Chunshu Wu, Qifan Wang, Ying Nian Wu, Michael Huang, Chuang Ren, Ang Li, Tong Geng, Dongfang Liu. “Inertial Confinement Fusion Forecasting via LLMs”. arXiv:2407.11098. 2024. [arXiv][Sch]
  • [AISY-24] Yanfei Li, Juejing Liu, Xiaodong Zhao, Wenjun Liu, Tong Geng, Ang Li, and Xin Zhang. “Accurate and Data-Efficient Micro-XRD Phase Identification Using Multi-Task Learning: Application to Hydrothermal Fluids”, Advanced Intelligent Systems, Wiley, 2024. DOI:10.1002/aisy.202400204 [Link][arXiv][Sch]
  • [ATC-24] Zheng Wang, Yuke Wang, Boyuan Feng, Guyue Huang, Dheevatsa Mudigere, Bharath Muthiah, Ang Li, and Yufei Ding. “OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model”, USENIX Annual Technical Conference, USENIX, 2024. [Sch][Link]
  • [ICPE-24] Hongwu Peng, Caiwen Ding, Tong Geng, Sutanay Choudhury, Kevin Barker, and Ang Li. “Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs”, 15th ACM/SPEC International Conference on Performance Engineering, ACM, 2024. DOI:10.1145/3629527.3651428 [Link][Sch]
  • [JCIM-24] Hatem Helal, Jesun Firoz, Jenna Bilbrey, Henry Sprueill, Kristina Herman, Mario Krell, Tom Murray, Manuel Roldan, Mike Kraus, Ang Li, Payel Das, Sotiris Xantheas, and Sutanay Choudhury. “Acceleration of Graph Neural Network-based Prediction Models in Chemistry via Co-design Optimization on Intelligence Processing Units”, Journal of Chemical Information and Modeling, ACS, 2024. DOI:10.1021/acs.jcim.3c01312 [Link][Sch]
  • [JPCC-23] Xiaodong Zhao, Yixuan Luo, Juejing Liu, Wenjun Liu, Kevin Rosso, Xiaofeng Guo, Tong Geng, Ang Li, Xin Zhang. “Machine Learning Automated Analysis of Enormous Synchrotron X-Ray Diffraction Datasets”, American Chemical Society, 2023. DOI:10.1021/acs.jpcc.3c03572 [Link][Sch]
  • [ICCV-23] Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tong Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding. “AutoReP: Automatic ReLU Replacement for Fast Private Network Inference”, IEEE/CVF, 2023. [Link][Sch]
  • [TCC-22] Ying Mao, Vaishali Sharma, Wenjia Zheng, Qiang Guan, Long Cheng, and Ang Li. “Elastic Resource Management for Deep Learning Applications in a Container Cluster”, IEEE Transactions on Cloud Computing. IEEE, 2022. DOI:10.1109/TCC.2022.3194128 [Link] [Sch]
  • [arXiv-22] Yanfei Li, Ang Li, Huimin Yu. “Searching Similarity Measure for Binarized Neural Networks”, arXiv:2206.03325. 2022. [arXiv] [Sch]
  • [TPWRS-22] Renke Huang, Yujiao Chen, Tianzhixi Yin, Qiuhua Huang, Jie Tan, Wenhao Yu, Xinya Li, Ang Li, Yan Du. “Learning and Fast Adaptation for Grid Emergency Control via Deep Meta Reinforcement Learning”, IEEE Transactions on Power Systems. IEEE, 2022. DOI:10.1109/TPWRS.2022.3155117 [Link][arXiv][Sch]
  • [MLSys-22] Cheng Wan, Youjie Li, Ang Li, Nam Sung Kim, and Yingyan Lin. “BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling”, Fifth Conference on Machine Learning and Systems. 2022. [Link][arXiv][Sch]
  • [TST-22] Yanfei Li, Tong Geng, Ang Li, and Huimin Yu. “GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm”, Journal of Tsinghua Science and Technology. IEEE, 2022. DOI:10.26599/TST.2021.9010084 [arXiv][Link][Sch]
  • [MICPRO-21] Yanfei Li, Tong Geng, Ang Li, and Huimin Yu. “BCNN: Binary Complex Neural Network”, Microprocessors and Microsystems, Elsevier, 2021. DOI:10.1016/j.micpro.2021.104359 [Link][Sch]
  • [ICCAD-21] Daniel Manu, Yi Sheng, Junhuan Yang, Jieren Deng, Tong Geng, Ang Li, Caiwen Ding, Weiwen Jiang, and Lei Yang. “FL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery”, IEEE/ACM International Conference on Computer-Aided Design, IEEE, 2021. DOI:10.1109/ICCAD51958.2021.9643440 [Link][Sch]
  • [TPWRS-21] Renke Huang, Yujiao Chen, Tianzhixi Yin, Xinya Li, Ang Li, Jie Tan, Wenhao Yu, Yuan Liu, and Qiuhua Huang. “Accelerated Derivative-free Deep Reinforcement Learning for Large-scale Grid Emergency Voltage Control”, IEEE Transactions on Power Systems, IEEE, 2021. DOI:10.1109/TPWRS.2021.3095179 [Link][Sch]

VR/AR:
  • [ICLR-25] Hexu Zhao, Haoyang Weng, Daohan Lu, Ang Li, Jinyang Li, Aurojit Panda, Saining Xie. “On Scaling Up 3D Gaussian Splatting Training”, International Conference on Learning Representations. 2025. [arXiv][Sch][Link]
  • [HPCA-19] Chenhao Xie, Xingyao Zhang, Ang Li, Xin Fu, and Shuaiwen Leon Song. “PIM-VR: Erasing Motion Anomalies in Highly-Interactive Virtual Reality World with Customized Memory Cube”, IEEE International Symposium on High-Performance Computer Architecture, IEEE, 2019. DOI:10.1109/HPCA.2019.00013 [Link][Sch]


Architecture:
  • [MICRO-24] Pouya Haghi, Chunshu Wu, Zahra Azad, Yanfei Li, Andrew Gui, Yunchen Hao, Ang Li, and Tong Geng. “Bridging the Gap between LLMs and LNS with Dynamic Data Format and Architecture Codesign”, IEEE/ACM International Symposium on Microarchitecture, IEEE/ACM, 2024. [Sch][Link]
  • [ICS-24] Pouya Haghi, Cheng Tan, Anqi Guo, Chunshu Wu, Dongfang Liu, Ang Li, Anthony Skjellum, Tong Geng, and Martin Herbordt. “SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications”, International Conference on Supercomputing, ACM, 2024. DOI:10.1145/3650200.3656616 [Link][Sch]
  • [TC-24] Chunshu Wu, Chen Yang, Sahan Bandara, Tong Geng, Anqi Guo, Pouya Haghi, Ang Li, and Martin Herbordt. “FPGA-Accelerated Range-Limited Molecular Dynamics”, IEEE Transactions on Computers, IEEE, 2024. DOI:10.1109/TC.2024.3375613 [Link][Sch]
  • [SC-23] Chunshu Wu, Tong Geng, Anqi Guo, Sahan Bandara, Pouya Haghi, Chuan Liu, Ang Li, Martin Herbordt. “FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics”, ACM, 2023. DOI:10.1145/3581784.3607100 [Link][Sch]
  • [FPL-22] Anqi Guo, Tong Geng, Yongan Zhang, Pouya Haghi, Chunshu Wu, Cheng Tan, Yingyan Lin, Ang Li, Martin Herbordt. “A Framework for Neural Network Inference on FPGA-Centric SmartNICs”, International Conference on Field Programmable Logic and Applications. IEEE, 2022. DOI:10.1109/FPL57034.2022.00071 [Link] [Sch]
  • [FPL-22] Chengming Zhang, Tong Geng, Anqi Guo, Martin Herbordt, Ang Li, Dingwen Tao. “H-GCN: A Graph Convolutional Network Accelerator on Xilinx Versal AI Engines”, International Conference on Field Programmable Logic and Applications. IEEE, 2022. DOI:10.1109/FPL57034.2022.00040 [Link] [arXiv] [Sch]
  • [ICS-22] Cheng Tan, Thierry Tambe, Jeff Zhang, Bo Fang, Tong Geng, Gu-Yeon Wei, David Brooks, Antonino Tumeo, Ganesh Gopalakrishnan, Ang Li. “ASAP - Automatic Synthesis of Area-Efficient and Precision-Aware CGRA”, International Conference on Supercomputing. ACM, 2022. DOI:10.1145/3524059.3532359 [Link] [Sch]
  • [ICS-22] Chengming Zhang, Sian Jin, Tong Geng, Jiannan Tian, Ang Li, Dingwen Tao. “CEAZ: Accelerating Parallel I/O Via Hardware-Algorithm Co-Designed Adaptive Lossy Compression”, International Conference on Supercomputing. ACM, 2022. DOI:10.1145/3524059.3532362 [Link] [arXiv] [Sch]
  • [DAC-22] Hongwu Peng, Shaoyi Huang, Shiyang Chen, Bingbing Li, Tong Geng, Ang Li, Weiwen Jiang, Wujie Wen, Jinbo Bi, Hang Liu, Caiwen Ding. “A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining”, Design Automation Conference. ACM, 2022. DOI:10.1145/3489517.3530585 [Link][Sch]
  • [HPCA-22] Cheng Tan, Nicolas Bohm Agostini, Tong Geng, Chenhao Xie, Jiajia Li, Ang Li, Kevin Barker, Antonino Tumeo. “DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs”, 28th IEEE International Symposium on High-Performance Computer Architecture. IEEE, 2022. DOI:10.1109/HPCA53966.2022.00030 [Link][Sch]
  • [HPCA-22] Haoran You, Tong Geng, Yongan Zhang, Ang Li, Yingyan Lin. “GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design”, 28th IEEE International Symposium on High-Performance Computer Architecture. IEEE, 2022. DOI:10.1109/HPCA53966.2022.00041 [Link][arXiv][Sch]
  • [ICCD-21] Cheng Tan, Tong Geng, Chenhao Xie, Nicolas Bohm Agostini, Jiajia Li, Ang Li, Kevin Barker, and Antonino Tumeo. “DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications”, IEEE International Conference on Computer Design, IEEE, 2021. DOI:10.1109/ICCD53106.2021.00018 [Link][Sch] Best Paper Award!
  • [HPEC-21] Tong Geng, Chunshu Wu, Cheng Tan, Chenhao Xie, Anqi Guo, Pouya Haghi, Sarah Yuan He, Jiajia Li, Martin Herbordt, Ang Li. “A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs”, IEEE High Performance Extreme Computing Conference, IEEE, 2021. DOI:10.1109/HPEC49654.2021.9622877 [Link][Sch]
  • [MICRO-21] Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin Herbordt, Yingyan Lin, and Ang Li. “I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization”, IEEE/ACM International Symposium on Microarchitecture, ACM, 2021. DOI:10.1145/3466752.3480113 [Link][Sch]
  • [ICCAD-21] Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, and Yingyan Lin. “G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency”, IEEE/ACM International Conference on Computer-Aided Design, IEEE, 2021. DOI:10.1109/ICCAD51958.2021.9643549 [Link][Sch]
  • [ICCAD-21] Hongwu Peng, Shiyang Chen, Zhepeng Wang, Junhuan Yang, Scott Weitze, Tong Geng, Ang Li, Jinbo Bi, Minghu Song, Weiwen Jiang, Hang Liu, and Caiwen Ding. “Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search”, IEEE/ACM International Conference on Computer-Aided Design, IEEE, 2021. DOI:10.1109/ICCAD51958.2021.9643528 [Link][Sch]
  • [ASAP-21] Hongwu Peng, Shanglin Zhou, Scott Weitze, Jiaxin Li, Sahidul Islam, Tong Geng, Ang Li, Wei Zhang, Minghu Song, Mimi Xie, Hang Liu, and Caiwen Ding. “Binary Complex Neural Network Acceleration on FPGA”, IEEE International Conference on Application-specific Systems, Architectures, and Processors, IEEE, 2021. DOI:10.1109/ASAP52443.2021.00021 [Link][Sch]
  • [ASAP-21] Cheng Tan, Nicolas Bohm Agostini, Jeff Zhang, Marco Minutoli, Vito Giovanni Castellana, Chenhao Xie, Tong Geng, Ang Li, Kevin J. Barker, and Antonino Tumeo. “OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays”, IEEE International Conference on Application-specific Systems, Architectures, and Processors, IEEE, 2021. DOI:10.1109/ASAP52443.2021.00029 [Link][GitHub][Sch]
  • [TPDS-21] Cheng Tan, Chenhao Xie, Andres Marquez, Antonino Tumeo, Kevin Barker, and Ang Li. “ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing”, IEEE Transactions on Parallel and Distributed Systems, IEEE, 2021. DOI:10.1109/TPDS.2021.3081074 [Link][Sch]
  • [DATE-21] Cheng Tan, Chenhao Xie, Ang Li, Kevin Barker, and Antonino Tumeo. “AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators”, Design, Automation & Test in Europe Conference, IEEE, 2021. DOI:10.23919/DATE51398.2021.9473955 [Link][Sch]
  • [ICCD-20] Cheng Tan, Chenhao Xie, Ang Li, Kevin Barker, and Antonino Tumeo. “OpenCGRA: An Open-Source Framework for Modeling, Testing, Evaluating CGRAs”, IEEE International Conference on Computer Design, IEEE, 2020. DOI:10.1109/ICCD50377.2020.00070 [Link][GitHub][Sch]
  • [HPEC-20] Tong Geng, Chunshu Wu, Cheng Tan, Bo Fang, Ang Li, and Martin Herbordt. “CQNN: a CGRA-based QNN Framework”, IEEE High Performance Extreme Computing Conference, IEEE, 2020. DOI:10.1109/HPEC43674.2020.9286194 [Link][Sch]
  • [TPDS-20] Tong Geng, Ang Li, Tianqi Wang, Chunshu Wu, Yanfei Li, Runbin Shi, Wei Wu, and Martin Herbordt. “O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference”, IEEE Transactions on Parallel and Distributed Systems, IEEE, 2020. DOI:10.1109/TPDS.2020.3013637 [Link][Sch]
  • [MICRO-20] Tong Geng, Ang Li, Runbin Shi, Tianqi Wang, Yanfei Li, Pouya Haghi, Antonino Tumeo, Shuai Che, Steve Reinhardt, and Martin Herbordt. “AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing”, IEEE/ACM International Symposium on Microarchitecture, IEEE/ACM, 2020. DOI:10.1109/MICRO50266.2020.00079 [Link][arXiv][Sch]
  • [TC-20] Tianqi Wang, Tong Geng, Ang Li, Xi Jin, and Martin Herbordt. “FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters”, IEEE Transactions on Computers, IEEE, 2020. DOI:10.1109/TC.2020.3000118 [Link][arXiv][Sch]
  • [ICS-20] Runbin Shi, Peiyan Dong, Tong Geng, Yuhao Ding, Hayden So, Martin Herbordt, Ang Li, and Yanzhi Wang. “CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks”, International Conference on Supercomputing, ACM, 2020. DOI:10.1145/3392717.3392749 [Link][Sch]
  • [ICS-19] Tong Geng, Tianqi Wang, Chunshu Wu, Chen Yang, Wei Wu, Ang Li, and Martin Herbordt. “O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning”, International Conference on Supercomputing, ACM, 2019. DOI:10.1145/3330345.3330386 [Link][Sch]
  • [ASAP-19] Tong Geng, Tianqi Wang, Chunshu Wu, Chen Yang, Shuaiwen Leon Song, Ang Li, and Martin Herbordt. “LP-BNN: Ultra-Low-Latency BNN Inference with Layer Parallelism”, IEEE International Conference on Application-specific Systems, Architectures, and Processors, IEEE, 2019. DOI:10.1109/ASAP.2019.00-43 [Link][Sch]
  • [ASICON-17] Wenfeng Zhao, Ang Li, Yi Wang, and Yajun Ha. “Analysis and Design of Energy-Efficient Data-Dependent SRAM”, 12th International Conference on ASIC, IEEE, 2017. DOI:10.1109/ASICON.2017.8252625 [Link][Sch]
  • [ASPDAC-15] Mohammad Shihabul Haque, Ang Li, Akash Kumar, and Qingsong Wei. “Accelerating non-volatile/hybrid processor cache design space exploration for application specific embedded systems”, 20th Asia and South Pacific Design Automation Conference, IEEE, 2015. DOI:10.1109/ASPDAC.2015.7059045 [Link][Sch]


HPC :
  • [FLD-25] Zirui Mao, Shenyang Hu, and Ang Li. “A GPU Accelerated Mixed-Precision Finite Difference Informed Random Walker (FDiRW) Solver for Strongly Inhomogeneous Diffusion Problems”, International Journal for Numerical Methods in Fluids. Wiley, 2025. DOI:10.1002/fld.5394 [Sch][Link]
  • [EABE-24] Zirui Mao, Xinyi Li, Shenyang Hu, Ganesh Gopalakrishnan, and Ang Li. “A GPU Accelerated Mixed-precision Smoothed Particle Hydrodynamics Framework with Cell-based Relative Coordinates”, Engineering Analysis with Boundary Elements, Elsevier, 2024. DOI:10.1016/j.enganabound.2024.01.020 [Link][Sch]
  • [iEnergy-22] Yousu Chen, Zhenyu Huang, Shuangshuang Jin, Ang Li. “Power System Computing: Then, Now, and the Future”, IEEE iEnergy Journal. IEEE, 2022. DOI:10.23919/IEN.2022.0037 [Link] [Sch]
  • [IISWC-20] Jiajia Li, Mahesh Lakshminarasimhan, Xiaolong Wu, Ang Li, Catherine Olschanowsky, and Kevin Barker. “A Sparse Tensor Benchmark Suite for CPUs and GPUs”, IEEE International Symposium on Workload Characterization, IEEE, 2020. DOI:10.1109/IISWC50251.2020.00027 [Link][arXiv][GitLab][Sch]
  • [Springer] Jiajia Li, Yuchen Ma, Xiaolong Wu, Ang Li, and Kevin Barker. “PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite”, CCF Transactions on High Performance Computing, Springer, 2019. DOI:10.1007/s42514-019-00012-w [Link][arXiv][Sch]
  • [SC-17] Ang Li, Weifeng Liu, Xu Liu, Mads R.B. Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez, and Shuaiwen Leon Song. “Exploring And Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels”, International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2017. DOI:10.1145/3126908.3126931 [Link][Sch] (Nominated for Best Paper Award!)
  • [CCPE] Weifeng Liu, Ang Li, Jonathan Hogg, Iain Duff, and Brian Vinter. “Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides”, Concurrency and Computation: Practice and Experience, Wiley, 2017. DOI:10.1002/cpe.4244 [Link][Sch]
  • [IPDPS-16] Ang Li, Shuaiwen Leon Song, Eric Brugel, Akash Kumar, Daniel Chavarria-Miranda, and Henk Corporaal. “X: A Comprehensive Analytic Model for Parallel Machines”, 30th IEEE International Parallel & Distributed Processing Symposium, IEEE, 2016. DOI:10.1109/IPDPS.2016.89 [Link][Sch]
  • [ISIC-14] Qiang Wu, Yajun Ha, Akash Kumar, Shaobo Luo, Ang Li, and Shihab Mohamed. “A Heterogeneous Platform with GPU and FPGA for Power Efficient High Performance Computing”, 14th International Symposium on Integrated Circuits, IEEE, 2014. DOI:10.1109/ISICIR.2014.7029447 [Link][Sch]

GPU :
  • [ATC-25] Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Ang Li, and Yufei Ding. “GMI-DRL: Empowering Multi-GPU DRL with Adaptive-Grained Parallelism”, USENIX Annual Technical Conference. USENIX, 2025. [Link][Sch]
  • [PPoPP-25] Jou-An Chen, Hsin-Hsuan Sung, Ang Li, and Xipeng Shen. “Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering”, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 2025. DOI:10.1145/3710848.3710881 [Sch] [Link]
  • [Cluster-24] Bo Fang, Xinyi Li, Harvey Dam, Cheng Tan, Siva Kumar Sastry Hari, Timothy Tsai, Ignacio Laguna, Dingwen Tao, Ganesh Gopalakrishnan, Prashant Nair, Kevin Barker, and Ang Li. “Understanding Mixed Precision GEMM with MPGemmFI: Insights into Fault Resilience”, IEEE International Conference on Cluster Computing, IEEE, 2024. DOI:10.1109/CLUSTER59578.2024.00022 [Link][Sch]
  • [IEEE Access-24] Wei Sun, Ang Li, Sander Stuijk, and Henk Corporaal. “How Much Can We Gain from Tensor Kernel Fusion on GPUs?”, IEEE Access, IEEE, 2024. DOI:10.1109/ACCESS.2024.3411473 [Link][Sch]
  • [CCGrid-24] Xinyi Li, Ang Li, Bo Fang, Ignacio Laguna, and Ganesh Gopalakrishnan. “A Testing-Guided Approach to Characterize NVIDIA and AMD Matrix Accelerator Numerics”, 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, IEEE, 2024. DOI:10.1109/CCGrid59990.2024.00014 [Link][Sch]
  • [ASPLOS-24] Zheng Wang, Yuke Wang, Jiaqi Deng, Ang Li, and Yufei Ding. “RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing”, ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, 2024. DOI:10.1145/3620665.3640406 [Link][Sch]
  • [ICS-23] Jou-An Chen, Hsin-Hsuan Sung, Xipeng Shen, Sutanay Choudhury, Ang Li. “BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs”, International Conference on Supercomputing. ACM, 2023. DOI:10.1145/3577193.3593725 [Link][Sch]
  • [OSDI-23] Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, Yufei Ding. “Accelerating Graph Neural Networks with Fine-grained Intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms”, USENIX Symposium on Operating Systems Design and Implementation. USENIX, 2023. [Link][Sch]
  • [HPDC-23] Xinyi Li, Ignacio Laguna, Bo Fang, Katarzyna Swirydowicz, Ang Li, Ganesh Gopalakrishnan. “Practical GPU Floating-Point Exception Detection, Diagnosis and Repair”, International Symposium on High-Performance Parallel and Distributed Computing. ACM, 2023. DOI:10.1145/3588195.3592991 [Link][Sch]
  • [JPDC-23] Jou-An Chen, Hsin-Hsuan Sung, Xipeng Shen, Nathan Tallent, Kevin Barker, Ang Li. “Accelerating Matrix-Centric Graph Processing on GPU through Bit-Level Optimizations”, Journal of Parallel and Distributed Computing. Elsevier, 2023. DOI:10.1016/j.jpdc.2023.02.013 [Link][Sch]
  • [arXiv-22] Jieyang Chen, Chenhao Xie, Jesun Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li. “MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems”. arXiv:2209.07552. 2022. [arXiv] [Sch]
  • [TPDS-22] Wei Sun, Ang Li, Tong Geng, Sander Stuijk, Henk Corporaal. “Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numerical Behaviors”, IEEE Transactions on Parallel and Distributed Systems. IEEE, 2022. DOI:10.1109/TPDS.2022.3217824 [arXiv] [Link] [Sch]
  • [IPDPS-22] Jou-An Chen, Hsin-Hsuan Sung, Nathan Tallent, Kevin Barker, Xipeng Shen, Ang Li. “Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU”, 36th IEEE International Parallel & Distributed Processing Symposium. IEEE, 2022. DOI:10.1109/IPDPS53621.2022.00056 [arXiv][Link][Sch]
  • [Correctness-21] Ganesh Gopalakrishnan, Ignacio Laguna, Ang Li, Pavel Panchekha, Cindy Rubio-Gonzalez, and Zachary Tatlock. “Guarding Numerics Amidst Rising Heterogeneity”, 5th IEEE/ACM International Workshop on Software Correctness for HPC Applications, IEEE, 2021. DOI:10.1109/Correctness54621.2021.00007 [Link][Sch]
  • [SC-21] Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, and Yufei Ding. “APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores”, International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2021. DOI:10.1145/3458817.3476157 [Link][Sch]
  • [ICPP-21] Chenhao Xie, Jieyang Chen, Jesun Firoz, Jiajia Li, Shuaiwen Song, Kevin Barker, Mark Raugas, and Ang Li. “Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures”, International Conference on Parallel Processing, ACM, 2021. DOI:10.1145/3472456.3472478 [Link][arXiv][Sch]
  • [TPDS-20] Ang Li and Simon Su. “Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs”, IEEE Transactions on Parallel and Distributed Systems, IEEE, 2020. DOI:10.1109/TPDS.2020.3045828 [Link][arXiv][GitHub][Sch]
  • [HPEC-20] Jesun S Firoz, Ang Li, Jiajia Li, and Kevin Barker. “On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics”, IEEE High Performance Extreme Computing Conference, IEEE, 2020. DOI:10.1109/HPEC43674.2020.9286152 [Link][Sch]
  • [ICPP-20] Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge. “Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines”, International Conference on Parallel Processing, ACM, 2020. DOI:10.1145/3404397.3404435 [Link][GitHub][Sch]
  • [CCGrid-20] Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge. “Indicator-Directed Dynamic Power Management for Iterative Workloads on GPU-Accelerated Systems”, IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, IEEE, 2020. DOI:10.1109/CCGrid49817.2020.00-37 [Link][Sch]
  • [TPDS-19] Ang Li, Shuaiwen Leon Song, Jieyang Chen, Jiajia Li, Xu Liu, Nathan Tallent, and Kevin Barker. “Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect”, IEEE Transactions on Parallel and Distributed Systems, IEEE, 2019. DOI:10.1109/TPDS.2019.2928289 [Link][arXiv][Sch][GitHub]
  • [SC-19] Ang Li, Tong Geng, Tianqi Wang, Martin Herbordt, Shuaiwen Leon Song, and Kevin Barker. “BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets”, International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2019. DOI:10.1145/3295500.3356169 [Link][Sch][GitHub]
  • [IISWC-19] Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge. “Fingerprinting Anomalous Computation with RNN for GPU-accelerated HPC Machines”, IEEE International Symposium on Workload Characterization, IEEE, 2019. DOI:10.1109/IISWC47752.2019.9042165 [Link][Sch]
  • [IISWC-18] Ang Li, Shuaiwen Leon Song, Jieyang Chen, Xu Liu, Nathan Tallent, and Kevin Barker. “Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite”, IEEE International Symposium on Workload Characterization, IEEE, 2018. DOI:10.1109/IISWC.2018.8573483 [Link][Sch][GitHub][Supplementary File] (Nominated for Best Paper Award!)
  • [ICS-18] Ang Li, Weifeng Liu, Linnan Wang, Kevin Barker, and Shuaiwen Leon Song. “Warp-Consolidation: A Novel Execution Model for GPUs”, International Conference on Supercomputing, ACM, 2018. DOI:10.1145/3205289.3205294 [Link][Sch]
  • [CGO-18] Du Shen, Ang Li, Shuaiwen Leon Song, and Xu Liu. “CUDAAdvisor: LLVM-based Runtime Profiling for Modern GPUs”, International Symposium on Code Generation and Optimization, ACM, 2018. DOI:10.1145/3168831 [Link][Sch][GitHub]
  • [PPoPP-18] Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, and Tim Kraska. “SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks”, ACM Symposium on Principles and Practice of Parallel Programming, ACM, 2018. DOI:10.1145/3178487.3178491 [Link][Sch][GitHub]
  • [MICRO-17] Ang Li, Wenfeng Zhao, and Shuaiwen Leon Song. “BVF: Enabling Significant On-Chip Power Savings via Bit-Value-Favor for Throughput Processors”, 50th Annual IEEE/ACM International Symposium on Microarchitecture, ACM, 2017. DOI:10.1145/3123939.3123944 [Link][Sch]
  • [ASPLOS-17] Ang Li, Shuaiwen Leon Song, Weifeng Liu, Xu Liu, Akash Kumar, and Henk Corporaal. “Locality-Aware CTA Clustering for Modern GPUs”, 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, 2017. DOI:10.1145/3093337.3037709 [Link][Sch]
  • [PhD Thesis] Ang Li. “GPU Performance Modeling and Optimization”, PhD Thesis, TU/e and NUS, 2016. ISBN:978-90-386-4155-3 [Link][Sch]
  • [EuroPar-16] Weifeng Liu, Ang Li, Jonathan Hogg, Iain Duff, and Brian Vinter. “A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves”, 22nd International European Conference on Parallel and Distributed Computing, Springer, 2016. DOI:10.1007/978-3-319-43659-3_45 [Link][GitHub][Sch]
  • [ICS-16] Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar, and Henk Corporaal. “SFU-Driven Transparent Approximation Acceleration on GPUs”, 27th International Conference on Supercomputing, ACM, 2016. DOI:10.1145/2925426.2926255 [Link][Sch]
  • [DATE-16] Ang Li, Shuaiwen Leon Song, Akash Kumar, Eddy Z. Zhang, Daniel Chavarria, and Henk Corporaal. “Critical Points Based Register-Concurrency Autotuning for GPUs”, Design, Automation & Test in Europe Conference, IEEE, 2016. DOI:10.1109/DATE.2016.7459506 [Link][Sch]
  • [SC-15] Ang Li, Gert-Jan Van Den Braak, Akash Kumar, and Henk Corporaal. “Adaptive and Transparent Cache Bypassing on GPUs”, International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2015. DOI:10.1145/2807591.2807606 [Link][Sch][Supplementary File] (Nominated for Best Paper Award and Best Student Paper Award!)
  • [DSD-15] Runbin Shi, Zheng Xu, Zhihao Sun, Maurice Peemen, Ang Li, Henk Corporaal, Di Wu. “A Locality Aware Convolutional Neural Networks Accelerator”, 18th Euromicro Conference on Digital System Design, IEEE, 2015. DOI:10.1109/DSD.2015.70 [Link][Sch]
  • [HPDC-15] Ang Li, Akash Kumar, Y.C. Tay, and Henk Corporaal. “Transit: A Visual Analytical Model for Multithreaded Machines”, 24th International Symposium on High-Performance Parallel and Distributed Computing, ACM, 2015. DOI:10.1145/2749246.2749265 [Link][Sch]
  • [ICS-15] Ang Li, Gert-Jan Van Den Braak, Akash Kumar, and Henk Corporaal. “Fine-Grained Synchronizations and Dataflow Programming on GPUs”, 26th International Conference on Supercomputing, ACM, 2015. DOI:10.1145/2751205.2751232 [Link][Sch]
  • [MICPRO-15] Ang Li, Akash Kumar, Yajun Ha, and Henk Corporaal. “Correlation Ratio Based Volume Image Registration on GPUs”, Microprocessors and Microsystems, vol. 39, no. 8, pp. 998–1011, Elsevier, 2015. DOI:10.1016/j.micpro.2015.04.002 [Link][Sch]
  • [DSD-14] Ang Li and Akash Kumar. “Accelerating Volume Image Registration through Correlation Ratio based Methods on GPUs”, 17th Euromicro Conference on Digital System Design, IEEE, 2014. DOI:10.1109/DSD.2014.29 [Link][Sch]