Thursday, December 7, 2017

           "The restless world outside and a restless mood cannot extinguish the burning heart you once had."


Dr. Ang Li has been a senior computer scientist in the high-performance computing (HPC) group at Pacific Northwest National Laboratory (PNNL) since November 2016. He received his bachelor's degree from the CS department of Zhejiang University, China, in 2010, and two PhD degrees, from the Electrical and Computer Engineering (ECE) department of the National University of Singapore (NUS), Singapore, and from the Electrical Engineering (EE) department of Eindhoven University of Technology (TU/e), The Netherlands, in 2016. Since 2009, his research has focused on software-hardware co-design for scalable heterogeneous HPC, particularly GPUs. It covers full-stack design, from the circuit level up to architecture, system, library, and applications. He has published in major HPC conferences and journals, including SC, ICS, PPoPP, IPDPS, HPDC, ASPLOS, MICRO, HPCA, ICPP, CGO, IISWC, EuroPar, TPDS, TC, and ICPE. His lead-author work was nominated for the best paper award at SC-15, SC-17, IISWC-18, and SC-20. He received the European HiPEAC paper award and PNNL's PCSD Outstanding Performance award. He has served as an organizing committee member, PC/ERC member, or session chair for major HPC conferences, including PPoPP, SC, ASPLOS, ICS, PACT, ISCA, and IPDPS. He previously worked in industry as an HPC application developer, where he led the evaluation, development, and optimization of several industrial HPC applications. He also worked as a research intern at the INRIA lab at Paris-Sud University, France, and at the Chinese University of Hong Kong.

His research interests include:
  • Software-Hardware Co-design for HPC accelerators, particularly GPUs, and domain-specific accelerators
  • Performance Modeling and Evaluation for HPC Architecture and Applications
  • Scalable Quantum Circuit Simulation, Transformation and Verification
  • Binarized Neural Network

Service (PC/ERC/Session Chair)

  • 2021: SC, PPoPP, ISCA, MICRO, IPDPS, Cluster, LCTES (with PLDI), TPDS-SS
  • 2020: PPoPP, ISCA, SPAD-BAC, HPCC, TPDS-SS, SC-MLHPC
  • 2019: PPoPP, PACT, NPC, RTSS-AE
  • 2018: PPoPP-AE, NPC, ASPLOS-SRC, ICS, IPDPS
  • Journal Review: TPDS, TOPC, CSUR, DB, JPDC, JSA, CAL, TACO, TNNLS, JCSC, TCAS, MICPRO, FGCS, COSE, CVIU

Publications


2021:
  • [arXiv] "A Bayesian Approach for Characterizing and Mitigating Gate and Measurement Errors", Muqing Zheng, Ang Li, Tamás Terlaky, Xiu Yang [arXiv] (under review)
  • [DATE-21] “AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators”, Cheng Tan, Chenhao Xie, Ang Li, Kevin Barker, Antonino Tumeo, The 2021 Design, Automation & Test in Europe Conference, Grenoble, France. February 1-5, 2021.
2020:
  • [arXiv] "Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures", Chenhao Xie, Jieyang Chen, Jesun S Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li [arXiv]
  • [arXiv] "A Hybrid System for Learning Classical Data in Quantum States", Samuel A. Stein, Betis Baheri, Ray Marie Tischio, Yiwen Chen, Ying Mao, Qiang Guan, Ang Li, Bo Fang [arXiv] (under review)
  • [arXiv] "ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing", Cheng Tan, Chenhao Xie, Andres Marquez, Antonino Tumeo, Kevin Barker, Ang Li [arXiv] (under review)
  • [arXiv] "QuGAN: A Generative Adversarial Network Through Quantum States", Samuel A. Stein, Betis Baheri, Ray Marie Tischio, Ying Mao, Qiang Guan, Ang Li, Bo Fang, Shuai Xu [arXiv]
  • [TPDS] "Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs", Ang Li and Simon Su, IEEE Transactions on Parallel and Distributed Systems, Special Section on Parallel and Distributed Computing Techniques for AI/ML/DL (To Appear) [arXiv][GitHub][ppt]
  • [SC-20 workshop] "QASMBench: An OpenQASM Benchmark Suite for NISQ Evaluation and Simulation", Ang Li, Bo Fang, and Sriram Krishnamoorthy, First International Workshop on Quantum Computing Software (as part of SC-20) (To Appear)
  • [IISWC-20] "A Sparse Tensor Benchmark Suite for CPUs and GPUs", Jiajia Li, Mahesh Lakshminarasimhan, Xiaolong Wu, Ang Li, Catherine Olschanowsky, and Kevin Barker, 2020 IEEE International Symposium on Workload Characterization, Beijing, China, Oct 27-29, 2020 (To Appear) [arXiv][GitLab].
  • [ICCD-20] "OpenCGRA: An Open-Source Framework for Modeling, Testing, Evaluating CGRAs", Cheng Tan, Chenhao Xie, Ang Li, Kevin Barker, Antonino Tumeo, The 38th IEEE International Conference on Computer Design, Hartford, Connecticut, USA. Oct 18-21, 2020 [pdf][GitHub]
  • [HPEC-20] "CQNN: a CGRA-based QNN Framework", Tong Geng, Chunshu Wu, Cheng Tan, Bo Fang, Ang Li, Martin Herbordt, 2020 IEEE High Performance Extreme Computing Conference, Waltham, MA, USA. Sep 22-24, 2020 [pdf]
  • [HPEC-20] "On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics", Jesun S Firoz, Ang Li, Jiajia Li, Kevin Barker, 2020 IEEE High Performance Extreme Computing Conference, Waltham, MA, USA. Sep 22-24, 2020 [pdf]
  • [TPDS] "O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference", Tong Geng, Ang Li, Tianqi Wang, Chunshu Wu, Yanfei Li, Runbin Shi, Wei Wu, and Martin Herbordt, IEEE Transactions on Parallel and Distributed Systems, Volume 32, Issue 1, Aug 3, 2020 [Link]
  • [MICRO-20] "AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing", Tong Geng, Ang Li, Runbin Shi, Tianqi Wang, Yanfei Li, Pouya Haghi, Antonino Tumeo, Shuai Che, Steve Reinhardt, and Martin Herbordt, 53rd IEEE/ACM International Symposium on Microarchitecture, Athens, Greece, Oct 17-21. [arXiv][pdf]
  • [SC-20] "Density Matrix Quantum Circuit Simulation via the BSP Machine on Modern GPU Clusters", Ang Li, Omer Subasi, Xiu Yang, and Sriram Krishnamoorthy, The 2020 International Conference for High Performance Computing, Networking, Storage and Analysis, Atlanta, GA, USA. Nov 15-20, 2020 [pdf] [GitHub]
                 Nominated for Best Paper Award!
  • [arXiv] "QASMBench: A Low-level QASM Benchmark Suite for NISQ Evaluation and Simulation", Ang Li and Sriram Krishnamoorthy [arXiv][Github].
  • [ICPP-20] "Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines", Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge, International Conference on Parallel Processing, Aug 17-20, Edmonton, AB, Canada, 2020 [pdf][GitHub][ppt]
  • [TC] "FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters", Tianqi Wang, Tong Geng, Ang Li, Xi Jin, and Martin Herbordt, IEEE Transactions on Computers, Volume 69, Issue 8, pp1143-1158, May, 2020 [arXiv][IEEE]
  • [ICS-20] "CSB-RNN: A Super Real-time RNN Framework with Compressed Structured Block", Runbin Shi, Peiyan Dong, Tong Geng, Yuhao Ding, Hayden So, Martin Herbordt, Ang Li, and Yanzhi Wang, The 31st International Conference on SuperComputing, Barcelona, Spain. June 29-July 2, 2020 [pdf].
  • [CCGrid-20] "Indicator-Directed Dynamic Power Management for Iterative Workloads on GPU-Accelerated Systems", Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge, The 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, Melbourne, Australia. May 11-14, 2020 [pdf].
  • [PPoPP-20-Poster] "A Parallel Sparse Tensor Benchmark Suite on CPUs and GPUs", Jiajia Li, Mahesh Lakshminarasimhan, Xiaolong Wu, Ang Li, Cathie Olschanowsky, and Kevin Barker, The 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, San Diego, CA, USA. Feb 22-26, 2020. [GitLab]
2019:
  • [TPDS] "Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect", Ang Li, Shuaiwen Leon Song, Jieyang Chen, Jiajia Li, Xu Liu, Nathan Tallent, and Kevin Barker, IEEE Transactions on Parallel and Distributed Systems, Volume 31, Issue 1 [arXiv][Link][GitHub]
  • [SC-19-Poster] "Fingerprinting Anomalous Computation with RNN for GPGPU-Based HPC Machines", Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge, The 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO, USA. Nov 17-22, 2019.
                 ACM student research competition (SRC) 3rd place winner!
  • [SC-19] "BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets", Ang Li, Tong Geng, Tianqi Wang, Martin Herbordt, Shuaiwen Leon Song, Kevin Barker, The 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO, USA. Nov 17-22, 2019 [pdf] [GitHub] [ppt].
  • [IISWC-19] "Fingerprinting Anomalous Computation with RNN for GPGPU-Based HPC Machines", Pengfei Zou, Ang Li, Kevin Barker, and Rong Ge. 2019 IEEE International Symposium on Workload Characterization, Orlando, FL, USA, Nov 3-Nov 5, 2019 [pdf]
  • [Springer] "PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite", Jiajia Li, Yuchen Ma, Xiaolong Wu, Ang Li, Kevin Barker, CCF Transactions on High Performance Computing [arXiv][Link]
  • [ASAP-19] "LP-BNN: Ultra-Low-Latency BNN Inference with Layer Parallelism", Tong Geng, Tianqi Wang, Chunshu Wu, Chen Yang, Shuaiwen Leon Song, Ang Li, and Martin Herbordt. The 30th IEEE International Conference on Application-specific Systems, Architectures, and Processors, New York, USA, Jul 15-17, 2019 [pdf]
  • [ICS-19] "O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning", Tong Geng, Tianqi Wang, Chunshu Wu, Chen Yang, Wei Wu, Ang Li, and Martin Herbordt. The 30th International Conference on SuperComputing, Phoenix, AZ, USA, Jun 26-28, 2019 [pdf]
  • [HPCA-19] "PIM-VR: Erasing Motion Anomalies In Highly-Interactive Virtual Reality World With Customized Memory Cube", Chenhao Xie, Xingyao Zhang, Ang Li, Xin Fu, and Shuaiwen Leon Song. The 25th IEEE International Symposium on High-Performance Computer Architecture, Washington D.C., USA, Feb 16-20, 2019 [pdf]
2018:
  • [SC-18-Poster] "Binarized ImageNet Inference in 29us", Tong Geng, Ang Li, Tianqi Wang, Shuaiwen Leon Song, Martin Herbordt, The 2018 International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX, USA. Nov 11-16, 2018. 
  • [SC-18-Poster] "Energy Efficiency of Reconfigurable Caches on FPGAs", Tianqi Wang, Ang Li, Tong Geng, Martin Herbordt, The 2018 International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX, USA. Nov 11-16, 2018. 
  • [IISWC-18] "Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite", Ang Li, Shuaiwen Leon Song, Jieyang Chen, Xu Liu, Nathan Tallent, and Kevin Barker. 2018 IEEE International Symposium on Workload Characterization, Raleigh, NC, USA, Sep 30-Oct 2, 2018 [pdf][Supplementary File][Github][ppt].
             Nominated for Best Paper Award!
  • [ICS-18] "Warp-Consolidation: A Novel Execution Model for GPUs", Ang Li, Weifeng Liu, Linnan Wang, Kevin Barker, and Shuaiwen Leon Song. The 29th International Conference on SuperComputing, Beijing, China, Jun 12-15, 2018 [pdf][ppt].
  • [CGO-18] "CUDAAdvisor: LLVM-based Runtime Profiling for Modern GPUs", Du Shen, Ang Li, Shuaiwen Leon Song and Xu Liu, International Symposium on Code Generation and Optimization, Vienna, Austria. Feb 24-28, 2018. [pdf][Github]
  • [PPoPP-18] "SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks", Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, Tim Kraska, Principles and Practice of Parallel Programming, Vienna, Austria. Feb 24-28, 2018. [pdf][Github]
2017:
  • [MICRO-17] "BVF: Enabling Significant On-Chip Power Savings via Bit-Value-Favor for Throughput Processors", Ang Li, Wenfeng Zhao and Shuaiwen Leon Song, The 50th Annual IEEE/ACM International Symposium on Microarchitecture, Boston, MA, USA. Oct 14-18, 2017. [pdf][slides]
  • [SC-17] "Exploring And Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels", Ang Li, Weifeng Liu, Xu Liu, Mads R.B. Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez and Shuaiwen Leon Song, The 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO, USA. Nov 12-17, 2017. [pdf]
             Nominated for Best Paper Award!   
  • [CCPE] "Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides", Weifeng Liu, Ang Li, Jonathan Hogg, Iain Duff and Brian Vinter, Concurrency and Computation: Practice and Experience, Wiley. 
  • [ASPLOS-17] "Locality-Aware CTA Clustering for Modern GPUs", Ang Li, Shuaiwen Leon Song, Weifeng Liu, Xu Liu, Akash Kumar and Henk Corporaal, The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Xi'an, China. Apr 8-12, 2017. [pdf][ppt]
  • [ASICON-17] "Analysis and Design of Energy-Efficient Data-Dependent SRAM", Wenfeng Zhao, Ang Li, Yi Wang and Yajun Ha, IEEE 12th International Conference on ASIC, Guiyang, China, Oct 25-28, 2017. [pdf]
2016:
  • [PhD Thesis] GPU Performance Modeling and Optimization (Oct, 2016) [pdf][ppt]
  • [EuroPar-16] "A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves", Weifeng Liu, Ang Li, Jonathan Hogg, Iain Duff and Brian Vinter, The 22nd International European Conference on Parallel and Distributed Computing, Grenoble, France, Aug 22-26, 2016. [pdf][slides][GitHub]
  • [ICS-16] "SFU-Driven Transparent Approximation Acceleration on GPUs", Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar and Henk Corporaal, The 27th International Conference on Supercomputing, Istanbul, Turkey, June 1-3, 2016. [pdf][ppt]
  • [IPDPS-16] "X: A Comprehensive Analytic Model for Parallel Machines", Ang Li, Shuaiwen Leon Song, Eric Brugel, Akash Kumar, Daniel Chavarria-Miranda and Henk Corporaal, The 30th IEEE International Parallel & Distributed Processing Symposium, Chicago, Illinois, USA, May 23-27, 2016.  [pdf][ppt]
  • [DATE-16] “Critical Points Based Register-Concurrency Autotuning for GPUs”, Ang Li, Shuaiwen Leon Song, Akash Kumar, Eddy Z. Zhang, Daniel Chavarria and Henk Corporaal, The 2016 Design, Automation & Test in Europe Conference, Dresden, Germany. March 14-18, 2016. [pdf][slides]

2015:
  • [SC-15] “Adaptive and Transparent Cache Bypassing on GPUs”, Ang Li, Gert-Jan Van Den Braak, Akash Kumar and Henk Corporaal, 2015 International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, Texas, USA. November 16-20, 2015. [pdf][Supplementary File][ppt]
              Nominated for Best Paper Award and Best Student Paper Award!       
  • [DSD-15] “A Locality Aware Convolutional Neural Networks Accelerator”, Runbin Shi, Zheng Xu, Zhihao Sun, Maurice Peemen, Ang Li, Henk Corporaal, Di Wu, the 18th International Conference on Digital Systems Design, Funchal, Portugal. August 26-28, 2015. [pdf]*
  • [HPDC-15] “Transit: A Visual Analytical Model for Multithreaded Machine”, Ang Li, Akash Kumar, Y.C. Tay and Henk Corporaal, the 24th International Symposium on High-Performance Parallel and Distributed Computing, Portland, Oregon, USA. June 15-19, 2015. [pdf][ppt]
  • [ICS-15] “Fine-Grained Synchronizations and Dataflow Programming on GPUs”, Ang Li, Gert-Jan Van Den Braak, Akash Kumar and Henk Corporaal, The 26th International Conference on Supercomputing, Newport Beach, California, USA. June 8-11, 2015. [pdf][slides]
  • [MICPRO] "Correlation Ratio Based Volume Image Registration on GPUs", Ang Li, Akash Kumar, Yajun Ha and Henk Corporaal, Microprocessors and Microsystems Journal, vol. 39, no. 8, pp. 998-1011 (2015).
  • [ASPDAC-15] “Accelerating non-volatile/hybrid processor cache design space exploration for application specific embedded systems”, Mohammad Shihabul Haque, Ang Li, Akash Kumar, Qingsong Wei, The 20th Asia and South Pacific Design Automation Conference, Chiba, Japan. January 19-22, 2015. [pdf]
2014:

  • [DSD-14] “Accelerating Volume Image Registration through Correlation Ratio based Methods on GPUs”, Ang Li and Akash Kumar, the 17th International Conference on Digital Systems Design, Verona, Italy. August 27-29, 2014. [pdf]
  • [ISIC-14] “A Heterogeneous Platform with GPU and FPGA for Power Efficient High Performance Computing”, Qiang Wu, Yajun Ha, Akash Kumar, Shaobo Luo, Ang Li and Shihab Mohamed. The 14th International Symposium on Integrated Circuit, Singapore, December 10-12, 2014.


