HPC and AI/ML: A Synergistic Relationship
- Speaker: Abhinav Bhatele
- Venue: SPCL_Bcast, recorded on 2nd March, 2023
Abstract
The rapid increase in memory capacity and computational power of modern architectures, especially accelerators, in large data centers and supercomputers has led to a frenzy in training extremely large deep neural networks. However, efficient use of large parallel resources for extreme-scale deep learning requires scalable algorithms coupled with high-performing implementations on such machines. In this talk, Abhinav will first present AxoNN, a parallel deep learning framework that exploits asynchrony and message-driven execution to optimize work scheduling and communication, which are often critical bottlenecks in achieving high performance. Abhinav will also discuss how neural network properties can be exploited for different systems-focused optimizations. Conversely, recent advances in machine learning are driving scientific discovery across many disciplines, including computer systems and high-performance computing. AI/ML can be used to explore the vast quantities of system monitoring data being collected on HPC systems. Abhinav will present a few examples of using data-driven ML models for performance modeling, forecasting, and code generation to highlight how the fields of HPC and AI/ML are coming together and can help each other.
See https://spcl.inf.ethz.ch/Bcast/ for more talks.