신웅 (Woong Shin)
Oak Ridge National Laboratory
Ever since the pursuit of exascale HPC began, projected power and energy consumption has been considered one of the major challenges. Due to the physical limits of silicon technology, exascale would require an infeasible amount of power and energy to invest in and operate, and we had work to do. Fast forward to 2022: now that we have Frontier, the first exascale supercomputer, on the floor, what’s next? Are we done? What does energy efficiency mean, and how do we support it in this new post-exascale era? In this talk, I will first decompose energy efficiency and its realization across the different lifecycle stages and layers of a supercomputer. I will then discuss the status quo, highlighting the current challenges, gaps, and opportunities in achieving HPC energy efficiency. In particular, motivated by the potential shift of focus that we will see in HPC and backed by current efforts at OLCF, we will discuss the role of a user facility in this area and lay out high-level action items, challenges, and opportunities, starting with increasing our awareness of our energy consumption and its impact.
Woong Shin (Ph.D.) is an HPC systems engineer and researcher in the Analytics & AI Methods at Scale (AAIMS) Group at Oak Ridge National Laboratory (ORNL). Since joining ORNL in 2017, he has been involved in R&D and engineering activities to develop operational data analytics and AI/ML-powered systems for Oak Ridge Leadership Computing Facility (OLCF) systems such as Summit and Frontier, providing near-term and long-term insights with a focus on HPC energy efficiency. His current research interests lie in HPC data center monitoring and analytics, HPC system efficiency and reliability, HPC application power profiling and analysis, large-scale data engineering, machine learning workflows, and AI/ML application and system design.