Machine learning applications based on large language models (LLMs) have taken the world by storm and are now widely deployed in numerous consumer-facing products. The success of LLMs has been driven by scaling up both model capacity and training datasets, with the largest trained model size growing by over 1,000x over the past five years. However, a critical challenge in sustainably growing model capacity is its increasingly demanding computational requirements. At the same time, the widespread deployment of LLMs is raising serious concerns about protecting the privacy of the user data used to train them. In this talk, I will present some of our ongoing work on deploying LLMs in a fast, scalable, and secure manner.
About the Speaker
Minsoo Rhu is an Associate Professor in the School of Electrical Engineering at KAIST, conducting research at the intersection of computer architecture and artificial intelligence (AI). Apart from his career in academia, he spent several years in industry as a Research Scientist at NVIDIA and Meta. He is an IEEE Senior Member, a member of the Hall of Fame at both MICRO (IEEE/ACM International Symposium on Microarchitecture) and HPCA (IEEE International Symposium on High-Performance Computer Architecture), and a recipient of the Google Research Scholar Award and the Facebook Faculty Research Award. He has published over 40 peer-reviewed papers in prestigious venues in the areas of computer systems and VLSI/ASIC circuit design.