Title: [seminar] Diffusion Models for Content Creation and Perception
The generative AI revolution is transforming numerous fields, including visual content creation. Ten years ago, models in the visual domain could generate only unrealistic black-and-white human faces; today, state-of-the-art text-to-image models such as Stable Diffusion generate diverse, realistic images within a few seconds. We are witnessing a similar revolution in video generation. The primary tool behind these models is the Diffusion Model, a class of generative models that exhibits better sample quality and data-distribution coverage than its alternatives. In this talk, I will introduce my recent work on content creation with Diffusion Models in the 3D, video, and 4D domains. To generate realistic content, generative models must understand the properties of the data distribution they are modeling. Building on this observation, I will conclude by discussing how the representations learned by Diffusion Models can be extracted and used for perception tasks.
Seung Wook Kim is a senior research scientist at the NVIDIA Toronto AI Lab. He recently obtained his PhD from the University of Toronto, where he was advised by Prof. Sanja Fidler. His research focuses on content creation with generative models and on exploring their internal representations and capabilities. For more details, see his website: https://seung-kim.github.io/seungkim/