Generative AI and Foundation Models

This is the public course website for Generative AI and Foundation Models, Spring 2024.

On Tuesday, 05/28, 5:00–6:35 pm, I will give a talk entitled "세상을 바꿀 LLM의 사고력" ("The Reasoning Ability of LLMs That Will Change the World") at the Mathematics department's Gauss Colloquium. Anyone interested is encouraged to attend.
There will be no lecture on 06/07.
We will start taking in-person attendance at regular lectures on 03/15. We will not take attendance at make-up lectures. Note that attendance is a significant part of the course grade.

Continuous-depth neural networks

Diffusion models:

Stochastic differential equations and Itô calculus
Fokker–Planck equation and reverse-time diffusion
Diffusion generative models via stochastic differential equations
Score matching and discrete-time diffusion models
Conditional generation
Latent diffusion model
DALL·E 2, Imagen, Stable Diffusion
Consistency trajectory models
Note: Mathematical rigor will not be a priority for this course. For diffusion models, however, we will carry out calculations and derivations with stochastic differential equations.
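
As a rough illustration of the kind of calculation this note refers to, the block below writes out the standard forward SDE, its Fokker–Planck equation, and the corresponding reverse-time SDE. This is a sketch in generic notation, not taken from the course notes: the symbols f (drift), g (diffusion coefficient), W_t (Brownian motion), and p_t (marginal density of X_t) are assumptions chosen for illustration.

```latex
% Forward (noising) SDE with drift f and scalar diffusion coefficient g
dX_t = f(X_t, t)\,dt + g(t)\,dW_t

% Fokker--Planck equation for the marginal density p_t of X_t
\partial_t p_t(x) = -\nabla_x \cdot \bigl( f(x, t)\, p_t(x) \bigr)
                    + \tfrac{1}{2}\, g(t)^2\, \Delta_x p_t(x)

% Reverse-time SDE, run from t = T down to t = 0,
% with \bar{W}_t a Brownian motion in reversed time
dX_t = \bigl[ f(X_t, t) - g(t)^2\, \nabla_x \log p_t(X_t) \bigr]\,dt
       + g(t)\,d\bar{W}_t
```

The score term \nabla_x \log p_t is the quantity that score matching estimates in the diffusion models listed above.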

Natural language processing background:

Large language models:

Self-supervised learning:

Vision language models:

State-space models:

Background: orthogonal polynomials, matrix exponential, linear state-space models
Continuous-time recurrent memory unit
Background: fast Fourier transform, Woodbury matrix identity
Structured state-space model
Background: prefix sum, kernel fusion
Mamba

Course material will be posted on this website. eTL will be used for announcements, homework submission, and receiving homework and exam scores.

Fridays 9:00–11:59am, 43-101.

This class will have an in-person final exam.

Attendance 30%, homework 30%, final exam 40%.

Good knowledge of the following subjects is required.

Basics of deep neural network architectures at the level of ResNet.
Basics of deep neural network training at the level of SGD, Adam, and BatchNorm.
Basic ODEs: initial value problems.
Probability theory at the level of conditional expectations and multivariate Gaussians. Prior exposure to SDEs is not a prerequisite.