This is the public course website for *Generative AI and Foundation Models*, Spring 2024.

- On Tuesday, 05/28, 5:00–6:35 pm, I will give a talk entitled "The Reasoning Ability of LLMs That Will Change the World" in the Mathematics department's Gauss Colloquium. Those of you who are interested are encouraged to attend.
- No lecture on 06/07.
- We will start taking in-person attendance for regular lectures on 03/15. We will not take attendance for make-up lectures. Note that attendance is a significant part of the course grade.

- Diffusion 0: Neural ODE and Continuous-Depth Flow Models.
- Diffusion 1: Reverse-Time SDE, code.
- Diffusion 2: Training via Score Matching.
- Diffusion 3: Discrete Time Diffusion Models.
- Diffusion 4: Conditional Generation I.
- Diffusion 5: Text-Guided Diffusion Models.
- NLP Basics.
- Large Language Models.
- Backpropagation and Hardware-Aware AI.
- State-Space Models.
- Vision Language Models.

- Homework 1, Due 04/15, 5pm.
- Homework 2, Due 05/10, 5pm.
- Homework 3, Due 06/21, 5pm.

Continuous-depth neural networks:

- Neural ODE (a minimal integration sketch follows this list)
- Continuous-depth flow models
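
To give a flavor of this first topic, here is a minimal sketch of a neural ODE in PyTorch (our illustration, not course-provided code): a small network parameterizes a velocity field f(x, t), and a fixed-step Euler loop stands in for the adaptive solvers discussed in lecture.

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Velocity field f(x, t) parameterized by a small MLP."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, x, t):
        # Append the scalar time t as an extra input feature to every sample.
        tcol = torch.full((x.shape[0], 1), t)
        return self.net(torch.cat([x, tcol], dim=1))

def odeint_euler(f, x0, t0=0.0, t1=1.0, steps=100):
    """Fixed-step Euler integration of dx/dt = f(x, t) from t0 to t1."""
    x, dt = x0, (t1 - t0) / steps
    for i in range(steps):
        x = x + dt * f(x, t0 + i * dt)
    return x

f = ODEFunc()
x1 = odeint_euler(f, torch.randn(16, 2))  # terminal state of the learned flow
```

Continuous-depth flow models additionally track the change of log-density along the same trajectory via the instantaneous change-of-variables formula.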

Diffusion models:

- Stochastic differential equations and Itô calculus
- Fokker–Planck equation and reverse-time diffusion
- Diffusion generative models via stochastic differential equations
- Score matching and discrete-time diffusion models (see the sketch after this list)
- Conditional generation
- Latent diffusion model
- DALL·E 2, Imagen, Stable Diffusion
- Consistency trajectory models
- Note: Mathematical rigor will not be a priority for this course. For diffusion models, however, we will carry out calculations and derivations with stochastic differential equations.
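
As a preview of the kind of calculation we will do: for a forward SDE $dX_t = f(X_t, t)\,dt + g(t)\,dW_t$, the reverse-time SDE is

$$dX_t = \big[f(X_t, t) - g(t)^2 \nabla_x \log p_t(X_t)\big]\,dt + g(t)\,d\bar{W}_t,$$

so generation reduces to learning the score $\nabla_x \log p_t$. The sketch below shows denoising score matching on 2-D toy data (PyTorch; the network and the fixed noise level are illustrative choices, not the course's reference implementation).

```python
import torch
import torch.nn as nn

score_net = nn.Sequential(  # toy score model s_theta(x, sigma) on 2-D data
    nn.Linear(3, 128), nn.SiLU(), nn.Linear(128, 2)
)

def dsm_loss(x0, sigma):
    """Denoising score matching at a single noise level sigma."""
    eps = torch.randn_like(x0)
    x = x0 + sigma * eps                 # perturb the data with Gaussian noise
    inp = torch.cat([x, torch.full((x.shape[0], 1), sigma)], dim=1)
    score = score_net(inp)               # estimate of grad_x log p_sigma(x)
    # The score of the Gaussian perturbation kernel is -(x - x0)/sigma^2 =
    # -eps/sigma; weighting by sigma^2 gives the noise-prediction objective.
    return ((sigma * score + eps) ** 2).sum(dim=1).mean()

x0 = torch.randn(256, 2)                 # stand-in for training data
loss = dsm_loss(x0, sigma=0.5)
loss.backward()
```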

Natural language processing background:

- Sequence models and text preprocessing
- Recurrent neural networks, GRU, and LSTM
- Bidirectional RNN
- Encoder-decoder architecture and machine translation
- Bahdanau attention
- Multi-head attention and transformers (a scaled dot-product attention sketch follows this list)
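
Since attention is the workhorse of the second half of the course, here is a minimal sketch of scaled dot-product attention in PyTorch (the shapes are our illustrative choice):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    q, k, v: (batch, heads, seq, d_k); mask: broadcastable boolean tensor."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 8, 16, 64)        # 8 heads, sequence length 16
out = scaled_dot_product_attention(q, k, v)  # (1, 8, 16, 64)
```

Multi-head attention runs this in parallel across the head dimension and mixes the heads with a final linear layer.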

Large language models:

- Instruction fine-tuning
- Reinforcement learning from human feedback
- BERT, T5, GPT
- Scaling laws
- In-context learning
- Chain of thought prompting
- Codex
- Parameter-efficient fine-tuning (LoRA; see the sketch after this list)
- Hardware-aware models: FlashAttention, QLoRA
- Training data curation and small language models: phi-X
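
As a taste of parameter-efficient fine-tuning, here is a minimal sketch of the LoRA idea in PyTorch: the pretrained weight is frozen and only a low-rank update BA is trained. The rank and scaling below are illustrative defaults, not the official implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained weight W plus a trainable low-rank update:
    h = W x + (alpha / r) * B A x, with A (r x d_in) and B (d_out x r)."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512))
y = layer(torch.randn(4, 512))
```

Because B is initialized to zero, the wrapped layer initially computes exactly the pretrained layer's output; only the small A and B matrices receive gradients.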

Self-supervised learning:

Vision language models:

- Vision transformer
- CLIP (a contrastive-loss sketch follows this list)
- BLIP, Flamingo, LLaVA
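
For CLIP, the key object is the symmetric contrastive (InfoNCE) loss over a batch of matched image-text pairs. The sketch below assumes the image and text encoders have already produced feature vectors; the temperature value is illustrative.

```python
import torch
import torch.nn.functional as F

def clip_loss(img_feat, txt_feat, temperature=0.07):
    """Symmetric InfoNCE: matched image-text pairs (the diagonal of the
    similarity matrix) are positives; all other pairs are negatives."""
    img = F.normalize(img_feat, dim=-1)
    txt = F.normalize(txt_feat, dim=-1)
    logits = img @ txt.T / temperature      # (N, N) cosine similarities
    labels = torch.arange(len(logits))
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

loss = clip_loss(torch.randn(32, 512), torch.randn(32, 512))
```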

State-space models:

- Background: orthogonal polynomials, matrix exponential, linear state-space models
- Continuous-time recurrent memory unit
- Background: fast Fourier transform, Woodbury matrix identity
- Structured state-space model (see the discretization sketch after this list)
- Background: prefix sum, kernel fusion
- Mamba
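
As a preview of this material, here is a minimal sketch of a discretized linear SSM in PyTorch: zero-order-hold discretization of dx/dt = Ax + Bu, y = Cx, followed by the naive O(L) sequential recurrence. The FFT and parallel-scan accelerations are exactly what the "Background" items above address; the matrices and dimensions below are illustrative.

```python
import torch

def ssm_scan(A_bar, B_bar, C, u):
    """Run x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k over a length-L
    scalar input sequence u (naive sequential scan)."""
    x = torch.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar * u_k
        ys.append(C @ x)
    return torch.stack(ys)

n, dt = 16, 0.1
A = -torch.eye(n)                        # illustrative stable state matrix
B, C = torch.ones(n), torch.randn(n)
A_bar = torch.matrix_exp(dt * A)         # zero-order-hold: A_bar = exp(dt A)
# Zero-order-hold input matrix: B_bar = A^{-1} (A_bar - I) B.
B_bar = torch.linalg.solve(A, (A_bar - torch.eye(n)) @ B.unsqueeze(1)).squeeze(1)
y = ssm_scan(A_bar, B_bar, C, torch.randn(100))  # (100,) output sequence
```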

Course material will be posted on this website. eTL will be used for announcements, homework submission, and the release of homework and exam scores.

Instructor: Ernest K. Ryu, 27-205.

Lectures: Fridays 9:00–11:59am, 43-101.

This class will have an in-person final exam.

- Final exam: Wednesday, 06/26, 9:00am–1:00pm, location TBD.

Attendance 30%, homework 30%, final exam 40%.

Good knowledge of the following subjects is required.

- Basics of deep neural network architectures at the level of ResNet.
- Basics of deep neural network training at the level of SGD, Adam, and BatchNorm.
- Basic ODEs: Initial value problem.
- Probability theory at the level of conditional expectations and multi-variate Gaussians. Prior exposure to SDEs is not a prerequisite.