This is the public course website for *Generative AI and Foundation Models*, Spring 2024.

- This course meets once a week; I will cover the recent advances in generative AI (image and text) that have shaken up the world.
- The first lecture will be on Friday 03.08, 9:00–11:59am, in 43-101.
- We will have a make-up lecture on Saturday 03.09, 9:00–11:59am, in 28-101.

- Homework 1: deadline TBD.

Continuous-depth neural networks:

- Neural ODE
- Continuous-depth flow models

Diffusion models:

- Stochastic differential equations and Itô calculus
- Fokker–Planck equation and reverse-time diffusion
- Diffusion generative models via stochastic differential equations
- Score matching and discrete-time diffusion models
- Conditional generation
- Latent diffusion models
- DALL·E 2, Imagen, Stable Diffusion
- Consistency trajectory models
- Note: Mathematical rigor will not be a priority in this course. For diffusion models, however, we will carry out calculations and derivations with stochastic differential equations.
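
As a preview of the kind of calculation we will carry out, diffusion generative models rest on a forward noising SDE and its reverse-time counterpart:

```latex
% Forward (noising) SDE:
%   dX_t = f(X_t, t)\, dt + g(t)\, dW_t
% Reverse-time SDE (Anderson, 1982), run backward in time:
%   dX_t = \bigl[ f(X_t, t) - g(t)^2 \, \nabla_x \log p_t(X_t) \bigr] dt + g(t)\, d\bar{W}_t
% where p_t is the marginal density of X_t and \bar{W}_t is a
% reverse-time Brownian motion. Learning the score \nabla_x \log p_t
% is what makes sampling via the reverse SDE possible.
```

We will derive this reverse-time SDE carefully in lecture; prior exposure to SDEs is not assumed.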

Natural language processing background:

- Sequence models and text preprocessing
- Recurrent neural networks, GRU, and LSTM
- Bidirectional RNN
- Encoder-decoder architecture and machine translation
- Bahdanau attention
- Multi-head attention and transformers
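
For orientation, the central formula of this unit is scaled dot-product attention, which multi-head attention and transformers build on:

```latex
% Scaled dot-product attention (Vaswani et al., 2017):
%   \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^\top}{\sqrt{d_k}} \right) V
% Multi-head attention applies this in parallel with learned projections:
%   \mathrm{head}_i = \mathrm{Attention}(Q W_i^Q,\; K W_i^K,\; V W_i^V)
%   \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^O
```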

Large language models:

- Instruction finetuning
- Reinforcement learning from human feedback
- BERT, T5, GPT
- Scaling laws
- In-context learning
- Chain of thought prompting
- Codex
- Parameter-efficient fine-tuning (LoRA)
- Hardware-aware models: FlashAttention, QLoRA
- Training data curation and small language models: phi-X

Self-supervised learning:

Vision language models:

- Vision transformer
- CLIP
- BLIP, Flamingo, LLaVA

State-space models:

- Background: orthogonal polynomials, matrix exponential, linear state-space models
- Continuous-time recurrent memory unit
- Background: fast Fourier transform, Woodbury matrix identity
- Structured state-space model
- Background: prefix sum, kernel fusion
- Mamba
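
As a preview, the structured state-space models of this unit start from a linear continuous-time system and its discretization:

```latex
% Continuous-time linear state-space model:
%   h'(t) = A h(t) + B x(t), \qquad y(t) = C h(t)
% Discretizing with step size \Delta (zero-order hold) gives the recurrence
%   h_k = \bar{A} h_{k-1} + \bar{B} x_k, \qquad y_k = C h_k
% with \bar{A} = e^{\Delta A}, \quad \bar{B} = (\Delta A)^{-1} (e^{\Delta A} - I)\, \Delta B.
% Structured choices of A make this recurrence fast to evaluate,
% which is where the FFT and prefix-sum background enters.
```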

Course material will be posted on this website. eTL will be used for announcements, homework submission, and the release of homework and exam scores.

Ernest K. Ryu, 27-205.

Fridays 9:00–11:59am, 43-101.

Attendance 30%, homework 30%, final exam 40%.

Good knowledge of the following subjects is required.

- Basics of deep neural network architectures at the level of ResNet.
- Basics of deep neural network training at the level of SGD, Adam, and BatchNorm.
- Basic ODEs: initial value problems.
- Probability theory at the level of conditional expectations and multi-variate Gaussians. Prior exposure to SDEs is not a prerequisite.