Continuous-depth neural networks:
- Neural ODE (see the sketch after this list)
- Continuous-depth flow models
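To make "depth as a continuum" concrete, here is a minimal sketch (my own illustration, not course code) of a Neural ODE forward pass: the hidden state evolves as dz/dt = f(z, t), integrated with fixed-step Euler. The two-layer vector field, its sizes, and the step count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.1, size=(8, 2)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(2, 8)), np.zeros(2)

def f(z, t):
    """Learned vector field f(z, t); t enters as a crude additive feature."""
    h = np.tanh(W1 @ z + b1 + t)
    return W2 @ h + b2

def odeint_euler(z0, t0=0.0, t1=1.0, steps=50):
    """Integrate dz/dt = f(z, t) from t0 to t1 with fixed-step Euler."""
    z, dt = z0, (t1 - t0) / steps
    for k in range(steps):
        z = z + dt * f(z, t0 + k * dt)
    return z

print(odeint_euler(np.array([1.0, -0.5])))  # final state = output features
```

In practice an adaptive solver replaces the Euler loop, and gradients flow through the solver (or via the adjoint method) so that f can be trained.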
Diffusion models:
- Stochastic differential equations and Itô calculus (see the Euler–Maruyama sketch after this list)
- Fokker–Planck equation and reverse-time diffusion
- Diffusion generative models via stochastic differential equations
- Score matching and discrete-time diffusion models
- Conditional generation
- Latent diffusion model
- DALL·E 2, Imagen, Stable Diffusion
- Consistency trajectory models
- Note: Mathematical rigor will not be a priority for this course. For diffusion models, however, we will carry out calculations and derivations with stochastic differential equations.
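As a preview of the SDE calculations above, the following sketch (an illustration under assumed parameters, not course code) simulates the forward variance-preserving SDE dx = -(1/2) beta(t) x dt + sqrt(beta(t)) dW with Euler–Maruyama; the linear schedule endpoints are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def beta(t):
    """Linear noise schedule on [0, 1]; the endpoints 0.1 and 20 are assumptions."""
    return 0.1 + (20.0 - 0.1) * t

def forward_vp_sde(x0, steps=1000):
    """Euler-Maruyama for dx = -0.5*beta(t)*x dt + sqrt(beta(t)) dW, t in [0, 1]."""
    x, dt = x0.copy(), 1.0 / steps
    for k in range(steps):
        t = k * dt
        x = x + (-0.5 * beta(t) * x) * dt + np.sqrt(beta(t) * dt) * rng.normal(size=x.shape)
    return x

print(forward_vp_sde(np.array([2.0, -1.0])))  # roughly a draw from N(0, I)
```

Generation runs the corresponding reverse-time SDE, which requires the score of the marginal density; estimating that score is exactly what score matching provides.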
Natural language processing background:
- Sequence models and text preprocessing
- Recurrent neural networks, GRU, and LSTM
- Bidirectional RNN
- Encoder-decoder architecture and machine translation
- Bahdanau attention
- Multi-head attention and transformers (see the sketch after this list)
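Since multi-head attention is the computational core of everything in the next unit, here is a small self-contained sketch (an illustration, not course code; the sizes and random weights are assumptions) of scaled dot-product attention split across heads.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Scaled dot-product attention over X of shape (seq, d_model)."""
    seq, d_model = X.shape
    d_head = d_model // n_heads
    split = lambda M: M.reshape(seq, n_heads, d_head).transpose(1, 0, 2)
    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)  # (n_heads, seq, d_head)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)    # (n_heads, seq, seq)
    heads = softmax(scores) @ V                            # (n_heads, seq, d_head)
    return heads.transpose(1, 0, 2).reshape(seq, d_model) @ Wo

d_model, seq, n_heads = 16, 5, 4
X = rng.normal(size=(seq, d_model))
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads).shape)  # (5, 16)
```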
Large language models:
- Instruction finetuning
- Reinforcement learning with human feedback
- BERT, T5, GPT
- Scaling laws
- In-context learning
- Chain of thought prompting
- Codex
- Parameter-efficient finetuning (LoRA; see the sketch after this list)
- Hardware-aware methods: FlashAttention, QLoRA
- Training data curation and small language models: phi-X
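To illustrate the LoRA item above, here is a minimal sketch (my own illustration with assumed sizes, not course code): a frozen weight W is adapted by a trainable low-rank update (alpha/r) BA, with B initialized to zero so training starts from the pretrained model.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8.0          # rank r << d and alpha are assumptions
W = rng.normal(scale=0.1, size=(d_out, d_in))   # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))      # trainable down-projection
B = np.zeros((d_out, r))                        # trainable up-projection, zero-initialized

def lora_forward(x):
    """Adapted layer: W x + (alpha / r) * B A x; only A and B receive gradients."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
print(np.allclose(lora_forward(x), W @ x))  # True: with B = 0 the adapter is a no-op
```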
Self-supervised learning:
Vision language models:
- Vision transformer
- CLIP (see the contrastive-loss sketch after this list)
- BLIP, Flamingo, LLaVA
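The CLIP item above comes down to a symmetric contrastive loss over paired image and text embeddings; the sketch below (an illustration with random embeddings, not course code; the temperature value is an assumption) computes it in numpy.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over an (N, d) batch of paired embeddings."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (N, N) cosine-similarity logits
    n = np.arange(len(logits))                # matching pairs lie on the diagonal
    def xent(l):                              # mean cross-entropy of diagonal targets
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[n, n].mean()
    return 0.5 * (xent(logits) + xent(logits.T))

print(clip_loss(rng.normal(size=(8, 32)), rng.normal(size=(8, 32))))
```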
State-space models:
- Background: orthogonal polynomials, matrix exponential, linear state-space models
- Continuous-time recurrent memory unit
- Background: fast Fourier transform, Woodbury matrix identity
- Structured state-space model (see the discretization sketch after this list)
- Background: prefix sum, kernel fusion
- Mamba
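To ground the state-space items above, here is a minimal sketch (an illustration with assumed dimensions, not course code) of the S4-style pipeline: bilinear (Tustin) discretization of a continuous linear system x' = Ax + Bu, followed by a sequential scan of the resulting recurrence.

```python
import numpy as np

rng = np.random.default_rng(0)

def discretize_bilinear(A, B, dt):
    """Bilinear transform: A_bar = (I - dt/2 A)^-1 (I + dt/2 A), B_bar = (I - dt/2 A)^-1 dt B."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - (dt / 2) * A)
    return inv @ (I + (dt / 2) * A), inv @ (dt * B)

def ssm_scan(A_bar, B_bar, C, u):
    """Run x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k over a scalar input sequence."""
    x, ys = np.zeros(A_bar.shape[0]), []
    for u_k in u:
        x = A_bar @ x + B_bar[:, 0] * u_k
        ys.append(C @ x)
    return np.array(ys)

n = 4                                           # state dimension (assumption)
A = -np.eye(n) + 0.1 * rng.normal(size=(n, n))  # a stable-ish random system
B, C = rng.normal(size=(n, 1)), rng.normal(size=(1, n))
A_bar, B_bar = discretize_bilinear(A, B, dt=0.1)
u = np.sin(np.linspace(0, 10, 100))
print(ssm_scan(A_bar, B_bar, C, u).shape)       # (100, 1)
```

S4 makes this fast by structuring A (using the FFT and Woodbury tools listed above); Mamba makes the dynamics input-dependent and relies on a parallel prefix-sum scan with kernel fusion.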