🎮
Reinforcement Learning
From bandits to deep RL. Policy gradients, Q-learning, actor-critic, RLHF, and real-world applications.
12 chapters · First 2 chapters free to preview · Premium tier
After this course, you'll be able to:
✓ Implement Q-learning, policy gradients, and actor-critic from scratch
✓ Train models from human feedback (RLHF, a technique used to train ChatGPT)
✓ Build multi-agent RL systems
✓ Deploy RL in production applications
Full syllabus
2. Dynamic Programming
3. Monte Carlo Methods
4. Temporal Difference
5. Function Approximation
6. Policy Gradient
7. Advanced Policy Optimization
8. Model-Based RL
9. Multi-Agent RL
10. RLHF
11. Applications
12. Production Deployment
Unlock all 12 chapters
Plus 9 other courses, with 252 more chapters included.