ECS8060: AI Engineering

Lecture 12: Preference Optimisation, RLHF, Verifiable Rewards

Lecture 12 · July 22, 2026

Readings

  1. Direct Preference Optimization