xv01

RLHF

August 4, 2025

https://x.com/xleaps/status/1894560176210149655

Centrifugal governor ≈ the 18th-century “human-feedback loop” for steam engines. https://en.wikipedia.org/wiki/Centrifugal_governor

RLHF ≈ the 21st-century “human-feedback loop” for language models. https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback

Both solve the same meta-problem:

Desired output − actual output → error signal → gain-adjusted correction → stable, useful machine
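To make the loop concrete, here is a minimal sketch of it as a proportional controller. It is a toy, not a model of a real governor or of RLHF training; the function name `run_feedback_loop` and its parameters are made up for illustration.

```python
def run_feedback_loop(setpoint, initial, gain, steps):
    """Toy proportional feedback loop (hypothetical helper, for illustration only).

    Each step: error = desired - actual, then correct the output by gain * error.
    """
    output = initial
    history = [output]
    for _ in range(steps):
        error = setpoint - output   # desired output minus actual output
        output += gain * error      # gain-adjusted correction
        history.append(output)
    return history


# A moderate gain settles on the setpoint.
stable = run_feedback_loop(setpoint=100.0, initial=0.0, gain=0.5, steps=20)

# In this toy model, a gain above 2 overcorrects more on every step and diverges.
unstable = run_feedback_loop(setpoint=100.0, initial=0.0, gain=2.5, steps=20)

print(f"gain=0.5 -> final error {abs(100.0 - stable[-1]):.6f}")
print(f"gain=2.5 -> final error {abs(100.0 - unstable[-1]):.1f}")
```

The point of the sketch is the gain: too little and the machine drifts toward the target slowly, too much and each correction overshoots the last. "Stable useful machine" depends on picking it well.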