You are more or less right. By “mathematical approaches”, we mean approaches focused on building mathematical models relevant to alignment/agency/learning and finding non-trivial theorems (or at least conjectures) about these models. I’m not sure what the word “but” is doing in “but you mention RL”: there is a rich literature of mathematical inquiry into RL. For a few examples, see everything under the bullet “reinforcement learning theory” in the LTA reading list.

Thanks for the pointer! Yes, RL has a lot of research of this kind; as an empirical researcher, I just sometimes get stuck in translation.