Q-learning with Flow-Matching Policies

  • Qiyang (Colin) Li | UC Berkeley

Expressive policies such as diffusion and flow-matching policies have recently driven progress in robotic manipulation because they can model complex action distributions and generalize from just a handful of demonstrations. But most are still trained purely with supervised imitation learning. Optimizing them with off-policy reinforcement learning remains challenging, which limits their real-world applicability to tasks that require online self-improvement and adaptation. In this talk, I will discuss approaches for making off-policy RL work with flow-matching policies.

Speaker bio

Qiyang (Colin) Li is a PhD student at UC Berkeley advised by Prof. Sergey Levine. His research interests include reinforcement learning and robot learning, with a focus on leveraging offline prior experience for online exploration. Before his PhD, he was an undergraduate student at the University of Toronto advised by Prof. Roger Grosse.