Publications (by Year)

2024

Directional Smoothness and Gradient Methods: Convergence and Adaptivity [arxiv]

  • Aaron Mishkin*, Ahmed Khaled*, Yuanhao Wang, Aaron Defazio, Robert M. Gower

  • NeurIPS 2024

2023

Is RLHF More Difficult than Standard RL? [arxiv]

  • Yuanhao Wang, Qinghua Liu, Chi Jin

  • NeurIPS 2023


Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [arxiv]

  • Yuanhao Wang*, Qinghua Liu*, Yu Bai, Chi Jin

  • COLT 2023

2022

Learning Rationalizable Equilibria in Multiplayer Games [arxiv]

  • Yuanhao Wang*, Dingwen Kong*, Yu Bai, Chi Jin

  • ICLR 2023


Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits [arxiv]

  • Qinghua Liu*, Yuanhao Wang*, Chi Jin

  • ICML 2022 (oral)

2021

V-Learning – A Simple, Efficient, Decentralized Algorithm for Multiagent RL [arxiv]

  • Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu (α-β order)

  • Mathematics of Operations Research; Best Paper Award at the ICLR 2022 “Gamification and Multiagent Solutions” Workshop


An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap [arxiv]

  • Yuanhao Wang, Ruosong Wang, Sham M. Kakade

  • NeurIPS 2021 (oral)


Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization [arxiv]

  • Guodong Zhang, Yuanhao Wang, Laurent Lessard, Roger Grosse

  • AISTATS 2022

2020

Online Learning in Unknown Markov Games [arxiv]

  • Yi Tian*, Yuanhao Wang*, Tiancheng Yu*, Suvrit Sra

  • ICML 2021


On the Suboptimality of Negative Momentum for Minimax Optimization [arxiv]

  • Guodong Zhang, Yuanhao Wang

  • AISTATS 2021


Improved Algorithms for Convex-Concave Minimax Optimization [arxiv]

  • Yuanhao Wang, Jian Li

  • NeurIPS 2020

2019 and before

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach [arxiv]

  • Yuanhao Wang*, Guodong Zhang*, Jimmy Ba

  • ICLR 2020


Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP [arxiv]

  • Yuanhao Wang*, Kefan Dong*, Xiaoyu Chen, Liwei Wang

  • ICLR 2020


Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication [arxiv]

  • Yuanhao Wang*, Jiachen Hu, Xiaoyu Chen, Liwei Wang

  • ICLR 2020


16-qubit IBM universal quantum computer can be fully entangled [paper link] [arxiv]

  • Yuanhao Wang, Ying Li, Zhang-qi Yin, Bei Zeng

  • npj Quantum Information


Technical Notes

Robust Linear Regression via Least Squares [pdf]


What is Momentum for Minimax Optimization? [pdf]

  • For quadratic minimization, Chebyshev polynomials can be used to derive Polyak's momentum. Applying the same technique to minimax optimization yields a peculiar algorithm.
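
  As a quick illustration of the setting this note studies (a sketch of my own, not code from the note): Polyak's heavy-ball momentum minimizing a quadratic, with the classical step size and momentum coefficient derived from the eigenvalue range [μ, L] of the Hessian — the same parameter choice the Chebyshev argument recovers.

  ```python
  import numpy as np

  # Quadratic f(x) = 0.5 x^T A x with eigenvalues in [mu, L]; minimizer is x* = 0.
  rng = np.random.default_rng(0)
  A = np.diag(np.linspace(1.0, 100.0, 20))
  mu, L = 1.0, 100.0

  # Classical heavy-ball parameters for this eigenvalue range.
  alpha = (2.0 / (np.sqrt(L) + np.sqrt(mu))) ** 2                        # step size
  beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2  # momentum

  x = rng.standard_normal(20)
  x_prev = x.copy()
  for _ in range(200):
      grad = A @ x
      # Heavy-ball update: x_{k+1} = x_k - alpha * grad + beta * (x_k - x_{k-1})
      x, x_prev = x - alpha * grad + beta * (x - x_prev), x

  print(np.linalg.norm(x))  # distance to the minimizer, shrinks geometrically
  ```

  With κ = L/μ = 100, the iterates contract at roughly the accelerated rate (√κ − 1)/(√κ + 1) per step, which is what makes the minimax analogue in the note interesting to compare against.
  
  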


Non-asymptotic Analysis for Polyak's Momentum in Quadratic Functions [blog post]

  • Proves an explicit finite-time convergence rate for Polyak's momentum


Refined Analysis of FPL for Adversarial Markov Decision Processes [arxiv]

  • Yuanhao Wang, Kefan Dong

  • ICML 2020 Theoretical Foundations for RL Workshop