2024
Directional Smoothness and Gradient Methods: Convergence and Adaptivity [arxiv]
Aaron Mishkin*, Ahmed Khaled*, Yuanhao Wang, Aaron Defazio, Robert M. Gower
NeurIPS 2024
2023
Is RLHF More Difficult than Standard RL? [arxiv]
Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [arxiv]
2022
Learning Rationalizable Equilibria in Multiplayer Games [arxiv]
Learning Markov games with adversarial opponents: Efficient algorithms and fundamental limits [arxiv]
2021
V-Learning – A Simple, Efficient, Decentralized Algorithm for Multiagent RL [arxiv]
Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu (α-β order)
Mathematics of Operations Research, Best Paper in ICLR 2022 “Gamification and Multiagent Solutions” workshop
An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap [arxiv]
Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization [arxiv]
2020
Online learning in unknown Markov games [arxiv]
On the suboptimality of negative momentum for minimax optimization [arxiv]
Improved Algorithms for Convex-Concave Minimax Optimization [arxiv]
Yuanhao Wang, Jian Li
NeurIPS 2020
2019 and before
On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach [arxiv]
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP [arxiv]
Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication [arxiv]
16-qubit IBM universal quantum computer can be fully entangled [paper link] [arxiv]
Technical Notes
Robust Linear Regression via Least Squares [pdf]
What is Momentum for Minimax Optimization? [pdf]
Non-asymptotic Analysis for Polyak's Momentum in Quadratic Functions [blog post]
Refined Analysis of FPL for Adversarial Markov Decision Processes [arxiv]