Sumit Kunnumkal
Huseyin Topaloglu
INFORMS Journal on Computing
A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm