NCTS Seminar on PDE and Machine Learning
 
12:00 - 13:00, December 13, 2024 (Friday)
Cisco Webex (online seminar)
Convergence of Policy Iteration for Entropy-Regularized Stochastic Control Problems
Yu-Jui Huang (University of Colorado Boulder)

Abstract
For a general entropy-regularized stochastic control problem on an infinite horizon, we prove that a policy iteration algorithm (PIA) converges to an optimal relaxed control. Contrary to the standard stochastic control literature, classical Hölder estimates of value functions do not ensure the convergence of the PIA, due to the added entropy-regularizing term. To circumvent this, we carry out a delicate estimation by moving back and forth between appropriate Hölder and Sobolev spaces. This requires new Sobolev estimates designed specifically for the purpose of policy iteration and a nontrivial technique to contain the entropy growth. Ultimately, we obtain a uniform Hölder bound for the sequence of value functions generated by the PIA, thereby achieving the desired convergence result. Characterization of the optimal value function as the unique solution to an exploratory Hamilton-Jacobi-Bellman equation comes as a by-product. The PIA is numerically implemented in an example of optimal consumption.
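The talk concerns the continuous-time stochastic control setting, but the algorithmic loop of the PIA has a standard discrete analogue: alternate between evaluating the entropy-regularized value of the current relaxed (randomized) policy and improving it via a Gibbs/softmax step. The minimal sketch below illustrates that loop on a hypothetical finite MDP; the toy dynamics, rewards, and all parameter values are invented for illustration and are not from the talk.

```python
import numpy as np

# Illustrative finite-MDP analogue of entropy-regularized policy iteration.
# The toy model below (states, actions, P, r, tau, gamma) is hypothetical.
np.random.seed(0)
nS, nA = 4, 3          # number of states and actions
gamma = 0.9            # discount factor
tau = 0.5              # entropy-regularization temperature
P = np.random.dirichlet(np.ones(nS), size=(nS, nA))  # P[s, a] = next-state distribution
r = np.random.rand(nS, nA)                           # reward r[s, a]

def evaluate(pi, iters=500):
    """Policy evaluation: fixed-point iteration for the entropy-regularized
    value of the relaxed policy pi (rows of pi are action distributions)."""
    V = np.zeros(nS)
    for _ in range(iters):
        ent = -np.sum(pi * np.log(pi + 1e-12), axis=1)  # Shannon entropy of pi(.|s)
        V = (np.einsum('sa,sa->s', pi, r)               # expected reward
             + tau * ent                                # entropy bonus
             + gamma * np.einsum('sa,saj,j->s', pi, P, V))  # continuation value
    return V

def improve(V):
    """Policy improvement: the maximizer of the entropy-regularized
    Bellman operator is the Gibbs (softmax) distribution over Q-values."""
    Q = r + gamma * np.einsum('saj,j->sa', P, V)
    expQ = np.exp((Q - Q.max(axis=1, keepdims=True)) / tau)  # stabilized softmax
    return expQ / expQ.sum(axis=1, keepdims=True)

pi = np.full((nS, nA), 1.0 / nA)   # start from the uniform relaxed policy
for _ in range(50):
    pi = improve(evaluate(pi))
V = evaluate(pi)
```

The entropy term is what forces the improvement step to return a fully supported (relaxed) policy rather than a deterministic argmax; in the continuous setting of the talk, it is precisely this added term that breaks the standard Hölder-estimate route to convergence.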
 
Meeting number (access code): 2517 526 2404
Meeting password: UsntM5tpp34
 
Organizer: Te-Sheng Lin (NYCU)


 

(C) 2021 National Center for Theoretical Sciences