A Confidence Interval-Based Learning Method for Stochastic Dynamic Programs and Its Applications
Date: 2018-04-10

Abstract: Stochastic dynamic programs find various applications in economics, finance, and operations management. Their solutions offer insight into how to make decisions in a stochastic environment. However, traditional approaches based on the Hamilton-Jacobi-Bellman equation suffer from the “curse of dimensionality” when the state, randomness, and action spaces of the problem are all high-dimensional. In many cases, people therefore have to rely on approximate heuristic policies to maintain computational tractability. This motivates the following two research problems:

1. How can we assess the quality of a given policy?

2. If we know the performance of a policy is not satisfactory, do we have a systematic way to improve it?

To address these two problems, we employ the information relaxation technique in this paper to develop a value iteration method for solving stochastic dynamic programs (SDPs). The advantages of the new method are that we can construct a valid confidence interval to assess the performance of a heuristic policy and that it provides a recursive scheme for improving the policy.
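
To illustrate the idea of bracketing the optimal value with a confidence interval, here is a minimal Python sketch on a hypothetical toy problem (choosing when to sell an asset whose price follows a random walk). It is not the method presented in the talk: the dynamics, the threshold policy, and all function names are assumptions made for illustration, and the upper bound uses the simplest information relaxation, namely perfect foresight with no penalty. The heuristic policy gives a lower bound on the optimal expected sale price, the clairvoyant decision maker gives an upper bound, and Monte Carlo standard errors widen the two estimates into an interval.

```python
import random
import statistics


def simulate_path(horizon, rng):
    """Hypothetical toy dynamics: a price following a Gaussian random walk."""
    price, path = 100.0, []
    for _ in range(horizon):
        price += rng.gauss(0.0, 1.0)
        path.append(price)
    return path


def heuristic_payoff(path, threshold):
    """Heuristic policy: sell the first time the price reaches the threshold,
    otherwise sell at the final period. Uses only past and current prices."""
    for price in path:
        if price >= threshold:
            return price
    return path[-1]


def clairvoyant_payoff(path):
    """Perfect-information relaxation with zero penalty: the decision maker
    sees the whole path in advance and sells at the highest price."""
    return max(path)


def value_interval(n_paths=2000, horizon=10, threshold=101.0, z=1.96, seed=0):
    """Return (low, high), an approximate interval for the optimal value."""
    rng = random.Random(seed)
    lower, upper = [], []
    for _ in range(n_paths):
        path = simulate_path(horizon, rng)
        lower.append(heuristic_payoff(path, threshold))
        upper.append(clairvoyant_payoff(path))
    # Standard errors of the two Monte Carlo means.
    lower_se = statistics.stdev(lower) / n_paths ** 0.5
    upper_se = statistics.stdev(upper) / n_paths ** 0.5
    low = statistics.fmean(lower) - z * lower_se
    high = statistics.fmean(upper) + z * upper_se
    return low, high


if __name__ == "__main__":
    low, high = value_interval()
    print(f"Approximate interval for the optimal value: [{low:.3f}, {high:.3f}]")
```

A wide interval indicates that either the heuristic policy is far from optimal or the relaxation is loose. In the information relaxation literature, the upper bound is typically tightened by adding a penalty for using future information, which this sketch omits.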

Bio: Nan Chen is an associate professor in the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong. His research interests are quantitative methods in finance and risk management, Monte Carlo simulation, and applied probability. He has published in top journals and refereed conference proceedings in the fields of operations research and quantitative finance, such as Review of Financial Studies, Operations Research, Mathematics of Operations Research, Mathematical Finance, Finance and Stochastics, and Journal of Economic Dynamics and Control.