AJCM, Vol. 1 No. 3, September 2011
Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes
Abstract: We consider risk minimization problems for Markov decision processes. To make the risk of the random reward variable at each time as small as possible, we introduce a risk measure based on the conditional value-at-risk of the random immediate reward variables in Markov decision processes. Under this risk criterion, the risk-optimal policies are characterized by the optimality equations for the discounted or average case. As an application, inventory models are considered.
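As a rough illustration of the risk measure named in the abstract, the following sketch computes the conditional value-at-risk of a discrete loss distribution using the Rockafellar-Uryasev minimization formula, CVaR_alpha(L) = min over c of c + E[(L - c)^+] / (1 - alpha). It is not the paper's algorithm: the function name `cvar`, its arguments, and the convention of treating a loss as the negative of an immediate reward are assumptions made here for illustration only.

```python
def cvar(losses, probs, alpha):
    """Conditional value-at-risk at level alpha of a discrete loss distribution.

    Hypothetical helper (not from the paper). `losses` are the support points,
    `probs` their probabilities, and 0 <= alpha < 1 the confidence level.
    To apply it to rewards, pass loss = -reward (an assumed convention).
    """
    def objective(c):
        # c + E[(L - c)^+] / (1 - alpha): the Rockafellar-Uryasev objective
        excess = sum(p * max(l - c, 0.0) for l, p in zip(losses, probs))
        return c + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in c with kinks only at the
    # support points, so its minimum is attained at one of them.
    return min(objective(c) for c in losses)


if __name__ == "__main__":
    # Four equally likely losses; at alpha = 0.5 the result equals the mean
    # of the worst half of the outcomes (2 and 10).
    print(cvar([0.0, 1.0, 2.0, 10.0], [0.25] * 4, 0.5))
```

In a Markov decision process setting, one would evaluate such a quantity for the immediate reward distribution induced by each state-action pair; how the paper combines these values across time is given by its optimality equations and is not reproduced here.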
Cite this paper: M. Kageyama, T. Fujii, K. Kanefuji and H. Tsubaki, "Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes," American Journal of Computational Mathematics, Vol. 1 No. 3, 2011, pp. 183-188. doi: 10.4236/ajcm.2011.13021.