Robust analysis of discounted Markov decision processes with uncertain transition probabilities

Zhen kai Lou, Fu jun Hou*, Xu ming Lou

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Optimal policies in Markov decision processes (MDPs) can be highly sensitive to the transition probabilities, and in practice some of those probabilities are uncertain. The present study has two goals: to find the robust range for a given optimal policy, and to obtain value intervals of the exact transition probabilities. We first propose a method for estimating unknown transition probabilities based on maximum likelihood. Because the estimates may be far from accurate, and because the highest expected total reward of the MDP may be sensitive to these transition probabilities, we then analyze the robustness of an optimal policy and propose an approach for robust analysis. After defining a robust optimal policy for uncertain transition probabilities represented as sets of numbers, we formulate a model to obtain that policy. Finally, we define the value intervals of the exact transition probabilities and construct models to determine their lower and upper bounds. Numerical examples illustrate the practicability of the methods.
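
The sensitivity described in the abstract can be illustrated with a small sketch. This is not the authors' method: it assumes the textbook count-based maximum-likelihood estimate (row-normalized transition counts) and plain value iteration, and the two-state, two-action MDP, the reward values, the counts, and the function names `mle_transitions` and `value_iteration` are all hypothetical choices made only for illustration.

```python
import numpy as np

def mle_transitions(counts):
    """Row-normalize transition counts (assumed multinomial MLE, not the paper's model)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=-1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("every (action, state) pair needs at least one observation")
    return counts / totals

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Solve a discounted MDP with P[a, s, s'] and R[a, s]; return (values, greedy policy)."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)      # Q[a, s]: one-step lookahead for every action
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Hypothetical toy MDP. State 1 is absorbing and pays 2 per step.
# In state 0, action 0 pays 1 and stays put; action 1 pays 0 and
# moves to state 1 with an unknown probability p estimated from data.
R = np.array([[1.0, 2.0],    # action 0: rewards in states 0 and 1
              [0.0, 2.0]])   # action 1: rewards in states 0 and 1

def counts_with(stay, move):
    """Observed counts[a][s][s'], varying only the (action 1, state 0) row."""
    return [[[10, 0], [0, 10]],
            [[stay, move], [0, 10]]]

P_a = mle_transitions(counts_with(8, 2))     # estimated p = 0.2
P_b = mle_transitions(counts_with(19, 1))    # estimated p = 0.05

V_a, policy_a = value_iteration(P_a, R)
V_b, policy_b = value_iteration(P_b, R)
print("p = 0.20 -> action in state 0:", policy_a[0])
print("p = 0.05 -> action in state 0:", policy_b[0])
```

In this toy instance with discount factor 0.9, the preferred action in state 0 switches at p = 1/9, so two plausible estimates of the same probability lead to different optimal policies. That is the kind of sensitivity that motivates asking over what range of probabilities a policy stays optimal.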

    Original language: English
    Pages (from-to): 417-436
    Number of pages: 20
    Journal: Applied Mathematics
    Volume: 35
    Issue number: 4
    DOIs: https://doi.org/10.1007/s11766-020-3664-1
    Publication status: Published - Oct 2020

    Keywords

    • 60J10
    • 90C05
    • 90C40
    • Markov decision processes
    • robust optimal policy
    • robustness and sensitivity
    • uncertain transition probabilities
    • value interval

    Cite this

    Lou, Z. K., Hou, F. J., & Lou, X. M. (2020). Robust analysis of discounted Markov decision processes with uncertain transition probabilities. Applied Mathematics, 35(4), 417-436. https://doi.org/10.1007/s11766-020-3664-1