Abstract
We consider a finite controlled Markov chain, the description of which depends on an unknown parameter a, and investigate the following control policy. To each a an optimal stationary control is associated. a is estimated recurrently from the trajectory by the minimum contrast method, and the optimal stationary control corresponding to the estimate is used. We present asymptotic properties of the estimate and of the criterion function. They follow from the law of large numbers and from the central limit theorem for controlled Markov chains derived with the aid of martingales.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.