Abstract
We observe sequentially n independent Bernoulli random variables I_k with parameters p_k, k = 1, ..., n. We consider an extension of the last-success problem in which the player receives reward w_k for correctly predicting at step k that I_k = 1 is the last success. Using dynamic programming, we establish the optimal strategy for a payoff function that generalizes the 0-1 payoff of the last-success problem. In particular, we show that this method is intuitive and very efficient for general payoffs.
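As a rough illustration of the backward-induction idea, here is a minimal Python sketch. It assumes the payoff is w_k when the player stops on a success at step k and no later success occurs, and 0 otherwise. The function name `last_success_dp` and the exact form of the recursion are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple


def last_success_dp(p: List[float], w: List[float]) -> Tuple[float, List[bool]]:
    """Backward induction for a weighted last-success problem (illustrative sketch).

    p[k] is the success probability at step k, and w[k] is the reward for
    correctly declaring a success at step k to be the last one (0-based k).
    Returns the optimal expected reward and, for each step, whether the
    player should stop if a success is observed there.
    """
    if len(p) != len(w):
        raise ValueError("p and w must have the same length")

    n = len(p)
    value = 0.0      # optimal expected reward from the steps after k
    no_more = 1.0    # probability of no success after step k
    stop = [False] * n

    for k in range(n - 1, -1, -1):
        # Expected reward of stopping on a success at step k.
        stop_payoff = w[k] * no_more
        # Stop if that is at least as good as continuing.
        stop[k] = stop_payoff >= value
        # With probability 1 - p[k] there is no success and we must continue;
        # with probability p[k] we take the better of stopping and continuing.
        value = (1.0 - p[k]) * value + p[k] * max(stop_payoff, value)
        no_more *= 1.0 - p[k]

    return value, stop


# Classical 0-1 payoff (all rewards equal to 1) with three fair coins.
print(last_success_dp([0.5, 0.5, 0.5], [1.0, 1.0, 1.0]))
# → (0.5, [False, True, True])
```

The loop runs once over the n steps, so under these assumptions the optimal value and the stopping rule are obtained in O(n) time for arbitrary rewards w_k.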