Abstract

Many approximate iterative algorithms can be represented by the form $V_n=TV_{n-1}+U_n$, $n\geq1$, where $V_{n-1},U_n$, $n\geq1$, are elements of a seminormed linear space $(\mathcal{V},\|\cdot\|)$ and $T$ is a contractive operator. The objective is usually to calculate a fixed point $V^*$ of $T$. Here, the quantities $U_n$ may be interpreted as the errors obtained when the operator $T$ can be evaluated only approximately. As is well known, in the absence of such an approximation error, the algorithm converges to a fixed point with a geometric rate under general conditions. In this article the convergence properties of such algorithms in the presence of an approximation error $U_n$ are studied. It is shown that the error $\|V_n-V^*\|$ is dominated by $\|U_n\|$, so that convergence of $\|U_n\|$ to zero implies the convergence of $\|V_n-V^*\|$ to zero, at a rate determined by $\|U_n\|$. The results are naturally extended to a relative error of the form $\|U_n\|/\|V_{n-1}\|$, as well as to $J$-stage contractions. The utility of this general theory is then demonstrated by an extended application to the problem of model-based approximate and adaptive control of Markov decision processes. The theory is shown to permit a sharpening of known convergence rates under more general conditions. Additionally, bounds on regret for adaptive controls with forced exploration are calculated in terms of a stagewise exploration rate. This permits the determination of an optimal choice of exploration rate within the class of certainty-equivalence adaptive control policies.
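The recursion $V_n=TV_{n-1}+U_n$ can be illustrated with a minimal numerical sketch. Everything concrete here is an assumption chosen for illustration and is not taken from the article: the scalar contraction $T(v)=\beta v+1$ with $\beta=0.5$, the error sequence $U_n=1/n$, and the helper names `approximate_iteration` and `triangle_bound`. The comparison bound is the elementary one obtained from the contraction property and the triangle inequality, $\|V_n-V^*\|\le\beta^n\|V_0-V^*\|+\sum_{k=1}^{n}\beta^{n-k}\|U_k\|$, which may be weaker than the results the article derives.

```python
def approximate_iteration(T, v0, errors):
    """Run V_n = T(V_{n-1}) + U_n and return the list [V_0, V_1, ..., V_N]."""
    vs = [v0]
    for u in errors:
        vs.append(T(vs[-1]) + u)
    return vs


beta = 0.5                      # assumed contraction modulus


def T(v):
    """Toy contraction on the real line (hypothetical example operator)."""
    return beta * v + 1.0


v_star = 1.0 / (1.0 - beta)     # fixed point of T, solves v = beta * v + 1

N = 1000
U = [1.0 / n for n in range(1, N + 1)]   # assumed errors with |U_n| -> 0

V = approximate_iteration(T, 0.0, U)
gaps = [abs(v - v_star) for v in V]      # |V_n - V*| for n = 0, ..., N


def triangle_bound(n):
    """Elementary bound: beta^n |V_0 - V*| + sum_k beta^(n-k) |U_k|."""
    tail = sum(beta ** (n - k) * abs(U[k - 1]) for k in range(1, n + 1))
    return beta ** n * gaps[0] + tail


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, gaps[n], triangle_bound(n))
```

In this toy setting the gap $|V_n-V^*|$ shrinks as $|U_n|$ does, rather than geometrically, which is consistent with the abstract's statement that the error sequence governs the convergence rate.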


Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.