We consider a misspecified optimization problem, requiring the minimization of a function $$f(\cdot;\theta ^*)$$ over a closed and convex set X, where $$\theta ^*$$ is an unknown vector of parameters that may be learnt by a parallel learning process. Here, we develop coupled schemes that generate iterates $$(x_k,\theta _k)$$ such that, as $$k \rightarrow \infty$$, $$x_k \rightarrow x^*$$, a minimizer of $$f(\cdot;\theta ^*)$$ over X, and $$\theta _k \rightarrow \theta ^*$$. In the first part of the paper, we consider the solution of problems where f is either smooth or nonsmooth. In smooth strongly convex regimes, we demonstrate that such schemes still display a linear rate of convergence, albeit with larger constants. When strong convexity assumptions are weakened, it can be shown that the canonical convergence rate of $${\mathcal {O}}(1/K)$$ in function value is modified by an additive term proportional to $$\Vert \theta _0-\theta ^*\Vert$$, which represents the initial misspecification in $$\theta ^*$$. In addition, when the learning problem is assumed to be merely convex but admits a suitable weak-sharpness property, the convergence rate deteriorates to $${\mathcal {O}}(1/\sqrt{K})$$. In both convex and strongly convex regimes, diminishing steplength schemes are also provided; these are less reliant on the knowledge of problem parameters. Finally, we present an averaging-based subgradient scheme that displays a rate of $${\mathcal {O}}(1/\sqrt{K})+ {\mathcal {O}}(\Vert \theta _0-\theta ^*\Vert /K)$$, implying that misspecification has no effect on the canonical rate of $${\mathcal {O}}(1/\sqrt{K})$$. In the second part of the paper, we consider the solution of misspecified monotone variational inequality problems, motivated by the need to contend with more general equilibrium problems as well as the possibility of misspecification in the constraints.
In this context, we first present a constant steplength misspecified extragradient scheme and prove its asymptotic convergence. Because this scheme is reliant on problem parameters (such as Lipschitz constants), it motivates a misspecified variant of iterative Tikhonov regularization, an avenue that does not necessitate the knowledge of such constants.
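
To make the idea of a coupled scheme concrete, the following Python sketch runs simultaneous projected-gradient updates in $$x$$ and $$\theta$$ on a toy instance. It is only an illustration under stated assumptions, not the algorithm statement from the paper: the specific update form, the toy objective $$f(x;\theta ) = \tfrac{1}{2}\Vert x-\theta \Vert ^2$$, the toy learning problem, the constant steplengths, and all function and variable names are hypothetical choices made for this example.

```python
# Illustrative sketch only. The toy objective, the toy learning problem, the
# step sizes, and every name below are assumptions made for this example.

def project_box(v, lo, hi):
    """Euclidean projection of the vector v onto the box [lo, hi]^n."""
    return [min(max(vi, lo), hi) for vi in v]


def coupled_projected_gradient(x0, theta0, grad_f, grad_g, proj_x, proj_theta,
                               step_x, step_theta, num_iters):
    """Run simultaneous projected-gradient updates in x and theta."""
    x, theta = list(x0), list(theta0)
    for _ in range(num_iters):
        # Computation step: uses the current estimate theta_k, not theta*.
        gx = grad_f(x, theta)
        x_next = proj_x([xi - step_x * gi for xi, gi in zip(x, gx)])
        # Learning step: does not depend on x, so it can run in parallel.
        gt = grad_g(theta)
        theta_next = proj_theta([ti - step_theta * gi for ti, gi in zip(theta, gt)])
        x, theta = x_next, theta_next
    return x, theta


# Toy instance (hypothetical): f(x; theta) = 0.5 * ||x - theta||^2 over X = [0, 1]^2,
# with theta* recovered by minimizing g(theta) = 0.5 * ||theta - data||^2.
data = [0.3, 1.7]  # plays the role of the unknown theta*

x_final, theta_final = coupled_projected_gradient(
    x0=[0.0, 0.0],
    theta0=[-2.0, 4.0],  # deliberately misspecified initial estimate
    grad_f=lambda x, th: [xi - ti for xi, ti in zip(x, th)],
    grad_g=lambda th: [ti - di for ti, di in zip(th, data)],
    proj_x=lambda v: project_box(v, 0.0, 1.0),
    proj_theta=lambda v: project_box(v, -5.0, 5.0),
    step_x=0.5,
    step_theta=0.5,
    num_iters=200,
)
print(x_final, theta_final)
```

In this toy instance the minimizer of $$f(\cdot;\theta ^*)$$ over the box is the projection of $$\theta ^*$$ onto it, namely $$(0.3, 1.0)$$, so the iterates can be checked against a known answer. The $$x$$-update never sees $$\theta ^*$$ directly; it only uses the running estimate $$\theta _k$$, which is the feature that distinguishes a coupled scheme from first solving the learning problem and then optimizing.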