General parametric forms are assumed for the conditional mean λt(θ0) and variance υt of a time series. These conditional moments can, for instance, be derived from count time series, Autoregressive Conditional Duration, or Generalized Autoregressive Score models. In this paper, our aim is to estimate the conditional mean parameter θ0 while remaining as agnostic as possible about the conditional distribution of the observations. Quasi-Maximum Likelihood Estimators (QMLEs) based on the linear exponential family fulfill this goal, but they may be inefficient and have complicated asymptotic distributions when θ0 contains boundary coefficients. We thus study alternative Weighted Least Squares Estimators (WLSEs), which enjoy the same consistency property as the QMLEs when the conditional distribution is misspecified, but have simpler asymptotic distributions when some components of θ0 are equal to zero, and which gain in efficiency when υt is well specified. We compare the asymptotic properties of the QMLEs and WLSEs, and determine a data-driven strategy for finding an asymptotically optimal WLSE. Simulation experiments and illustrations on realized volatility forecasting are presented.
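To fix ideas, the two classes of estimators can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: y_t denotes the observations, n the sample size, Θ the parameter space, and w_t a positive weight that depends only on past observations. The Poisson QMLE is shown as one familiar member of the linear exponential family; other members lead to analogous criteria.

```latex
% Illustrative notation only: y_t, n, \Theta and w_t are assumptions.
\begin{align*}
  % Weighted least squares: squared forecast errors of the conditional mean,
  % each divided by a positive weight w_t.
  \hat{\theta}_{\mathrm{WLS}}
    &= \arg\min_{\theta \in \Theta} \frac{1}{n} \sum_{t=1}^{n}
       \frac{\{y_t - \lambda_t(\theta)\}^2}{w_t}, \\
  % Poisson quasi-likelihood: maximized whether or not y_t is actually
  % conditionally Poisson.
  \hat{\theta}_{\mathrm{PQML}}
    &= \arg\max_{\theta \in \Theta} \frac{1}{n} \sum_{t=1}^{n}
       \{ y_t \log \lambda_t(\theta) - \lambda_t(\theta) \}.
\end{align*}
```

Both criteria involve the data only through the conditional mean λt(θ), which is why a misspecified conditional distribution need not affect consistency. In this sketch, the efficiency gain under a well-specified υt would plausibly correspond to choosing weights w_t close to, or proportional to, υt.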