Many methods have been proposed for learning and predicting time series, ranging from traditional time-series analysis to recent approaches based on neural networks. An issue common to all of them is how to determine the model structure. Both the mean prediction error and Akaike's information criterion (AIC) are useful for model selection: from a set of candidate models, the one with the smallest mean prediction error or AIC is selected as the best. In this sense they provide a solution to the model selection problem. Because the search space is huge, however, the mean prediction error or AIC alone is not powerful enough to find the best model structure among all possible candidates. In the present paper the authors propose combining structural learning with forgetting with the mean prediction error or AIC to find a model with better generalization ability. Jordan networks and buffer networks, both popular in time-series modeling, are examined. Structural learning with forgetting and backpropagation (BP) learning are each applied to these two types of models to compare their learning and prediction performance. Simulation results demonstrate that structural learning with forgetting achieves better generalization than BP learning in both Jordan networks and buffer networks.
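The selection rule described above, choosing the candidate with the smallest AIC, can be illustrated with a minimal sketch. It assumes the common least-squares form of the criterion under Gaussian residuals, AIC = N ln(RSS/N) + 2k up to an additive constant; the function names, the candidate labels, and all numbers are hypothetical and are not taken from the paper's simulations.

```python
import math

def aic(rss, n_samples, n_params):
    """AIC for a least-squares fit with Gaussian residuals (additive constant dropped).

    Assumed form: N * ln(RSS / N) + 2 * k.
    """
    return n_samples * math.log(rss / n_samples) + 2 * n_params

def select_model(candidates, n_samples):
    """Return (best_name, scores) where best_name has the smallest AIC.

    candidates maps a model name to a tuple
    (residual sum of squares, number of free parameters).
    """
    scores = {name: aic(rss, n_samples, k) for name, (rss, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical candidates, for illustration only: a larger model lowers the
# residual error slightly but pays a larger parameter penalty.
candidates = {
    "small": (12.0, 9),
    "medium": (8.0, 17),
    "large": (7.8, 33),
}
best, scores = select_model(candidates, n_samples=200)
print(best)
```

With these made-up numbers the medium-sized candidate is selected: the largest model fits the training data marginally better, but its extra parameters outweigh the gain. A rule of this kind only ranks the candidates it is given, which is why the search over structures has to be handled separately.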