It is important that the update of model parameters and structure be achieved without catastrophic forgetting. Therefore, a balance between continuous learning and forgetting is necessary to deal with non-stationary environments. A considerable body of research has been devoted to the design of models (e.g. classifiers, cluster models, time-series models) whose operating environments are assumed to be static, i.e. where the system behavior and operating conditions/modes do not change over time. However, this is a rather strong restriction, as the characteristics of systems and their environmental conditions usually change and evolve over time. A typical example of a changing environment is spam detection and filtering: the descriptions of the two classes "spam" and "non-spam" evolve over time due to changes in user preferences and in the techniques spammers use to trick spam classifiers. Another example is the behavior of human beings, which may be affected by different experiences, moods, daily conditions etc.; mimicking their cognitive capabilities in the form of neural models thus requires a permanent adaptation and regulation of the induced networks. In the case of condition monitoring systems employing multi-sensor networks, new sensors may be added whose measured variables or classes should ideally be integrated on-the-fly into a large network of identified models describing the relations and dependencies present within the system. In many applications, models rely on off-line training cycles over historic data samples, which are often costly to obtain or need to be pre-annotated to establish an initial model (e.g. in the case of classifiers), especially when a sufficient coverage of the feature space is to be guaranteed; such issues are cost-intensive for companies and thus decrease the applicability and attractiveness of data-driven models in industrial and healthcare systems.

The computerization of many life activities and the advances in data collection and storage technology yield mountains of data, collected to capture information about a phenomenon or the behavior of a process. These data are rarely of direct benefit; thus, a set of techniques and tools is used to extract useful information for decision support, prediction, exploration and understanding of the phenomena governing the data sources. Learning methods use historic data points about a process's past behavior to build a predictor (classifier, regression model, time-series model), which serves as accumulated experience for predicting the process's future behavior. However, the predictor needs to adjust itself (self-correction or adaptation) as new events happen or new conditions/system states occur (e.g. during on-line operation). The goal is to ensure an accurate prediction of the process behavior according to the characteristics of new incoming data. This requires continuous learning over long periods of time, with the ability to evolve new structural components on demand and to forget data that has become obsolete and useless. Incremental and sequential learning are essential concepts for avoiding time-intensive re-training phases and for accounting for the system's dynamics and changing data characteristics with low computational effort and memory usage (enhancing on-line performance), since data is processed in a sample-wise, single-pass manner.
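As a concrete illustration of sample-wise, single-pass learning with gradual forgetting, the following minimal Python sketch implements recursive least squares with an exponential forgetting factor, one classical instance of the incremental adaptation described above. The class name `ForgettingRLS` and all parameter values are illustrative assumptions, not taken from the text.

```python
import numpy as np

class ForgettingRLS:
    """Recursive least squares with an exponential forgetting factor.

    A minimal sketch of sample-wise, single-pass learning: each new
    sample updates the linear model y = w.x in O(d^2) time without
    revisiting past data, and a forgetting factor lam < 1 down-weights
    old samples so the model can track a drifting, non-stationary process.
    """

    def __init__(self, dim, lam=0.99, delta=1000.0):
        self.lam = lam                 # forgetting factor in (0, 1]
        self.w = np.zeros(dim)         # model parameters
        self.P = delta * np.eye(dim)   # inverse-covariance estimate

    def update(self, x, y):
        """Incorporate one sample (x, y) and return the prediction error."""
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)   # gain vector for this sample
        e = y - self.w @ x             # error before the update
        self.w = self.w + k * e        # parameter correction
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return e

    def predict(self, x):
        return self.w @ x


# Toy usage: track a slowly drifting linear relation in a single pass.
rng = np.random.default_rng(0)
model = ForgettingRLS(dim=2, lam=0.98)
w_true = np.array([1.0, -0.5])
for t in range(2000):
    w_true += 0.001 * rng.standard_normal(2)    # slow concept drift
    x = rng.standard_normal(2)
    y = w_true @ x + 0.01 * rng.standard_normal()
    model.update(x, y)
```

Choosing the forgetting factor embodies the balance discussed above: values close to 1 retain a long memory (stable but slow to adapt), while smaller values forget quickly (responsive but noisier), mirroring the trade-off between continuous learning and forgetting in non-stationary environments.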