Abstract

There are a wide variety of techniques for system identification. A critical issue common to all of these techniques is selecting the appropriate model complexity. In particular, system identification algorithms based on two popular clustering techniques, subtractive clustering and fuzzy c-means (FCM) clustering, require that the number of underlying partitions be selected ahead of time; that is, they do not automatically choose the appropriate complexity to model the data, they only find the best-fit model for a given complexity. A model with an overly restricted complexity gives poor predictions on new data, since it has too little flexibility (yielding high bias and low variance). By contrast, a model with too much complexity also gives poor generalization performance, since it is too flexible and fits too much of the noise in the training data (yielding low bias but high variance). Bias and variance are complementary quantities, and the complexity must be assigned optimally in order to achieve the best compromise between them. In this paper we propose a general criterion for choosing an appropriate complexity based on a simple resampling approach: we derive a generalization of the analytic form of the leave-one-out cross-validation risk estimator, which can then be used to determine the optimal complexity of a model for a given data set.
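
To make the idea of complexity selection via an analytic leave-one-out (LOO) risk concrete, the sketch below illustrates the standard setting in which such a closed form exists: a model that is linear in its parameters. For ordinary least squares, the LOO residual at point n equals (y_n - yhat_n) / (1 - H_nn), where H is the hat matrix, so the LOO risk can be computed without refitting the model N times. This is only an illustrative stand-in for the paper's estimator: the Gaussian RBF basis, quantile-based center placement, ridge term, function names, and toy data are all assumptions introduced here, not the authors' algorithm; the number of basis centers plays the role of the number of cluster-induced partitions.

```python
# Illustrative sketch (not the paper's exact method): choose the number of RBF
# basis functions, a stand-in for the number of fuzzy partitions, by minimizing
# the analytic leave-one-out (LOO) mean squared error of a linear-in-parameters
# model. For least squares the LOO residual at point n is
# (y_n - yhat_n) / (1 - H_nn), with H = Phi (Phi^T Phi)^{-1} Phi^T the hat matrix.
import numpy as np

def rbf_features(x, centers, width):
    """Gaussian RBF design matrix with a constant (bias) column."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.hstack([np.ones((x.size, 1)), np.exp(-d2 / (2.0 * width ** 2))])

def loo_risk(x, y, n_centers, ridge=1e-8):
    """Analytic LOO mean squared error for an RBF model with n_centers centers.

    A small ridge term is added for numerical stability; the LOO identity is
    exact for ordinary least squares.
    """
    centers = np.quantile(x, np.linspace(0.05, 0.95, n_centers))  # crude center placement
    width = (x.max() - x.min()) / max(n_centers, 1)
    Phi = rbf_features(x, centers, width)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    w = np.linalg.solve(A, Phi.T @ y)                 # fitted weights
    H_diag = np.einsum("ij,jk,ik->i", Phi, np.linalg.inv(A), Phi)  # diag of hat matrix
    loo_resid = (y - Phi @ w) / (1.0 - H_diag)        # LOO residuals without refitting
    return np.mean(loo_resid ** 2)

# Toy usage: pick the complexity (number of centers) that minimizes the LOO risk.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3, 3, 80))
y = np.sin(2 * x) + 0.1 * rng.standard_normal(x.size)
candidates = range(2, 15)
risks = [loo_risk(x, y, c) for c in candidates]
print("LOO-selected number of basis functions:", candidates[int(np.argmin(risks))])
```

With too few centers the LOO risk stays high (underfitting, high bias); with too many it rises again as the model fits noise (overfitting, high variance), so the minimizer of the LOO risk corresponds to the bias-variance compromise described above.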
