Abstract
Fuzzy ARTMAP is a family of neural network architectures based on adaptive resonance theory (ART) that supports supervised learning. However, it tends to create more categories than are actually needed. This often causes overfitting, where the performance of a fuzzy ARTMAP network on the test set does not increase monotonically with additional training epochs and category creation. To avoid overfitting, Carpenter and Tan (1993) proposed a confidence-based pruning method that eliminates categories that are either less useful or less accurate. This paper proposes an alternative pruning method based on the minimum description length (MDL) principle. The MDL principle can be viewed as a tradeoff between the complexity of a theory and the accuracy with which the theory predicts the data. We adopted Cameron-Jones's (1992) error encoding scheme and Quinlan's (1994, 1995) modification for theory encoding to estimate the description length of a fuzzy ARTMAP theory. A greedy MDL search algorithm is proposed to prune the fuzzy ARTMAP categories one by one. Experiments showed that a fuzzy ARTMAP pruned with the MDL principle performed better, with far fewer categories, than the original fuzzy ARTMAP and other machine-learning systems on a number of benchmark clinical databases, including heart disease, breast cancer and diabetes databases.
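The greedy, one-category-at-a-time pruning described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's method: the description length is a simplified stand-in (a flat `bits_per_category` cost for the theory plus a basic exceptions code for the errors) and not the Cameron-Jones or Quinlan encodings, and the classifier is a toy one-dimensional nearest-interval rule rather than fuzzy ARTMAP. The names `description_length`, `greedy_mdl_prune`, `predict` and `count_errors` are hypothetical.

```python
import math
from typing import Callable, List, Tuple

# Toy category: (lower bound, upper bound, class label) on a 1-D feature.
Category = Tuple[float, float, int]


def description_length(n_categories: int, n_errors: int,
                       n_samples: int, bits_per_category: float) -> float:
    """Simplified MDL cost in bits (an illustrative stand-in, not the paper's encoding)."""
    # Theory cost: a flat, assumed number of bits per retained category.
    theory_bits = n_categories * bits_per_category
    # Error cost: state how many samples are misclassified, then which ones.
    error_bits = math.log2(n_samples + 1) + math.log2(math.comb(n_samples, n_errors))
    return theory_bits + error_bits


def greedy_mdl_prune(categories: List[Category],
                     count_errors: Callable[[List[Category]], int],
                     n_samples: int,
                     bits_per_category: float) -> Tuple[List[Category], float]:
    """Remove one category at a time while the total description length decreases."""
    kept = list(categories)
    best = description_length(len(kept), count_errors(kept), n_samples, bits_per_category)
    while len(kept) > 1:
        # Score every single-category removal from the current set.
        candidates = []
        for i in range(len(kept)):
            trial = kept[:i] + kept[i + 1:]
            bits = description_length(len(trial), count_errors(trial),
                                      n_samples, bits_per_category)
            candidates.append((bits, i))
        bits, i = min(candidates)
        if bits >= best:
            break  # no removal shortens the description, so stop
        best = bits
        del kept[i]
    return kept, best


def predict(categories: List[Category], x: float) -> int:
    """Toy rule: nearest interval wins; ties go to the narrower interval."""
    def key(cat: Category) -> Tuple[float, float]:
        lo, hi, _ = cat
        return (max(lo - x, x - hi, 0.0), hi - lo)
    return min(categories, key=key)[2]


# Synthetic data: two clean clusters plus one noisy label inside the class-0 region.
data = [(k / 50, 0) for k in range(20)] + [((30 + k) / 50, 1) for k in range(20)]
data.append((0.21, 1))

# Two broad categories and one narrow category that only fits the noisy point.
categories = [(0.0, 0.4, 0), (0.6, 1.0, 1), (0.205, 0.215, 1)]


def count_errors(cats: List[Category]) -> int:
    return sum(predict(cats, x) != label for x, label in data)


initial_bits = description_length(len(categories), count_errors(categories), len(data), 8.0)
kept, final_bits = greedy_mdl_prune(categories, count_errors, len(data), 8.0)
print(kept, round(initial_bits, 2), round(final_bits, 2))
```

In this toy setting the narrow category costs more bits to keep than the single training error it prevents, so it is pruned, while removing either broad category would add far more error bits than it saves. The balance depends entirely on the assumed `bits_per_category`; a real implementation would derive the theory cost from the encoding of the category weights.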