Abstract

The data on which an MLP (multilayer perceptron) is trained to approximate a continuous function may include categorical inputs in addition to numeric (quantitative) inputs. Examples of categorical variables are gender, race, and so on. The approach examined in this article is to train a hybrid network consisting of an MLP and an encoder with multiple output units, that is, a separate output unit for each combination of values of the categorical variables. Input to the feedforward subnetwork of the hybrid network is then restricted to truly numerical quantities. An MLP, with connection matrices that multiply input values and sigmoid functions that further transform them, represents a mapping that is continuous in all input variables, and therefore requires that every input correspond to a numeric, continuously valued variable. A categorical variable, on the other hand, produces a discontinuous relationship between an input variable and the output. This problem is often dealt with by replacing the categorical values with numeric codes and treating them as if they were continuously valued. However, there is no meaningful correspondence between the continuous quantities generated this way and the original categorical values. The basic difficulty with such coded variables is that they impose a metric on the categories that may not be reasonable. This suggests that the categorical inputs should be segregated from the continuous inputs, as described above. Results show that the method, which uses a hybrid network and separates categorical from numerical input, is quite effective. © 2004 Wiley Periodicals, Inc. Int J Int Syst 19: 979–1001, 2004.
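
The point about numeric coding imposing a metric can be made concrete with a small sketch. The code below is an illustration only and is not taken from the article: the variable names (`colour`, `size`), their levels, and the helper functions `integer_code` and `one_unit_per_combo` are all hypothetical. It compares distances between categories under a naive integer coding with distances under a coding that assigns one unit to each combination of categorical values, which is one plausible reading of the encoder outputs described above.

```python
from itertools import product
from math import dist

# Hypothetical categorical variables and their levels (illustration only).
levels = {
    "colour": ["red", "green", "blue"],
    "size": ["small", "large"],
}

# Enumerate every combination of categorical values and give each its own unit.
combos = list(product(*levels.values()))
unit_index = {combo: i for i, combo in enumerate(combos)}


def one_unit_per_combo(combo):
    """Return a vector with a single active unit for this combination."""
    vec = [0.0] * len(combos)
    vec[unit_index[combo]] = 1.0
    return vec


def integer_code(combo):
    """Naive coding: replace each category by its position in the level list."""
    return [float(levels[name].index(value)) for name, value in zip(levels, combo)]


a, b, c = ("red", "small"), ("green", "small"), ("blue", "small")

# Integer coding places red nearer to green than to blue,
# purely because of the arbitrary order in which the levels were listed.
int_ab = dist(integer_code(a), integer_code(b))
int_ac = dist(integer_code(a), integer_code(c))

# With one unit per combination, all distinct combinations are equidistant,
# so no ordering or spacing is implied between categories.
hot_ab = dist(one_unit_per_combo(a), one_unit_per_combo(b))
hot_ac = dist(one_unit_per_combo(a), one_unit_per_combo(c))

print("integer coding:", int_ab, int_ac)
print("one unit per combination:", hot_ab, hot_ac)
```

Under the integer coding the two distances differ, although nothing in the data says that red is "closer" to green than to blue. With one unit per combination the distances are equal, at the cost of a number of units that grows with the product of the level counts.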
