A feed-forward network for input that is both categorical and quantitative

Roelof K Brouwer

doi:10.1016/s0893-6080(02)00090-4

Abstract

The data on which a multi-layer perceptron (MLP) is to be trained to approximate a continuous function may have inputs that are categorical, such as color, gender, or race, rather than numeric or quantitative. A categorical variable causes a discontinuous relationship between an input variable and the output. An MLP, with connection matrices that multiply input values and sigmoid functions that further transform values, represents a continuous mapping in all input variables. An MLP therefore requires that all inputs correspond to numeric, continuously valued variables. The usual way of dealing with this problem is to replace the categorical values by numeric ones and treat them as if they were continuously valued. However, there is no meaningful correspondence between the continuous quantities generated this way and the original categorical values. Another approach is to encode the categorical portion of the input using 1-out-of-n encoding and include this code as input to the MLP. The approach in this paper is to segregate categorical variables from the continuous independent variables completely. The MLP is trained with multiple outputs: a separate output unit for each allowed combination of values of the categorical independent variables. During training, the categorical value or combination of categorical values determines which of the output units should have the target value on it, with the remaining outputs being ‘do not care’. Three data sets were used for comparison of methods. Results show that this approach is much more effective than the conventional approach of assigning continuous values to the categorical features. For the data set with several categorical variables, the method proposed here is also more effective than the 1-out-of-n input method.
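The training scheme described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a single sigmoid hidden layer, linear output units, a squared-error loss, and plain full-batch gradient descent, none of which the abstract specifies. All function and parameter names (`one_hot_mask`, `masked_mse`, `train`, `predict`, `cat_index`) are hypothetical. The key idea shown is that only the output unit selected by the categorical combination contributes to the error, while the other outputs are masked out as 'do not care'.

```python
import numpy as np

def one_hot_mask(cat_index, n_outputs):
    """1 for the output unit chosen by the categorical combination, 0 elsewhere."""
    mask = np.zeros((cat_index.size, n_outputs))
    mask[np.arange(cat_index.size), cat_index] = 1.0
    return mask

def forward(params, x):
    """Sigmoid hidden layer followed by linear outputs (an assumed architecture)."""
    W1, b1, W2, b2 = params
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))
    return h, h @ W2 + b2

def masked_mse(y_pred, target, mask):
    """Squared error on the selected output only; masked outputs are 'do not care'."""
    err = (y_pred - target[:, None]) * mask
    return 0.5 * np.sum(err ** 2) / len(target)

def train(x, cat_index, target, n_outputs, n_hidden=8, lr=0.1, epochs=300, seed=0):
    """Full-batch gradient descent on the masked loss (illustrative settings)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, n_outputs))
    b2 = np.zeros(n_outputs)
    mask = one_hot_mask(cat_index, n_outputs)
    n = len(target)
    for _ in range(epochs):
        h, y = forward((W1, b1, W2, b2), x)
        # The error signal is zero on every output that is 'do not care'.
        delta_out = (y - target[:, None]) * mask / n
        delta_h = (delta_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ delta_out)
        b2 -= lr * delta_out.sum(axis=0)
        W1 -= lr * (x.T @ delta_h)
        b1 -= lr * delta_h.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, x, cat_index):
    """Read the prediction from the output unit matching each row's category."""
    _, y = forward(params, x)
    return y[np.arange(len(cat_index)), cat_index]
```

Here `x` holds only the continuous inputs, and `cat_index` is an integer per sample identifying the allowed combination of categorical values. With several categorical variables, one possible way to build that index (again an assumption) is to enumerate the combinations that actually occur in the data and number them from zero.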

Full Text