Abstract

Weights-of-evidence is the special case of logistic regression that arises when the predictor variables are indicator variables that are conditionally independent given the target variable. In this case the contrasts of the weights of evidence are identical to the parameters of the corresponding logistic regression model. If the modeling assumption of conditional independence is not satisfied, applying weights-of-evidence corrupts both the predicted conditional probabilities and their rank transforms. In contrast, a logistic regression model that includes the corresponding interaction terms compensates exactly for the lack of conditional independence and is optimal. Thus, logistic regression including interaction terms is the canonical generalization of the naive Bayesian approach, which assumes conditional independence of all predictor variables given the target variable. Considering 2-tuples of the conditional probability of an event and its complement, and replacing the logit transform of logistic regression with the isometric log-ratio transform of compositional statistics, leads to similar compositional regression models, which in turn yield very similar numerical results. Artificial neural nets generalize logistic regression by nesting regression-like models. Thus they are generally capable of modeling more involved relationships between the predictor variables and the target variable. They are controlled by many parameters, their ultimate characteristic being the topology of the net. Artificial neural nets do not generally provide a measure of confidence in their parameters; in particular, they do not feature the concept of statistical significance. Applying the methods mentioned above to a simple example with fabricated data demonstrates the impact on the predictions and their rank transforms when the assumption of conditional independence of the predictor variables given the target variable is not satisfied and is not accounted for by interaction terms.
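The link between contrasts and logistic regression parameters can be checked numerically. The following is a minimal sketch, not the paper's own example: the probabilities are made up for illustration, all function names are hypothetical, and the weights are assumed to follow the usual definitions W+ = ln(P(B=1|T=1)/P(B=1|T=0)) and W- = ln(P(B=0|T=1)/P(B=0|T=0)), with contrast C = W+ - W-. For two indicator predictors, the saturated logistic model (intercept, two main effects, one interaction) reproduces the exact posterior logits, so its coefficients can be read off directly without fitting.

```python
import math
from itertools import product


def logit(p):
    return math.log(p / (1.0 - p))


def independent_joint(p1, p2):
    """Joint distribution of two independent indicators with P(B1=1)=p1, P(B2=1)=p2."""
    return {
        (b1, b2): (p1 if b1 else 1.0 - p1) * (p2 if b2 else 1.0 - p2)
        for b1, b2 in product((0, 1), repeat=2)
    }


def posterior_logits(prior, cond):
    """cond[t][(b1, b2)] = P(B1=b1, B2=b2 | T=t); returns logit P(T=1 | b1, b2) via Bayes' rule."""
    out = {}
    for cell in product((0, 1), repeat=2):
        num = prior * cond[1][cell]
        den = num + (1.0 - prior) * cond[0][cell]
        out[cell] = logit(num / den)
    return out


def contrast(p_given_t1, p_given_t0):
    """Contrast C = W+ - W- of a single indicator predictor."""
    w_plus = math.log(p_given_t1 / p_given_t0)
    w_minus = math.log((1.0 - p_given_t1) / (1.0 - p_given_t0))
    return w_plus - w_minus


def saturated_coefficients(lg):
    """Coefficients of logit = b0 + b1*x1 + b2*x2 + b12*x1*x2 matching all four cells."""
    b0 = lg[(0, 0)]
    b1 = lg[(1, 0)] - b0
    b2 = lg[(0, 1)] - b0
    b12 = lg[(1, 1)] - lg[(1, 0)] - lg[(0, 1)] + b0
    return b0, b1, b2, b12


prior = 0.2  # assumed P(T=1), illustrative only

# Case 1: predictors conditionally independent given T.
cond_ci = {1: independent_joint(0.7, 0.6), 0: independent_joint(0.3, 0.2)}
b0, b1, b2, b12 = saturated_coefficients(posterior_logits(prior, cond_ci))
c1 = contrast(0.7, 0.3)
c2 = contrast(0.6, 0.2)
print("main effects:", b1, b2, "contrasts:", c1, c2, "interaction:", b12)

# Case 2: conditional independence violated within T=1.
cond_dep = {
    1: {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.5},
    0: independent_joint(0.3, 0.2),
}
d0, d1, d2, d12 = saturated_coefficients(posterior_logits(prior, cond_dep))
print("interaction under dependence:", d12)
```

In the first case the main effects coincide with the contrasts and the interaction coefficient vanishes up to rounding. In the second case the interaction coefficient is nonzero, which is the term a weights-of-evidence model has no way to represent.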
