We derive an upper bound on the generalization error of classifiers that can be represented as thresholded convex combinations of thresholded convex combinations of functions. Such classifiers include single-hidden-layer threshold networks and voted combinations of decision trees (such as those produced by boosting algorithms). The bound depends on the proportion of training examples with margin below some threshold and on the average complexity of the combined functions, where the average is taken with respect to the weights assigned to the functions in the convex combination. The complexity of each individual function in the combination depends on its closeness to threshold. By representing a decision tree as a thresholded convex combination of weighted leaf functions, we apply this result to bound the generalization error of combinations of decision trees. Previous bounds depend on the margin of the combined classifier and the average complexity of the decision trees in the combination, where the complexity of each tree depends on its total number of leaves. Our bound likewise depends on the margin of the combined classifier and the average complexity of the trees, but our complexity measure for an individual decision tree is based on the distribution of training examples over its leaves, and can be significantly smaller than the total number of leaves.