Abstract

Decision tree learning constitutes a suitable approach to classification due to its ability to partition the variable space into regions of class-uniform events, while providing a structure amenable to interpretation, in contrast to other methods such as neural networks. But an inherent limitation of decision tree learning is the progressive lessening of the statistical support of the final classifier as clusters of single-class events are split on every partition, a problem known as the fragmentation problem. We describe a software system called DTFE, for Decision Tree Fragmentation Evaluator, that measures the degree of fragmentation caused by a decision tree learner on every event cluster. Clusters are found through a decomposition of the data using a technique known as Spectral Clustering. Each cluster is analyzed in terms of the number and type of partitions induced by the decision tree. Our domain of application lies on the search for single top quark production, a challenging problem due to large and similar backgrounds, low energetic signals, and low number of jets. The output of the machine-learning software tool consists of a series of statistics describing the degree of data fragmentation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call