Abstract

We describe MRDTL-2, an efficient implementation of the multi-relational decision tree learning (MRDTL) algorithm [23], which in turn is based on a proposal by Knobbe et al. [19]. We present simple techniques for speeding up the calculation of sufficient statistics for decision trees and related hypothesis classes from multi-relational data. Because missing values are common in real-world data mining applications, our implementation also includes simple techniques for handling them. We report results of experiments on several real-world data sets from the KDD Cup 2001 data mining competition and the PKDD 2001 Discovery Challenge. These results indicate that MRDTL is competitive with state-of-the-art algorithms for learning classifiers from relational databases.
