Abstract

Propositionalization has proven to be an effective approach to multi-relational data mining problems. Such approaches usually follow a two-step principle: first transform the relational data into a single, flat table, then apply a propositional learning algorithm. During the transformation, the target table is expanded with many new features that summarize the information in the non-target tables. Depending on the feature construction strategy used, this yields a table of very high dimensionality with many irrelevant and/or redundant features, which can degrade predictive performance. In this paper, we propose a modification of the traditional two-step framework to overcome these problems. The proposed approach evaluates features during the construction phase and passes only a subset of highly predictive features to the propositional learner. We present an implementation of this approach that uses a genetic algorithm to search for an optimal feature subset. Our experiments on a number of benchmark datasets suggest that the modified framework can help propositionalization methods significantly improve their predictive performance.
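To make the idea concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It flattens a toy one-to-many relation into aggregate features and then runs a small genetic algorithm over feature bitmasks, so that only the selected columns would reach the propositional learner. Everything in it is an assumption made for illustration: the `customers` and `orders` tables, the five aggregates, the leave-one-out 1-nearest-neighbour fitness with a per-feature penalty, and all GA parameters. For brevity it also builds every candidate feature first and searches afterwards, which is simpler than evaluating features during construction.

```python
import random
from statistics import mean

# Hypothetical toy data: target table (id -> class label) and a
# one-to-many non-target table of (customer_id, amount) rows.
customers = {1: 1, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0}
orders = [(1, 120.0), (1, 80.0), (2, 15.0), (3, 200.0), (3, 150.0),
          (3, 90.0), (4, 10.0), (4, 20.0), (5, 300.0), (6, 5.0)]

# Assumed aggregate functions; each one becomes a candidate feature.
AGGREGATES = {"count": len, "sum": sum, "mean": mean, "max": max, "min": min}

def propositionalize(customers, orders):
    """Flatten the relation into one row of aggregate features per target."""
    names = list(AGGREGATES)
    rows, labels = [], []
    for cid, label in customers.items():
        amounts = [a for c, a in orders if c == cid]
        rows.append([float(AGGREGATES[n](amounts)) for n in names])
        labels.append(label)
    return names, rows, labels

def loo_accuracy(rows, labels, mask):
    """Leave-one-out 1-NN accuracy using only the columns kept by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    hits = 0
    for i, row in enumerate(rows):
        nearest = min((k for k in range(len(rows)) if k != i),
                      key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in cols))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def fitness(rows, labels, mask, penalty=0.01):
    """Reward accuracy, lightly penalise every extra feature."""
    return loo_accuracy(rows, labels, mask) - penalty * sum(mask)

def select_features(rows, labels, n_features, pop_size=20,
                    generations=30, p_mut=0.1, seed=0):
    """Simple GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    score = lambda m: fitness(rows, labels, m)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(pop, key=score)]  # elitism: keep the current best
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            children.append(tuple(b ^ (rng.random() < p_mut) for b in child))
        pop = children
    return max(pop, key=score)

names, rows, labels = propositionalize(customers, orders)
best = select_features(rows, labels, len(names))
selected = [n for n, keep in zip(names, best) if keep]
print("selected features:", selected)
```

In this sketch the fitness function plays the role of the feature evaluation step. A real setting would likely score candidate subsets with the downstream learner on held-out data, and with far more candidate features than the five used here.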

