Abstract
Propositionalization has proven to be a very effective solution for multi-relational data mining problems. Such approaches usually follow a two-step principle: transform the relational data into a single, flat table, then apply a propositional learning algorithm. During the transformation, the target table is expanded with many new features that summarize the information in the non-target tables. Depending on the feature construction strategy used, this can produce a table of very high dimensionality containing many irrelevant and/or redundant features, which can degrade predictive performance. In this paper, we propose a modification of the traditional two-step framework to overcome these problems. The proposed approach evaluates features during the construction phase and passes only a subset of highly predictive features to the propositional learner. We present an implementation of this approach that uses a genetic algorithm to search for an optimal feature subset. Our experiments on a number of benchmark datasets suggest that the modified framework can help propositionalization methods significantly improve their predictive performance.
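To make the idea concrete, the following is a minimal Python sketch of propositionalization followed by a genetic search over feature subsets. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the toy `customers` and `orders` tables, the aggregate functions, the leave-one-out 1-NN fitness with a size penalty, and all parameter values are hypothetical choices made for this example. For brevity it also builds all features first and then searches over them, whereas the approach described above evaluates features during construction.

```python
import random
from collections import defaultdict

# Toy relational data (hypothetical): a target table and one non-target table.
customers = {1: 1, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0}  # customer_id -> class label
orders = [(1, 90.0), (1, 120.0), (2, 200.0), (3, 5.0), (3, 7.0),
          (4, 12.0), (5, 150.0), (5, 80.0), (5, 60.0), (6, 9.0)]  # (customer_id, amount)

def propositionalize(customers, orders, rng):
    """Flatten both tables into one row of aggregate features per customer."""
    amounts = defaultdict(list)
    for cid, amount in orders:
        amounts[cid].append(amount)
    names = ["count", "sum", "mean", "max", "noise"]
    rows, labels = [], []
    for cid, label in sorted(customers.items()):
        a = amounts.get(cid, [])
        rows.append([len(a), sum(a), sum(a) / len(a) if a else 0.0,
                     max(a) if a else 0.0, rng.random()])  # last column is pure noise
        labels.append(label)
    return names, rows, labels

def fitness(mask, rows, labels, penalty=0.01):
    """Leave-one-out 1-NN accuracy on the selected columns, minus a size penalty."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        nearest = min((k for k in range(len(rows)) if k != i),
                      key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in cols))
        correct += labels[nearest] == labels[i]
    return correct / len(rows) - penalty * len(cols)

def ga_select(rows, labels, rng, pop_size=12, generations=20, p_mut=0.1):
    """Search bitmasks over the feature columns with a small elitist GA."""
    n = len(rows[0])
    # Seed the population with the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    score = lambda m: fitness(m, rows, labels)
    for _ in range(generations):
        children = [max(pop, key=score)[:]]  # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # binary tournament selection
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            children.append([1 - b if rng.random() < p_mut else b for b in child])
        pop = children
    return max(pop, key=score)

rng = random.Random(0)
names, rows, labels = propositionalize(customers, orders, rng)
best = ga_select(rows, labels, rng)
selected = [name for name, bit in zip(names, best) if bit]
print(selected, round(fitness(best, rows, labels), 3))
```

Because the best mask of each generation is carried over unchanged and the initial population contains the full feature set, the returned subset never scores worse than using every constructed feature under this fitness function. In practice the fitness would more plausibly be the cross-validated performance of the chosen propositional learner.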
More From: International Journal of Software Engineering and Knowledge Engineering