Improving Naive Bayes for Regression with Optimized Artificial Surrogate Data

Michael Mayo,Eibe Frank

doi:10.1080/08839514.2020.1726615

Open Access

https://doi.org/10.1080/08839514.2020.1726615

Journal: Applied Artificial Intelligence	Publication Date: Feb 12, 2020
Citations: 10

Affiliation: University of Waikato

#Artificial Data #Data For Machine Learning Algorithms

Abstract

Can we evolve better training data for machine learning algorithms? To investigate this question, we use population-based optimization algorithms to generate artificial surrogate training data for naive Bayes for regression. We demonstrate that naive Bayes for regression models generalize better when trained on the artificial data than when trained on the real data. These results are important for two reasons. Firstly, naive Bayes models are simple and interpretable but frequently underperform more complex “black box” models, so new methods of enhancing their accuracy are called for. Secondly, using the real training data indirectly, to construct the artificial training data, rather than directly for model training is a novel twist on the usual machine learning paradigm.
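The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: `fit_nb` and `predict_nb` are a simplified stand-in for naive Bayes for regression (the target is binned and each feature is modelled as a Gaussian per bin), the optimizer is a basic (mu + lambda) evolution strategy chosen as one example of a population-based method, and all function names, parameters, and default values are hypothetical. Each candidate solution is a small artificial data set; its fitness is the error, on the real training data, of a model trained only on the artificial rows.

```python
import numpy as np

def fit_nb(X, y, n_bins=3):
    """Simplified stand-in model (assumption): bin the target by quantiles,
    then fit one Gaussian per feature within each bin."""
    edges = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(y, edges)  # bin index in 0..n_bins-1
    params = []
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue  # skip empty bins
        params.append((
            mask.mean(),                 # prior probability of the bin
            X[mask].mean(axis=0),        # per-feature means
            X[mask].std(axis=0) + 1e-3,  # per-feature std, floored
            y[mask].mean(),              # representative target value
        ))
    return params

def predict_nb(params, X):
    """Predict the posterior-weighted average of the bin target values."""
    log_post = []
    for prior, mu, sd, _ in params:
        log_lik = -0.5 * ((X - mu) / sd) ** 2 - np.log(sd)
        log_post.append(np.log(prior) + log_lik.sum(axis=1))
    log_post = np.array(log_post)         # shape: (bins, samples)
    log_post -= log_post.max(axis=0)      # stabilise before exponentiating
    weights = np.exp(log_post)
    weights /= weights.sum(axis=0)
    centers = np.array([p[3] for p in params])
    return centers @ weights

def fitness(flat, X_real, y_real, n_art):
    """RMSE on the real training data of a model trained on artificial rows."""
    d = X_real.shape[1]
    art = flat.reshape(n_art, d + 1)
    model = fit_nb(art[:, :d], art[:, d])
    err = predict_nb(model, X_real) - y_real
    return float(np.sqrt(np.mean(err ** 2)))

def evolve(X_real, y_real, n_art=10, pop_size=20, generations=50,
           sigma=0.1, seed=0):
    """(mu + lambda) evolution strategy over artificial data sets."""
    rng = np.random.default_rng(seed)
    real = np.column_stack([X_real, y_real])
    # Initialise each candidate from a random subset of real rows.
    pop = np.stack([
        real[rng.choice(len(real), n_art, replace=False)].ravel()
        for _ in range(pop_size)
    ])
    scores = np.array([fitness(p, X_real, y_real, n_art) for p in pop])
    initial_best = float(scores.min())
    scale = real.std(axis=0)  # per-column mutation scale
    for _ in range(generations):
        noise = rng.normal(0.0, sigma, (pop_size, n_art, real.shape[1])) * scale
        children = pop + noise.reshape(pop_size, -1)
        child_scores = np.array(
            [fitness(c, X_real, y_real, n_art) for c in children])
        # Keep the best pop_size candidates among parents and children.
        merged = np.vstack([pop, children])
        merged_scores = np.concatenate([scores, child_scores])
        keep = np.argsort(merged_scores)[:pop_size]
        pop, scores = merged[keep], merged_scores[keep]
    return pop[0].reshape(n_art, -1), initial_best, float(scores[0])
```

Calling `evolve(X, y)` returns the best artificial data set (features plus a final target column) together with the best fitness before and after optimization. Because the fitness is measured on the real training data, any claim about generalization would still need to be checked on held-out data, which this sketch does not do.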
