Abstract

Machine Learning (ML) is increasingly used by companies such as Google, Amazon, and Apple to identify market trends and predict customer behavior. As these ML tools continue to improve and mature, they will support better decision making across a number of industries. Unfortunately, many ML methods require large amounts of data before they can be used. In a number of realistic situations, however, only small datasets are available (i.e., hundreds to thousands of points). This work explores this problem by investigating the feasibility of using meta-models, specifically Kriging and Radial Basis Functions (RBF), to generate data for training a Bayesian Network (BN) when only a small amount of original data is available. This paper details the meta-model creation process and the results of using Particle Swarm Optimization (PSO) to tune parameters for four network structures trained on three relatively small datasets. Additionally, a series of experiments augments these small datasets by generating ten thousand, one hundred thousand, and one million synthetic data points using the Kriging and RBF meta-models, and by intelligently establishing prior probabilities using PSO. Results show that augmenting limited existing datasets with meta-model generated data can dramatically affect network accuracy. Overall, the exploratory results presented in this paper demonstrate the feasibility of using meta-model generated data to increase the accuracy of BNs trained on small sample sets. Further development of this method will help underserved areas that have access to only small datasets make use of the powerful predictive analytics of ML.
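
To make the data-generation step concrete, the following is a minimal sketch of how an RBF meta-model might be fit to a small dataset and then sampled to produce synthetic points. It is not the implementation used in this work: the Gaussian kernel, the shape parameter `epsilon`, the small `ridge` term, the uniform sampling of new inputs, and the function names `fit_rbf`, `predict_rbf`, and `generate_synthetic` are all assumptions made for illustration.

```python
import numpy as np

def fit_rbf(X, y, epsilon=1.0, ridge=1e-8):
    """Solve for Gaussian RBF weights so the surrogate (nearly) interpolates (X, y).

    epsilon (kernel shape) and ridge (numerical stabilizer) are assumed values.
    """
    # Pairwise squared distances between all training points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    Phi = np.exp(-epsilon * d2)
    # The small ridge term keeps the linear solve well conditioned.
    return np.linalg.solve(Phi + ridge * np.eye(len(X)), y)

def predict_rbf(X_train, weights, X_new, epsilon=1.0):
    """Evaluate the fitted RBF meta-model at new input locations."""
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-epsilon * d2) @ weights

def generate_synthetic(X, y, n_new, epsilon=1.0, seed=0):
    """Draw n_new inputs uniformly inside the bounding box of X (an assumed
    sampling scheme) and label them with the meta-model's predictions."""
    rng = np.random.default_rng(seed)
    weights = fit_rbf(X, y, epsilon)
    X_new = rng.uniform(X.min(axis=0), X.max(axis=0), size=(n_new, X.shape[1]))
    return X_new, predict_rbf(X, weights, X_new, epsilon)
```

Under these assumptions, a call such as `generate_synthetic(X, y, 10_000)` would return ten thousand synthetic input-output pairs that could be appended to the original samples before training. The pairwise distance computation is vectorized for brevity, so much larger requests would need to be generated in batches.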
