Abstract

This chapter walks the reader through a step-by-step guide to building a Machine Learning (ML) model. These steps include, but are not limited to, data gathering and integration, data cleaning (data visualization, outlier detection, and data imputation), feature ranking and selection, data normalization or standardization, cross-validation (including the holdout method, k-fold cross-validation, stratified k-fold cross-validation, and leave-P-out cross-validation), and blind set validation. The bias–variance trade-off is also discussed, with a visual illustration, as a basis for building a successful, generalizable ML model. Afterward, the main ML types, namely supervised, unsupervised, and reinforcement learning, are discussed. General information about various types of data centers, as well as cloud versus edge computing, is also included in this chapter. Next, dimensionality reduction algorithms such as principal component analysis (PCA) and nonnegative matrix factorization (NMF) are illustrated, along with step-by-step mathematics and scikit-learn implementations in Python. Dimensionality reduction is applied to a completions data set, reducing the data from four features to two components using both PCA and NMF. The clearly illustrated code can be easily followed to apply the same techniques and algorithms to other data sets.
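As a rough illustration of the dimensionality reduction step described above, the following minimal sketch shows how scikit-learn's PCA and NMF can reduce four features to two components. It uses a randomly generated, hypothetical 4-feature data set rather than the chapter's completions data set, and the preprocessing choices (standardization for PCA, min–max scaling for NMF) are assumptions for demonstration only.

import numpy as np
from sklearn.decomposition import PCA, NMF
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Hypothetical stand-in for a 4-feature data set (100 samples)
rng = np.random.default_rng(42)
X = rng.random((100, 4))

# PCA: standardize first so each feature contributes on a comparable scale
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)
print("PCA explained variance ratio:", pca.explained_variance_ratio_)

# NMF requires nonnegative inputs, so scale to [0, 1] instead of standardizing
X_scaled = MinMaxScaler().fit_transform(X)
nmf = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
X_nmf = nmf.fit_transform(X_scaled)
print("NMF reconstruction error:", nmf.reconstruction_err_)

print("PCA output shape:", X_pca.shape)  # (100, 2)
print("NMF output shape:", X_nmf.shape)  # (100, 2)

Both transforms return a two-column array per sample, which is the four-to-two reduction the chapter performs on its completions data set.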
