Abstract
In Gaussian process models with an Automatic Relevance Determination (ARD) structured kernel function, the importance of a feature is inversely proportional to its length scale, so features can be selected by ranking them according to their importance. However, existing ARD-based feature selection methods lack a uniform score for quantifying how much of the output variation a feature subset explains. This study proposes two feature selection approaches based on two cumulative feature importance scores, the derivative decomposition ratio and the normalized sensitivity, to determine the optimal feature subset. Their performance is assessed by testing whether irrelevant features are correctly identified and whether the feature rankings are correct. The approaches are then applied to identify relevant dimensionless inputs for a hybrid model that estimates the liquid entrainment fraction in two-phase flow. The results show that the proposed methods identify the optimal feature subset for the hybrid model without significantly worsening its Root Mean Squared Error.
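The ranking idea can be illustrated with a minimal sketch. It assumes the ARD length scales have already been fitted, takes importance as the inverse length scale, and selects the smallest top-ranked subset whose cumulative normalized share reaches a threshold. The function name `rank_features_by_ard`, the 0.95 threshold, and the example length scales are all hypothetical, and the cumulative share used here is a simple stand-in, not the derivative decomposition ratio or normalized sensitivity proposed in the study.

```python
def rank_features_by_ard(length_scales, threshold=0.95):
    """Rank features by inverse ARD length scale and select a subset.

    Illustrative only: the cumulative score is a normalized inverse
    length scale, not one of the scores defined in the paper.
    """
    if any(l <= 0 for l in length_scales):
        raise ValueError("length scales must be positive")

    # A larger length scale means the output varies less along that feature,
    # so importance is taken as 1 / length scale.
    importance = [1.0 / l for l in length_scales]
    total = sum(importance)
    shares = [w / total for w in importance]

    # Feature indices ordered from most to least important.
    ranking = sorted(range(len(shares)), key=lambda i: shares[i], reverse=True)

    # Add features in rank order until the cumulative share reaches the threshold.
    selected, cumulative = [], 0.0
    for i in ranking:
        selected.append(i)
        cumulative += shares[i]
        if cumulative >= threshold:
            break
    return ranking, shares, selected


# Hypothetical length scales for four features; the last is effectively irrelevant.
length_scales = [0.5, 2.0, 1.0, 100.0]
ranking, shares, selected = rank_features_by_ard(length_scales, threshold=0.95)
print("ranking:", ranking)
print("selected subset:", selected)
```

With these assumed values, the feature with length scale 100 contributes a negligible share and is left out of the selected subset, which is the behavior an ARD-based selection is meant to exhibit for irrelevant inputs.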