Abstract

Artificial Neural Networks (ANN) are extensively used to model 'omics' data. Different modeling methodologies and combinations of adjustable parameters influence model performance and complicate model optimization. We evaluated optimization of four ANN modeling parameters (learning rate annealing, stopping criteria, data split method, network architecture) using retention index (RI) data for 390 compounds. Models were assessed by independent validation (I-Val) using newly measured RI values for 1492 compounds. The best model demonstrated an I-Val standard error of 55 RI units and was built using a Ward's clustering data split and a minimally nonlinear network architecture. Use of validation statistics for stopping and final model selection resulted in better independent validation performance than the use of test set statistics.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.