Abstract

In drug research and development, compound activity prediction models are commonly built to screen for potentially active compounds, which saves time and cost. To become a drug candidate, a compound needs not only good biological activity but also good pharmacokinetic properties and safety in the human body, collectively known as ADMET (absorption, distribution, metabolism, excretion, and toxicity). This paper applies data mining techniques. First, a random forest is used to identify the main modeling variables, and their independence is verified by high-correlation filtering; the 20 main variables selected include MDEC-23 and maxHsOH. Second, a five-layer BP (backpropagation) neural network is used to build a compound bioactivity prediction model that predicts a compound's IC50 value and the corresponding pIC50 value. Third, an improved BP neural network is used to build classification prediction models for the compound properties Caco-2, CYP3A4, hERG, HOB, and MN. In validation, the CYP3A4 model reaches an accuracy of 94.3%, and the accuracy of all five models is above or close to 90%, which indicates that the predictions of the improved BP neural network are of practical use. Finally, a genetic algorithm is applied to the main variables to optimize their value ranges with respect to the compound's biological activity in inhibiting ERα, which has practical significance.
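Two of the steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that IC50 is reported in nanomolar, so that pIC50 = 9 - log10(IC50 in nM), and that variable importance scores (for example from a random forest) are already available. The function names, the 0.9 correlation threshold, and the greedy filtering strategy are all assumptions made for illustration.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert IC50 (assumed to be in nM) to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_correlated(columns, importance, threshold=0.9, keep=20):
    """Greedy high-correlation filter (hypothetical helper).

    columns:    dict mapping variable name -> list of values
    importance: dict mapping variable name -> importance score
    Variables are visited from most to least important; a variable is kept
    only if its absolute correlation with every already-kept variable is
    below `threshold`. At most `keep` variables are returned.
    """
    ranked = sorted(columns, key=lambda name: importance[name], reverse=True)
    selected = []
    for name in ranked:
        if len(selected) == keep:
            break
        if all(abs(pearson(columns[name], columns[s])) < threshold
               for s in selected):
            selected.append(name)
    return selected
```

Visiting variables in order of importance means that, of any highly correlated pair, the more important variable survives. The threshold and the number of variables kept would need tuning on the real descriptor data.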
