Abstract
We propose a virtual screening method based on imbalanced data mining in this paper, which combines virtual screening techniques with imbalanced data classification methods to improve the traditional virtual screening process. First, in the actual virtual screening process, we apply k-means and smote heuristic oversampling method to deal with imbalanced data. Meanwhile, to enhance the accuracy of the virtual screening process, a particle swarm optimization algorithm is introduced to optimize the parameters of the support vector machine classifier, and the concept of ensemble learning is brought in. The classification technique based on particle swarm optimization, support vector machine and adaptive boosting is used to screen the molecular docking conformation to improve the accuracy of the prediction. Finally, in the experimental construction and analysis section, the proposed method was validated using relevant data from the protein data bank database and PubChem database. The experimental results indicated that the proposed method can effectively improve the accuracy of virus screening and has practical guidance for new drug development. This research regards virtual screening as a problem of imbalanced data classification, which has obvious guiding significance and also provides a certain reference for the problems faced by virtual screening technology.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have