Abstract
Virtual screening based on molecular similarity and molecular fingerprints is crucial in drug design, target prediction, and ADMET prediction, where it helps identify potential hits and optimize lead compounds. However, comprehensive open-source molecular fingerprint databases and efficient search methods for virtual screening are still lacking. To address these issues, we introduce FaissMolLib, an open-source virtual screening tool that integrates 2.8 million compounds from the ChEMBL and ZINC databases. FaissMolLib uses the highly efficient Faiss search algorithm, which outperforms the Tanimoto algorithm in identifying similar molecules: its hits cluster more tightly in scatter plots and show lower mean, standard deviation, and variance in key molecular properties. With this search, FaissMolLib screens 2.8 million compounds in just 0.05 seconds, giving researchers an efficient, easily deployable way to run virtual screening on a laptop and to build their own compound databases. This advance has the potential to accelerate drug discovery and improve chemical data analysis. FaissMolLib is freely available at http://liuhaihan.gnway.cc:80. The code and dataset of FaissMolLib are freely available at https://github.com/Superhaihan/FiassMolLib.
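For readers unfamiliar with the Tanimoto baseline mentioned above, the following is a minimal sketch of a fingerprint similarity search, not the FaissMolLib implementation. It assumes each fingerprint is represented as a set of on-bit indices; the function names (`tanimoto`, `top_k_similar`) and the toy library entries are hypothetical and chosen only for illustration. The exhaustive scan shown here is the kind of step that an indexed search library such as Faiss is meant to speed up.

```python
from heapq import nlargest


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def top_k_similar(query: frozenset, library: dict, k: int = 3) -> list:
    """Exhaustively score every library entry and return the k best (name, score) pairs."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    return nlargest(k, scored, key=lambda pair: pair[1])


# Toy fingerprints: hypothetical on-bit indices, not real compounds.
library = {
    "mol_a": frozenset({1, 4, 7, 9}),
    "mol_b": frozenset({1, 4, 8}),
    "mol_c": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7})
hits = top_k_similar(query, library, k=2)
print(hits)
```

The scan above touches every compound for every query, so its cost grows linearly with library size; this is the motivation for replacing it with a precomputed index when the library holds millions of fingerprints.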