With the advent of computer-aided drug design (CADD), traditional physical testing of thousands of molecules has now been replaced by target-focused drug discovery, where potentially bioactive molecules are predicted by computer software before their physical synthesis. However, despite being a significant breakthrough, CADD still faces various limitations and challenges. The increasing availability of data on small molecules has created a need to streamline the sourcing of data from different databases and automate the processing and cleaning of data into a form that can be used by multiple CADD software applications. Several standalone software packages are available to aid the drug designer, each with its own specific application, requiring specialized knowledge and expertise for optimal use. These applications require their own input and output files, making it a challenge for nonexpert users or multidisciplinary discovery teams. Here, we have developed a new software platform called DataPype, which wraps around these different software packages. It provides a unified automated workflow to search for hit compounds using specialist software. Additionally, multiple virtual screening packages can be used in the one workflow, and if different ways of looking at potential hit compounds all predict the same set of molecules, we have higher confidence that we should make or purchase and test the molecules. Importantly, DataPype can run on computer servers, speeding up the virtual screening for new compounds. Combining access to multiple CADD tools within one interface will enhance the early stage of drug discovery, increase usability, and enable the use of parallel computing.
Read full abstract