Abstract

Alcohol dehydrogenases (ADHs) are popular catalysts for synthesizing chiral synthons, a vital step in active pharmaceutical intermediate (API) production. They are grouped into three superfamilies: medium-chain dehydrogenases/reductases (MDRs), short-chain dehydrogenases/reductases (SDRs), and iron-containing alcohol dehydrogenases. The first two are used extensively to produce various chiral synthons. Many studies screen multiple enzymes or engineer a specific enzyme to catalyze the conversion of a substrate of interest, and both approaches are resource-intensive and intricate. The current study uses machine learning algorithms to assess whether different ADHs can be matched with their ideal substrates. We explore the catalysis of 284 antibacterial ketone intermediates by MDRs and SDRs to demonstrate a unique pattern of activity. To facilitate machine learning, we curated a dataset comprising 33 features, encompassing 4 descriptors for each compound. Subsequently, an ensemble of machine learning techniques, viz. Partial Least Squares (PLS) regression, k-Nearest Neighbors (kNN) regression, and Support Vector Machine (SVM) regression, was applied. Moreover, incorporating Principal Component Analysis (PCA) improved precision and accuracy, refining the demarcation of diverse compound classes. This classification is therefore useful for identifying substrates amenable to different alcohol dehydrogenases, reducing the reliance on high-throughput screening or enzyme engineering to find the optimal enzyme for a specific substrate.
