Lipinski’s Rule of Five and the Ghose filter are empirical guidelines for evaluating the drug-likeness of a compound. They suggest that orally active drugs typically fall within specific ranges of molecular descriptors such as the number of hydrogen bond donors and acceptors, molecular weight, and lipophilicity. We revisit these practices and offer a more analytical perspective using the Dominance-based Rough Set Approach (DRSA). By analyzing representative samples of drug and non-drug molecules and focusing on the same molecular descriptors, we derived decision rules that distinguish between these two classes systematically and reproducibly. In this way, we reduced human bias and enabled efficient knowledge extraction from available data. The performance of the DRSA model was rigorously validated against the traditional rules and available machine learning (ML) approaches, showing a significant improvement over the empirical rules while achieving predictive accuracy comparable to that of more complex ML methods. Our rules remain simple and interpretable while offering high sensitivity and specificity.
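
For readers unfamiliar with the empirical baseline, the following minimal sketch shows how Lipinski's Rule of Five is commonly applied as a set of fixed thresholds on precomputed descriptors. It is illustrative only and is not the DRSA model described above. The `Descriptors` class, the function names, and the `max_violations` default (one tolerated violation, a common convention) are assumptions introduced for this example, and the descriptor values in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient (lipophilicity)
    h_bond_donors: int      # number of hydrogen bond donors
    h_bond_acceptors: int   # number of hydrogen bond acceptors


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the Rule of Five criteria that the descriptors violate."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("hydrogen bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("hydrogen bond acceptors > 10")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound if it breaks at most `max_violations` criteria (assumed default: 1)."""
    return len(rule_of_five_violations(d)) <= max_violations


# Hypothetical descriptor values, chosen only to exercise the thresholds.
small = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=720.0, log_p=6.3, h_bond_donors=7, h_bond_acceptors=12)

print(passes_rule_of_five(small), rule_of_five_violations(small))
print(passes_rule_of_five(large), rule_of_five_violations(large))
```

The thresholds here are fixed in advance by expert convention, which is the kind of hand-set cutoff the abstract contrasts with decision rules derived from data.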