Objective: This study develops predictive models using machine learning algorithms, namely artificial neural networks (ANN), k-nearest neighbors (kNN), support vector machines (SVM), and linear regression, to predict critical quality attributes (CQAs) of fast disintegrating tablets (FDTs), such as hardness, friability, and disintegration time.

Methods: A dataset of 864 batches of FDTs was generated by varying binder types and amounts, disintegrants, diluents, punch sizes, and compression forces. Preprocessing included normalizing numerical features based on industry standards, one-hot encoding categorical variables, and addressing outliers to ensure data consistency. The four machine learning models were trained and evaluated using R² values and mean squared error (MSE). Feature importance was analyzed using permutation importance, and statistical validation (p < 0.05) and confidence intervals were computed for model performance. The ‘differential_evolution’ function was used to optimize the formulation.

Results: Among the models, ANN demonstrated the highest predictive accuracy, achieving R² values up to 0.9550 with the lowest MSE across training and test datasets, outperforming kNN, SVM, and linear regression. The ANN’s ability to model complex, non-linear interactions between formulation variables was statistically significant, as validated through six checkpoint batches of acetylsalicylic acid FDTs. The feature importance analysis revealed compression force, binder type, and punch size as the most influential factors affecting hardness, while disintegrant type influenced friability. The ‘differential_evolution’ function effectively optimized the CQAs, resulting in FDTs with ideal characteristics.

Conclusion: The ANN model, integrated with differential evolution, provided a robust tool for optimizing FDT formulations by accurately predicting CQAs and reducing the need for extensive experimental trials. Compared to traditional optimization methods, ANN excels in capturing intricate multi-variable relationships, making it a valuable approach for scaling beyond acetylsalicylic acid to other formulations. This method enhances the consistency and efficiency of tablet formulation, supporting broader pharmaceutical applications.
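To make the coupling of an ANN surrogate with differential evolution concrete, the following is a minimal sketch and not the study's implementation. It assumes the function named in the abstract is SciPy's `scipy.optimize.differential_evolution` and uses scikit-learn's `MLPRegressor` as the ANN. The three formulation variables, their ranges, the response surface in `synthetic_cqas`, and the `TARGET_HARDNESS` threshold are all hypothetical, and the training data are synthetic, not the 864 batches described above.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical continuous formulation variables and ranges (not from the study):
# binder amount (% w/w), disintegrant amount (% w/w), compression force (kN).
BOUNDS = [(1.0, 5.0), (2.0, 10.0), (5.0, 20.0)]


def synthetic_cqas(X):
    """Made-up response surface standing in for measured batch data."""
    binder, disint, force = X[:, 0], X[:, 1], X[:, 2]
    hardness = 1.0 + 0.4 * binder + 0.25 * force                # arbitrary units
    dis_time = 30.0 + 3.0 * binder + 1.2 * force - 2.5 * disint  # seconds
    return np.column_stack([hardness, dis_time])


# Synthetic "batches": uniform samples inside the bounds plus small noise.
lo, hi = np.array(BOUNDS).T
X_train = rng.uniform(lo, hi, size=(400, 3))
y_train = synthetic_cqas(X_train) + rng.normal(0.0, 0.1, size=(400, 2))

# ANN surrogate: standardized inputs feeding a small two-output MLP.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16, 16), solver="lbfgs",
                 max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

TARGET_HARDNESS = 4.0  # assumed minimum acceptable hardness


def objective(x):
    """Predicted disintegration time plus a penalty for hardness below target."""
    hardness, dis_time = model.predict(x.reshape(1, -1))[0]
    shortfall = max(0.0, TARGET_HARDNESS - hardness)
    return dis_time + 100.0 * shortfall


# Search the bounded formulation space for the lowest penalized prediction.
result = differential_evolution(objective, BOUNDS, seed=0, maxiter=50, tol=1e-6)
best_hardness, best_dis_time = model.predict(result.x.reshape(1, -1))[0]
print("optimized formulation:", np.round(result.x, 2))
print("predicted hardness:", round(float(best_hardness), 2))
print("predicted disintegration time (s):", round(float(best_dis_time), 2))
```

The penalty term is one simple way to trade off two CQAs in a single objective; a desirability function or a constrained formulation would serve the same purpose. Categorical factors such as binder type would need an encoding step before they could enter the search, which this sketch leaves out.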