Abstract

Biosimilar products present a growing opportunity to improve healthcare systems worldwide. The amount of variability accepted during comparative assessments of biosimilar products poses a significant challenge for both biosimilar developers and regulatory authorities. The aim of this study was to explore unsupervised machine learning tools as a mathematical aid for interpreting and visualizing such comparability under control and stress conditions, using data extracted from high-throughput analytical techniques. For this purpose, a head-to-head analysis of the physicochemical characteristics of three approved Trastuzumab (TTZ) biosimilars and the originator product (Herceptin®) was performed. The studied quality attributes included primary structure and identity, assessed by peptide mapping (PM) with reversed-phase chromatography and UV detection, and size and charge profiles, assessed by stability-indicating size exclusion and cation exchange chromatography, respectively. Stress conditions comprised pH and thermal stress. Principal component analysis (PCA) and two widely used cluster analysis tools, namely K-means and Density-based Spatial Clustering of Applications with Noise (DBSCAN), were explored for clustering and feature representation of the varied analytical datasets. The clustering patterns delineated by these algorithms changed depending on which chromatographic profiles were included. The applied data analysis tools were effective in revealing patterns of similarity and variability between (i) intact and stressed samples and (ii) originator and biosimilar samples.
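
As a rough illustration of how the two clustering tools named above can group samples differently, the following minimal sketch runs a plain K-means and a plain DBSCAN on made-up data. Everything in it is an assumption for illustration only: the two features (main-peak and aggregate percentages), the sample values, the parameters `k`, `eps`, and `min_samples`, and the standard-library implementations themselves. It does not reproduce the study's data, software, or settings, and the PCA step is omitted.

```python
import math

# Hypothetical (main peak %, aggregate %) values; NOT data from the study.
intact = [(98.5, 1.0), (98.3, 1.2), (98.6, 0.9), (98.4, 1.1)]
stressed = [(90.1, 8.0), (89.8, 8.4), (90.4, 7.9), (90.0, 8.2)]
outlier = [(94.0, 4.5)]  # a sample unlike either group
points = intact + stressed + outlier


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm; centroids start at evenly spaced input points."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels


def dbscan(points, eps, min_samples):
    """Return cluster ids 0, 1, ... per point; noise points get -1."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:  # j is a core point: expand
                queue.extend(reachable)
    return labels


km_labels = kmeans(points, k=2)
db_labels = dbscan(points, eps=1.0, min_samples=3)
print("K-means:", km_labels)
print("DBSCAN: ", db_labels)
```

In this toy setting, K-means must place every sample, including the atypical one, into one of the two clusters, whereas DBSCAN can label it as noise. That difference is one reason the two methods may delineate different patterns on the same profiles.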

