Imaging is commonly used as a characterization method in the pharmaceutical industry, including for quantifying subvisible particles in solid and liquid formulations. Extracting information beyond particle size, such as classifying morphological subpopulations, requires some form of image analysis. Proposed particle classification methods either rely on predetermined morphological features or use supervised training of convolutional neural networks to learn image representations in relation to ground truth labels. These methods face several challenges, including highly complex morphologies, unforeseen classes, and the time-consuming preparation of ground truth labels. In this work, we evaluate the application of a self-supervised contrastive learning method to the study of particle images from therapeutic solutions. Unlike supervised training, this approach does not require ground truth labels; representations are learned by comparing particle images and their augmentations. The method provides a fast and easily implemented tool for coarse screening of morphological attributes. Furthermore, our analysis shows that, for relatively balanced datasets, a small subset of an image dataset is sufficient to train a convolutional neural network encoder capable of extracting useful image representations. We also demonstrate that particle classes typically observed in protein solutions administered by pre-filled syringes emerge as separated clusters in the encoder's embedding space, which facilitates tasks such as training weakly supervised classifiers or detecting the presence of new subpopulations.
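To illustrate what "comparing particle images and their augmentations" can mean in practice, the following is a minimal sketch of a normalized temperature-scaled cross-entropy (NT-Xent) contrastive loss of the kind used in SimCLR-style training. The abstract does not specify which contrastive objective is used, so the choice of loss, the function name `nt_xent_loss`, and the `temperature` value are all assumptions made for illustration only.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """Illustrative NT-Xent loss (an assumed objective, not taken from the paper).

    z_a, z_b: arrays of shape (n, d). Row i of z_a and row i of z_b are
    encoder embeddings of two augmented views of the same particle image.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0).astype(float)
    # L2-normalize so the dot product is cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    # An embedding must not be compared with itself.
    np.fill_diagonal(sim, -np.inf)
    # For row i, the positive is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Average negative log-probability of picking the positive view.
    return float(-log_prob[np.arange(2 * n), pos].mean())
```

Minimizing a loss of this form pulls the two views of each image together in the embedding space and pushes views of different images apart, which requires no ground truth labels. In an actual training loop the same computation would be written in an automatic differentiation framework so that gradients can flow back into the encoder.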