Abstract

Study question
Can we decipher the underlying visual properties that drive image-based AI embryo classification models, in order to assist clinical decisions and biological discovery?

Summary answer
Our framework identified which annotated and non-explicitly-annotated phenotypes affect model predictions and ranked their importance. These discoveries aligned with known blastocyst quality criteria.

What is known already
Deep learning models have shown great promise for complex pattern recognition when applied to embryo images. The success of these models relies on their ability to perform non-linear optimization of feature extraction during model construction. However, this optimization entangles multiple classification-driving image properties, producing ‘black-box’ systems that lack interpretability and undermine user confidence and trust. Therefore, there is an urgent need for an interpretability method that can uncover the semantic image properties that contribute to the predictions of ‘black-box’ image-based AI embryo classification models, to assist in blastocyst selection.

Study design, size, duration
A total of 11,211 time-lapse videos were retrospectively collected from three IVF centers. A deep convolutional neural network was first trained to discriminate high-quality from low-quality blastocysts. We then developed DISCOVER, a general-purpose interpretability method designed to discover the underlying visual properties driving the classifier. DISCOVER encodes an image into an interpretable lower-dimensional representation that is correlated with the classifier and in which each dimension encapsulates a distinct phenotype.

Participants/materials, setting, methods
Encoding embryo images into low-dimensional representations enables both global and local interpretability. Globally, the embryo images are synthetically altered by amplifying subtle properties that affect the classification decision.
With our method, this can be done one property at a time, which separates confounding properties. By evaluating the altered images, embryologists can decipher the meaning of each property. Locally, each of the discovered properties can be ranked by its importance for a specific embryo instance.

Main results and the role of chance
Using DISCOVER, we interpreted the features driving the classification model. We quantitatively linked the top two classification features to blastocyst size (a proxy for the degree of expansion and development) and trophectoderm quality, using embryologists' evaluations and annotations. We then asked whether DISCOVER can identify non-explicitly annotated latent features that encode morphologic properties not defined by the ASEBIR/Gardner criteria. Expert embryologist interpretation identified the third top classification feature as the blastocoel. DISCOVER interpreted high-quality embryos as having denser and more granular blastocoelic regions, suggesting that this change in blastocoel appearance is one of the encoded classification-driving morphologic properties. This visualization indicates that parameters of the blastocoel other than its volume expansion are associated with embryo quality. We showed that the classifier can weight embryo properties differently on a per-embryo basis, giving clinical insight into which properties influence the classification of a specific instance. These results indicate that DISCOVER enables expert-in-the-loop interpretation of the classification model both globally, by discovering the main properties driving the classifier overall, and locally, by providing a per-instance explanation.

Limitations, reasons for caution
DISCOVER failed to identify the inner cell mass (ICM) as a classification-driving feature in its latent representation, even though the ICM was explicitly used to label the data for training the classification model.
It is possible that other properties collectively captured the discriminative information encoded in the ICM.

Wider implications of the findings
This in-depth analysis demonstrates the feasibility of providing interpretability for biomedical image-based classification models for clinical use in the IVF clinic.

Trial registration number
Not applicable
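
Illustrative sketch
The global and local interpretation steps described in the methods can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the DISCOVER implementation: the linear decoder, the linear classifier weights, the baseline vector, and the function names `traverse_dimension` and `rank_dimensions` are all hypothetical stand-ins introduced here. It shows (1) altering a single latent dimension while holding the others fixed, so that one property is amplified at a time, and (2) ranking latent dimensions for one instance by the magnitude of their contribution to a linear score.

```python
import numpy as np


def traverse_dimension(z, k, steps):
    """Return copies of latent vector z in which only dimension k is shifted."""
    z = np.asarray(z, dtype=float)
    variants = np.tile(z, (len(steps), 1))       # one row per step
    variants[:, k] += np.asarray(steps, dtype=float)
    return variants


def rank_dimensions(z, weights, baseline):
    """Rank latent dimensions for one instance by |weight * (value - baseline)|.

    Assumes a linear classifier head on the latent code (a simplification).
    """
    z = np.asarray(z, dtype=float)
    contrib = np.asarray(weights, dtype=float) * (z - np.asarray(baseline, dtype=float))
    order = np.argsort(-np.abs(contrib))         # most influential first
    return order, contrib


# Hypothetical toy setup: 4 latent dimensions, linear decoder to an 8x8 "image".
rng = np.random.default_rng(0)
decoder = rng.normal(size=(4, 64))               # stand-in for a trained decoder
weights = np.array([2.0, -1.0, 0.5, 0.0])        # stand-in classifier weights
baseline = np.zeros(4)                           # stand-in population mean
z = np.array([0.1, 0.8, -0.6, 3.0])              # latent code of one instance

# Global view: amplify dimension 0 only, then decode each variant to an image.
variants = traverse_dimension(z, k=0, steps=[-2.0, 0.0, 2.0])
images = (variants @ decoder).reshape(-1, 8, 8)

# Local view: which dimensions matter most for this particular instance?
order, contrib = rank_dimensions(z, weights, baseline)
print("dimension ranking:", order.tolist())
```

In this toy setup the ranking depends on both the classifier weight and how far the instance lies from the baseline, which is why a dimension with a large value but zero weight (dimension 3) ranks last.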