Colorectal cancer (CRC) is a lethal malignancy and a leading cause of cancer-related mortality worldwide. Chromosomal instability (CIN) is a key driver of genomic instability in CRC and is characterized by aneuploidy and somatic copy-number alterations. This study aimed to predict CIN in CRC using histological data from whole slide images (WSIs). CRC samples from TCGA were analyzed, with tumor regions segmented into tiles and nuclei for feature extraction using convolutional neural networks (CNNs) and morphological analysis. Binary classification models were developed to distinguish high from low aneuploidy scores (AS) based on slide-level features. The analysis included 313 patients with 315 WSIs, yielding over 350,000 tumor tiles and nearly 2.7 million tumor cell nuclei. The ResNet18-SSL model, pre-trained on histopathological images, demonstrated superior accuracy in tile-based AS prediction, while DenseNet121 excelled in nucleus-based prediction. Combining CNN-based and morphological features enhanced the classification accuracy of nucleus-based predictions. Additionally, significant correlations were observed between morphological features and copy-number signatures. Unsupervised clustering of nuclear features revealed distinct groups that were significantly correlated with CIN and TP53 mutations. This study underscores the potential of histological features from WSIs to predict CIN in CRC samples. Nuclear feature analysis, combined with deep-learning techniques, offers a robust method for CIN prediction, highlighting the importance of further research into the relationships between histological and molecular phenotypes.
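The step from tile-level features to slide-level binary labels can be illustrated with a minimal sketch. The abstract does not state how tile features were aggregated or where the high/low AS cut was placed, so the mean pooling and the median-based default threshold below are assumptions, and the function names and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def pool_tiles_to_slides(tile_features, tile_slide_ids):
    """Mean-pool per-tile feature vectors into one vector per slide.

    Mean pooling is an assumed aggregation, not the study's stated method.
    """
    grouped = defaultdict(list)
    for features, slide_id in zip(tile_features, tile_slide_ids):
        grouped[slide_id].append(features)
    return {
        slide_id: [sum(column) / len(column) for column in zip(*vectors)]
        for slide_id, vectors in grouped.items()
    }


def binarize_scores(scores_by_slide, threshold=None):
    """Label each slide 1 (high AS) or 0 (low AS).

    The default cut at the cohort median is an assumption for illustration.
    """
    if threshold is None:
        threshold = median(scores_by_slide.values())
    return {s: int(score > threshold) for s, score in scores_by_slide.items()}


# Toy data: three tiles from slide "A", two from slide "B" (hypothetical values).
tile_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 1.0], [2.0, 3.0]]
tile_slide_ids = ["A", "A", "A", "B", "B"]

slide_features = pool_tiles_to_slides(tile_features, tile_slide_ids)
labels = binarize_scores({"A": 12, "B": 3}, threshold=6)
```

The resulting slide-level vectors and labels could then be passed to any binary classifier; in practice the tile features would come from a CNN embedding rather than hand-typed lists.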