We evaluated several histopathology foundation models for image retrieval in a zero-shot setting: the embeddings they produced were used directly for retrieval, without any fine-tuning. Experiments were conducted on diagnostic slides from The Cancer Genome Atlas (TCGA), covering 23 organs and 117 cancer subtypes. We used Yottixel as the framework for whole-slide image (WSI) retrieval via patch-based embeddings. Retrieval performance was measured with macro-averaged F1 scores for top-1, top-3, and top-5 retrievals. The top-5 F1 scores were: Yottixel-DenseNet (27% ± 13%), Yottixel-UNI (42% ± 14%), Yottixel-Virchow (40% ± 13%), Yottixel-GigaPath (41% ± 13%), and GigaPath WSI (40% ± 14%). Every foundation-model configuration scored 13 to 15 percentage points above Yottixel-DenseNet, yet none exceeded 42%. These results demonstrate both the potential and the limitations of foundation models for histopathology image retrieval, underscoring the need for further advances in embedding and retrieval techniques.
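To make the evaluation metric concrete, the following is a minimal sketch of how a macro-averaged F1 score for top-k retrieval could be computed. It is not the code used in this study and not Yottixel's API. It assumes that each slide is represented by a single embedding vector, that neighbours are ranked by cosine similarity, that the predicted label is the majority vote among the k nearest neighbours, and that evaluation is leave-one-out. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_vote(query, index, k):
    """Majority label among the k entries of `index` most similar to `query`.

    `index` is a list of (embedding, label) pairs. Ties are resolved in
    favour of the label that appears first in the ranked neighbour list.
    Majority voting is an assumption of this sketch.
    """
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def leave_one_out_macro_f1(items, k):
    """Query each item against all others and score the voted labels."""
    y_true, y_pred = [], []
    for i, (embedding, label) in enumerate(items):
        rest = items[:i] + items[i + 1:]
        y_true.append(label)
        y_pred.append(top_k_vote(embedding, rest, k))
    return macro_f1(y_true, y_pred)


# Hypothetical toy data: 2-D "embeddings" for two made-up subtypes.
toy_items = [
    ([1.0, 0.0], "A"), ([0.9, 0.1], "A"), ([0.8, 0.2], "A"),
    ([0.0, 1.0], "B"), ([0.1, 0.9], "B"), ([0.2, 0.8], "B"),
]

if __name__ == "__main__":
    for k in (1, 3):
        print(k, leave_one_out_macro_f1(toy_items, k))
```

Because the macro average weights every class equally, rare subtypes influence the score as much as common ones, which matters for a collection with many subtypes of uneven size.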