Pathological changes or technical artefacts? The problem of the heterogenous databases in COVID-19 CXR image analysis

Marek Socha, Grzegorz K Przybylski, Agnieszka Oronowicz-Jaśkowiak, Andrzej Cieszanowski, S Mazur, Justyna Kozub, Paweł Rajewski, Magdalena Śliwińska, Michał Marczyk, Katarzyna Sznajder, Wojciech Prażuch, Robert Flisiak, Aleksandra Suwalska, Joanna Tobiasz, Paweł Foszner, Damian Piotrowski, Jerzy Walecki, Katarzyna Rataj, Katarzyna Gruszczyńska, T Popiela, Gabriela Zapolska, Piotr Blewaska, Piotr Fiedor, Piotr Wasilewski, Joanna Polańska, Jerzy Jaroszewicz, Krzysztof Klaude, Jan Baron, Krzysztof Simon, Aleksandr I Tur, Edyta Szurowska, Bogumił Gołębiewski, Mateusz Nowak, Anna Kozanecka, Piotr Rabiko, Mateusz Rataj, Barbara Giżycka, Przemyslaw Chmielarz, Grażyna Drabik, Katarzyna Krutul-Walenciej, Małgorzata Pawłowska, Sebastian Hildebrandt, Robert Pleśniak

doi:10.1016/j.cmpb.2023.107684

Abstract

Background: When the COVID-19 pandemic began in 2020, scientists assisted medical specialists by developing diagnostic algorithms. One research area related to COVID-19 diagnosis was medical imaging and its potential to support molecular tests. Unfortunately, several systems reported high accuracy during development but did not fare well in clinical application. The reason was poor generalization, a long-standing issue in AI development. Researchers identified many causes of this issue and refer to them as confounders, meaning a set of artefacts and methodological errors associated with the method. We aim to contribute to this body of work by highlighting a previously undiscussed confounder related to image resolution.

Methods: 20 216 chest X-ray images (CXR) from centres worldwide were analyzed. The CXRs were bijectively projected into the 2D domain by performing Uniform Manifold Approximation and Projection (UMAP) embedding on the radiomic features (rUMAP) or on CNN-based neural features (nUMAP) taken from the penultimate layer of a pre-trained classification neural network. An additional 44 339 thorax CXRs were used for validation. A comprehensive analysis of the multimodality of the density distribution in the rUMAP/nUMAP domains, and of its relation to the original image properties, was used to identify the main confounders.

Results: nUMAP revealed a hidden bias of neural networks towards image resolution, which the regular up-sampling procedure cannot compensate for. The issue appears regardless of the network architecture and is not observed in a high-resolution dataset. The impact of resolution heterogeneity can be partially diminished by applying advanced deep-learning-based super-resolution networks.

Conclusions: rUMAP and nUMAP are effective tools for image homogeneity analysis and bias discovery, as demonstrated by applying them to COVID-19 image data. Moreover, nUMAP could be applied to any type of data for which a deep neural network can be constructed. Advanced image super-resolution solutions are needed to reduce the impact of resolution diversity on the classification network's decision.
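The claim that regular up-sampling cannot compensate for low native resolution can be illustrated with a toy example. The sketch below is not the authors' analysis: it assumes a synthetic noise image as a stand-in for fine detail, block averaging as the down-sampling step, nearest-neighbour replication as the "regular" up-sampling step, and a hypothetical `detail_energy` statistic (mean squared difference between adjacent pixels). It only shows that an up-sampled image keeps a measurable signature of its original resolution, which is the kind of cue a network could pick up.

```python
import random


def make_image(size, seed=0):
    """Synthetic 'native high-resolution' image; random noise stands in for fine detail."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def downsample(img, factor):
    """Block-average down-sampling by an integer factor (assumed acquisition model)."""
    n = len(img) // factor
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            block = [img[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(img, factor):
    """Nearest-neighbour up-sampling: each pixel is replicated factor x factor times."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def detail_energy(img):
    """Hypothetical fine-detail statistic: mean squared difference of adjacent pixels."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                total += (img[i][j + 1] - img[i][j]) ** 2
                count += 1
            if i + 1 < h:
                total += (img[i + 1][j] - img[i][j]) ** 2
                count += 1
    return total / count


native = make_image(64)
restored = upsample_nearest(downsample(native, 4), 4)  # same size, lower true resolution

print("native detail energy:  ", detail_energy(native))
print("restored detail energy:", detail_energy(restored))
```

Both images have the same pixel dimensions, yet the restored one carries far less fine detail. Smoother interpolation schemes would change the numbers but not add the lost detail back, which is consistent with the abstract's call for learned super-resolution methods.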


Journal: Computer Methods and Programs in Biomedicine	Publication Date: Jun 19, 2023
Citations: 4	License type: cc-by-nc-nd
