Universal Lesion Detection Utilising Cascading R-CNNs and a Novel Video Pretraining Method.

Shahin Amiriparian,Daniel Rothenpieler,Maurice Gerczuk,Alexander Kathan,Björn W Schuller,Alexander Meiners

doi:10.1109/embc40787.2023.10340964

Abstract

According to the WHO, approximately one in six individuals worldwide will develop some form of cancer in their lifetime. Therefore, accurate and early detection of lesions is crucial for improving the probability of successful treatment, reducing the need for more invasive treatments, and leading to higher rates of survival. In this work, we propose a novel R-CNN approach with pretraining and data augmentation for universal lesion detection. In particular, we incorporate an asymmetric 3D context fusion (A3D) for feature extraction from 2D CT images with Hybrid Task Cascade. By doing so, we supply the network with further spatial context, refining the mask prediction over several stages and making it easier to distinguish hard foregrounds from cluttered backgrounds. Moreover, we introduce a new video pretraining method for medical imaging by using consecutive frames from the YouTube VOS video segmentation dataset which improves our model's sensitivity by 0.8 percentage points at a false positive rate of one false positive per image. Finally, we apply data augmentation techniques and analyse their impact on the overall performance of our models at various false positive rates. Using our introduced approach, it is possible to increase the A3D baseline's sensitivity by 1.04 percentage points in mFROC.

Full Text