Abstract

Pre-training lays the foundation for recent successes in radiograph analysis supported by deep learning. It learns transferable image representations through large-scale fully supervised or self-supervised learning on a source domain. However, supervised pre-training requires a complex and labour-intensive two-stage human-assisted annotation process, while self-supervised learning cannot compete with the fully supervised paradigm. To tackle these issues, we propose a cross-supervised methodology called reviewing free-text reports for supervision (REFERS), which acquires free supervision signals from the original radiology reports that accompany the radiographs. The proposed approach employs a vision transformer and is designed to learn joint representations from multiple views within every patient study. REFERS outperforms its transfer-learning and self-supervised-learning counterparts on four well-known X-ray datasets under extremely limited supervision. Moreover, REFERS even surpasses methods pre-trained on a source domain of radiographs with human-assisted structured labels; it therefore has the potential to replace canonical pre-training methodologies.

To train machine learning models for medical imaging, large amounts of training data are needed. Zhou and colleagues instead propose a weak-supervision method that uses the information in radiology reports to learn visual features without the need for expert labelling.
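The abstract only sketches the method at a high level. Below is a minimal, illustrative PyTorch sketch of the cross-supervision idea, assuming a view encoder (e.g. a vision transformer) that maps each radiograph to a fixed-dimensional feature and a text encoder that embeds the paired report; the class names, the attention-based fusion of views, and the contrastive alignment loss are hypothetical placeholders, not the authors' exact REFERS objectives.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossSupervisedStudyEncoder(nn.Module):
    """Fuses all radiograph views of one patient study and aligns the fused
    study representation with an embedding of the free-text report.
    Illustrative sketch only; not the authors' exact REFERS implementation."""

    def __init__(self, view_encoder: nn.Module, text_encoder: nn.Module, dim: int = 768):
        super().__init__()
        # view_encoder: any backbone (e.g. a vision transformer) returning one
        # `dim`-dimensional feature per image; text_encoder: any report embedder
        # returning one `dim`-dimensional feature per report. Both are assumptions.
        self.view_encoder = view_encoder
        self.text_encoder = text_encoder
        self.fusion = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.img_proj = nn.Linear(dim, dim)
        self.txt_proj = nn.Linear(dim, dim)

    def forward(self, views: torch.Tensor, report_tokens):
        # views: (batch, n_views, C, H, W) -- all radiographs of each study
        b, v = views.shape[:2]
        feats = self.view_encoder(views.flatten(0, 1)).view(b, v, -1)  # (b, v, dim)
        fused, _ = self.fusion(feats, feats, feats)                    # cross-view attention
        study = self.img_proj(fused.mean(dim=1))                       # (b, dim) study embedding
        report = self.txt_proj(self.text_encoder(report_tokens))       # (b, dim) report embedding
        return F.normalize(study, dim=-1), F.normalize(report, dim=-1)


def report_alignment_loss(study_z: torch.Tensor, report_z: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss pulling each study embedding towards its own report."""
    logits = study_z @ report_z.t() / temperature
    targets = torch.arange(study_z.size(0), device=study_z.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```

Trained over many report-radiograph pairs, such an objective would supply supervision without structured labels; the precise losses used by REFERS are described in the full paper.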
