Abstract

Machine learning (ML) has great potential for early diagnosis of disease from medical scans, and has at times even been shown to outperform experts. However, ML algorithms need large amounts of annotated data (scans with outlined abnormalities) to perform well. The time-consuming annotation process limits the progress of ML in this field. To address the annotation problem, multiple instance learning (MIL) algorithms were proposed, which learn from scans that have been diagnosed but not annotated in detail. Unfortunately, these algorithms are not good enough at predicting where the abnormalities are located, which is important for diagnosis and prognosis of disease. This limits the application of these algorithms in research and in clinical practice. I propose to use the “wisdom of the crowds” (internet users without specific expertise) to improve the predictions of the algorithms. While the crowd does not have experience with medical imaging, recent studies and pilot data I collected show that they can still provide useful information about the images, for example by saying whether images are visually similar or not. Such information has not been leveraged before in medical imaging applications. I will validate these methods on three challenging detection tasks in chest computed tomography, histopathology images, and endoscopy video. Understanding how the crowd can contribute to applications that typically require expert knowledge will allow harnessing the potential of large unannotated sets of data, training more reliable algorithms, and ultimately paving the way towards using ML algorithms in clinical practice.

Highlights

  • Machine learning (ML) has seen tremendous successes in recent years, for example in classifying everyday objects such as cats in images.

  • Multiple instance learning (MIL) algorithms are optimized to predict weak annotations [10], but the classifier that is best at predicting weak annotations is often not the best at predicting strong annotations [6, 8, 11, 12]. In practice, this means that without strong annotations, MIL algorithms are poor at localizing abnormalities [13].

  • The visual information in these annotations can help the MIL algorithm to find better representations for the data via multi-task learning [19] with labels for related tasks, such as outlining airways, and with similarity-based learning.

Summary

Description of the proposed research

Overall aim and key objectives

Machine learning (ML) has seen tremendous successes in recent years, for example in classifying everyday objects such as cats in images. The visual information in these annotations can help the MIL algorithm to find better representations for the data via multi-task learning [19] with labels for related tasks, such as outlining airways, and with similarity-based learning [20, 21] with patch similarities. The goal of this project is to improve the prediction of strong annotations by MIL, which is important for medical imaging and for applications in other fields where annotations are scarce. I will investigate the applicability of these and similar approaches, while addressing the specifics of crowd-annotated medical images, for example by weighting the loss function such that the expert labels are given more emphasis. To keep the number of annotations within budget, I will investigate active selection of patches, for example based on their uncertainty or diversity according to the MIL algorithm.
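The two ideas mentioned last, giving expert labels more weight in the loss and selecting patches by uncertainty, can be illustrated with a minimal Python sketch. It is not the method of the proposal: the function names, the weights (`expert_weight=3.0`, `crowd_weight=1.0`), the choice of binary cross-entropy, and the use of distance from 0.5 as the uncertainty measure are all assumptions made here for illustration.

```python
import math


def weighted_bce(probs, labels, is_expert, expert_weight=3.0, crowd_weight=1.0):
    """Weighted binary cross-entropy over patches (hypothetical helper).

    probs     -- predicted probability of abnormality per patch
    labels    -- 0/1 label per patch
    is_expert -- True if the label came from an expert, False if from the crowd
    The weight values are illustrative assumptions, not taken from the proposal.
    """
    eps = 1e-12  # keeps log() away from zero
    total, weight_sum = 0.0, 0.0
    for p, y, expert in zip(probs, labels, is_expert):
        w = expert_weight if expert else crowd_weight
        p = min(max(p, eps), 1.0 - eps)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    # Normalise by the total weight so the loss scale does not depend
    # on how many expert labels are present.
    return total / weight_sum


def select_most_uncertain(probs, k):
    """Return indices of the k patches whose prediction is closest to 0.5.

    Distance from 0.5 is one simple uncertainty measure, assumed here;
    a diversity-based criterion would need a different ranking.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


if __name__ == "__main__":
    probs = [0.9, 0.55, 0.1, 0.48]
    print(select_most_uncertain(probs, 2))
    print(weighted_bce(probs, [1, 1, 0, 0], [True, False, False, True]))
```

With these weights, a misclassified expert-labelled patch contributes three times as much to the loss as a misclassified crowd-labelled patch, and the selection step would send the patches the model is least sure about to annotators first.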

Visit Workshop
Knowledge utilisation
Data management plan
Not applicable
Findings
Name of candidate: Veronika Cheplygina
