Making clinical decisions based on medical images is fundamentally an exercise in statistical decision-making. This is because in this case, the decision-maker must distinguish between image features that are clinically diagnostic (i.e., signal) from a large amount of non-diagnostic features. (i.e., noise). To perform this task, the decision-maker must have learned the underlying statistical distributions of the signal and noise to begin with. The same is true for machine learning algorithms that perform a given diagnostic task. In order to train and test human experts or expert machine systems in any diagnostic or analytical task, it is advisable to use large sets of images, so as to capture the underlying statistical distributions adequately. Large numbers of images are also useful in clinical and scientific research about the underlying diagnostic process, which remains poorly understood. Unfortunately, it is often difficult to obtain medical images of given specific descriptions in sufficiently large numbers. This represents a significant barrier to progress in the arenas of clinical care, education, and research. Here we describe a novel methodology that helps overcome this barrier. This method leverages the burgeoning technologies of deep learning (DL) and deep synthesis (DS) to synthesize medical images de novo. We provide a proof-of-principle of this approach using mammograms as an illustrative case. During the initial, prerequisite DL phase of the study, we trained a publicly available deep learning neural network (DNN), using open-sourced, radiologically vetted mammograms as labeled examples. During the subsequent DS phase of the study, the fully trained DNN was made to synthesize, de novo, images that capture the image statistics of a given input image. The resulting images indicated that our DNN was able to faithfully capture the image statistics of visually diverse sets of mammograms. We also briefly outline rigorous psychophysical testing methods to measure the extent to which synthesized mammography were sufficiently alike their original counterparts to human experts. These tests reveal that mammography experts fail to distinguish synthesized mammograms from their original counterparts at a statistically significant level, suggesting that the synthesized images were sufficiently realistic. Taken together, these results demonstrate that deep synthesis has the potential to be impactful in all fields in which medical images play a key role, most notably in radiology and pathology.
Read full abstract