Objective. Seven types of MRI artifacts, spanning acquisition and preprocessing errors, were simulated to probe a machine learning brain tumor segmentation model for potential failure modes. Introduction. Machine learning algorithms are deployed in real-world medical settings far less often than the volume of medical research papers using machine learning would suggest. Part of the gap between model performance in research and in deployment comes from a lack of hard test cases in the data used to train a model. Methods. The artifacts were simulated on the standard MRI inputs of a pretrained brain tumor segmentation model and used to evaluate the model's performance under duress. The simulated artifacts were motion, susceptibility-induced signal loss, aliasing, field inhomogeneity, sequence mislabeling, sequence misalignment, and skull-stripping failures. Results. The artifact with the largest effect was the simplest one, sequence mislabeling, though motion, field inhomogeneity, and sequence misalignment also caused significant performance decreases. The model was most susceptible to artifacts affecting the FLAIR (fluid-attenuated inversion recovery) sequence. Conclusion. These simulated artifacts could be used to test other brain MRI models, and the same approach could be applied across medical imaging applications.
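To make the stress-testing idea concrete, the following is a minimal sketch of how three of the listed artifacts (sequence mislabeling, field inhomogeneity, and motion) might be simulated on a multi-sequence volume. It is not the authors' implementation. The channel order in `SEQUENCES`, the array layout `(channels, x, y, z)`, all function names, all parameter defaults, and the use of the Dice score as the evaluation metric are assumptions made for illustration; the abstract specifies none of them.

```python
import numpy as np

# Hypothetical channel order; the abstract does not state the model's input layout.
SEQUENCES = ("T1", "T1ce", "T2", "FLAIR")


def mislabel_sequences(volume, a, b):
    """Swap two sequence channels to mimic sequence mislabeling.

    volume: array of shape (channels, x, y, z).
    """
    i, j = SEQUENCES.index(a), SEQUENCES.index(b)
    out = volume.copy()
    out[[i, j]] = out[[j, i]]
    return out


def add_bias_field(volume, channel, strength=0.3):
    """Scale one channel by a linear gain ramp along the first spatial axis.

    A crude stand-in for field inhomogeneity; real bias fields are smooth
    but generally not linear.
    """
    out = volume.astype(float)  # astype returns a copy
    ramp = 1.0 + strength * np.linspace(-1.0, 1.0, out.shape[1])
    out[channel] *= ramp[:, None, None]
    return out


def add_motion(volume, channel, shift=2, fraction=0.5):
    """Replace a fraction of k-space lines with those of a shifted copy.

    Simplified motion model: the subject "moves" by `shift` voxels partway
    through acquisition, so part of k-space encodes the displaced image.
    """
    out = volume.astype(float)
    img = out[channel]
    k_still = np.fft.fftn(img)
    k_moved = np.fft.fftn(np.roll(img, shift, axis=0))
    n_lines = int(fraction * img.shape[1])
    k_still[:, :n_lines, :] = k_moved[:, :n_lines, :]
    out[channel] = np.abs(np.fft.ifftn(k_still))
    return out


def dice(pred, truth):
    """Dice overlap between two binary masks (assumed metric)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total
```

Under these assumptions, a test harness would run the pretrained model on both the clean volume and each corrupted variant, then compare `dice(prediction, ground_truth)` across the two to quantify how much each artifact degrades the segmentation.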