Pneumonia is a deadly disease that affects millions of people worldwide and is caused by microorganisms and environmental factors. It leads to fluid build-up in the lungs, which makes breathing difficult, and it is a leading cause of death. Early detection and treatment are crucial for preventing severe outcomes. Chest X-rays are commonly used for diagnosis because they are accessible and low in cost; however, detecting pneumonia from X-rays is challenging. Automated methods are therefore needed, and machine learning can solve complex computer vision problems in medical imaging. This research develops a robust machine learning model for the early detection of pneumonia from chest X-rays, leveraging advanced image processing techniques and deep learning algorithms that accurately identify pneumonia patterns, enabling prompt diagnosis and treatment. Two models are developed: a CNN built from the ground up and a pretrained ResNet-50. The study uses the original RSNA Pneumonia Detection Challenge dataset, comprising 26,684 chest X-ray images collected from unique patients (56% male, 44% female). The data consist of pneumonia (31.6%) and non-pneumonia (68.8%) cases, providing an effective foundation for model training and evaluation. A reduced-size version of the dataset was used to examine the impact of data size, and both versions were tested with and without augmentation. The models were compared with existing works and with one another, and the impact of augmentation and dataset size on their performance was examined. The best overall accuracy was achieved by the CNN trained from scratch without augmentation: an accuracy of 0.79, a precision of 0.76, a recall of 0.73, and an F1 score of 0.74. However, the pretrained model, despite its lower overall accuracy, was found to be more generalizable.
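The reported F1 score can be checked against the reported precision and recall, because F1 is their harmonic mean. The following is a minimal sketch of that arithmetic, not code from the study; the function name `f1_from_precision_recall` is a hypothetical helper introduced here for illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    # Guard against division by zero when both inputs are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the CNN trained from scratch
# without augmentation (precision 0.76, recall 0.73).
f1 = f1_from_precision_recall(0.76, 0.73)
print(round(f1, 2))  # 0.74
```

The result agrees with the F1 score of 0.74 stated above, so the three reported figures are mutually consistent.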