Abstract

Environmental pollution is a risk factor for chronic diseases (CD), which are today the leading cause of death worldwide in people of all ages, with 80% of these deaths occurring in low- and middle-income countries. Because industrial areas sustain high levels of air pollution, their inhabitants are considered highly vulnerable; this is the case in the Metropolitan Area of Tula, Hgo., Mexico. In response to the World Health Organization's call to improve the quality of life of vulnerable populations through innovative strategies, the present study aims to build prediction models for the most frequent CD in the area, supporting predictive diagnosis with machine learning algorithms, which are recognized for their high performance in health applications. Following the CRISP-DM methodology, the requirements, characteristics and behavior of the data were analyzed; exhaustive cleaning and min-max scaler normalization were performed; the models were trained and validated on an 80%/20% split of the records; and dropout and early stopping were applied to combat overfitting. A comparative analysis of the 9 models built identified the best-performing model for each CD: the Artificial Neural Network (ANN) for respiratory diseases and Random Forest (RF) for diabetes and for arterial hypertension. In terms of accuracy, precision, sensitivity, specificity and F1-score, the ANN obtained 99%, 99%, 100%, 99% and 99.49% respectively; the RF model for diabetes obtained 98%, 100%, 97%, 100% and 98.7%; and the RF model for arterial hypertension obtained 95%, 97%, 94%, 97% and 95.47%. These three models were integrated into a graphical interface. The proposal constitutes a high-precision technological strategy for the prevention and early diagnosis of CD in industrial areas, aimed at reducing mortality and improving the quality of life of the inhabitants.
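The preprocessing and evaluation steps named above (min-max normalization, an 80%/20% split, and the five reported metrics) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed shuffle seed and the binary 0/1 label encoding are assumptions made only for illustration, and the abstract does not state how the metrics were computed.

```python
import random


def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and return (train, validation) at 80%/20%.

    The seed is a hypothetical choice that only makes the sketch repeatable.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred):
    """Compute the five metrics from binary labels (assumed 1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

In practice the minimum and maximum used for scaling are usually taken from the training portion only and then reused on the validation portion, so that no information from the validation records leaks into training.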

