This study developed a machine learning classification model, based on the Random Forest algorithm, to predict cataract risk from 11 variables: age, gender, family history, lens opacity, visual acuity reduction, light sensitivity, color changes, double vision, intraocular pressure, slit-lamp results, and visual acuity. Feature importance analysis showed that lens opacity and visual acuity contributed most to the prediction, followed by intraocular pressure and visual acuity reduction. The model was trained in Google Colab and served through an interactive Streamlit interface that returns real-time predictions with visualized results. After hyperparameter optimization with Grid Search, the model achieved an accuracy of 92.0%, precision of 95.0%, sensitivity of 90.0%, F1 score of 92.4%, and specificity of 98.0%. The system is expected to serve as a supporting tool for medical professionals in the early diagnosis of cataracts.
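The reported metrics follow the standard definitions for a binary classifier, and the F1 score can be checked against the stated precision and sensitivity. The following is a minimal sketch of those definitions only, not the study's evaluation code; the function names `binary_metrics` and `f1_from_precision_recall` are hypothetical and introduced here for illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive (cataract) cases predicted correctly/incorrectly.
    tn/fp: negative cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1_from_precision_recall(precision, sensitivity),
    }


# Consistency check using the precision and sensitivity stated in the abstract.
reported_f1 = f1_from_precision_recall(0.95, 0.90)
print(round(reported_f1, 3))  # 0.924
```

With precision 0.95 and sensitivity 0.90, the harmonic mean is 1.71 / 1.85, or about 0.924, which agrees with the reported F1 score of 92.4%.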