Abstract

Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease worldwide and a major cause of renal failure. Machine learning has rarely been applied to IgAN prediction in children. We retrospectively analyzed electronic medical records from the Nanjing Eastern War Zone Hospital and trained eXtreme Gradient Boosting (XGBoost), random forest (RF), CatBoost, support vector machine (SVM), k-nearest neighbor (KNN), and extreme learning machine (ELM) models to predict the probability that a patient would reach end-stage renal disease (ESRD) within five years. The chi-square test was used to select the 16 most relevant features as model inputs, and we designed a decision-making system (DMS) for IgAN prediction in children based on XGBoost and the Django framework. Model performance was evaluated with the receiver operating characteristic (ROC) curve, and XGBoost performed best. The area under the ROC curve (AUC), accuracy, precision, recall, and F1-score of XGBoost were 85.11%, 78.60%, 75.96%, 76.70%, and 76.33%, respectively. The XGBoost model can provide physicians and pediatric patients with predictions regarding IgAN, and a DMS built on it can help physicians treat IgAN in children effectively and prevent deterioration.
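To make the reported metrics concrete, the following minimal sketch shows how accuracy, precision, recall, and F1-score are conventionally derived from binary predictions. The function names and the toy labels are illustrative assumptions, not the authors' code or data. The last line checks that the reported F1-score agrees with the reported precision and recall.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def classification_metrics(y_true, y_pred):
    """Compute standard binary metrics; label 1 is treated as 'reaches ESRD'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(accuracy, precision, recall, f1_from(precision, recall))


# Hypothetical toy labels, not taken from the study's dataset.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)

# Consistency check on the abstract's figures (precision 75.96%, recall 76.70%).
reported_f1 = round(f1_from(0.7596, 0.7670), 4)
```

With the abstract's precision and recall, `reported_f1` rounds to 0.7633, which matches the stated F1-score of 76.33%.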

Highlights

  • Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease worldwide and a major cause of end-stage renal disease (ESRD)

  • Wyatt et al. [2] found that the five-year survival rate of children with IgAN was 94–98% and the 20-year survival rate was 70–89%

  • We used the chi-square test to evaluate the relevance of each predictor to ESRD and to identify the significant predictors to use as inputs for the data mining methods
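The chi-square screening described in the last highlight can be sketched as follows. This is a minimal illustration of Pearson's chi-square statistic for categorical predictors against a binary outcome, assuming each feature has already been discretized. The feature names, toy values, and the `top_k_features` helper are hypothetical and are not taken from the study, which selected 16 features from its own dataset.

```python
from collections import Counter


def chi_square_statistic(feature_values, labels):
    """Pearson chi-square statistic for a categorical feature vs. a categorical label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    row_totals = Counter(feature_values)
    col_totals = Counter(labels)
    stat = 0.0
    for f_value, row_total in row_totals.items():
        for l_value, col_total in col_totals.items():
            # Expected count under independence of feature and label.
            expected = row_total * col_total / n
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


def top_k_features(table, labels, k):
    """Rank features by chi-square statistic (largest first) and keep the top k."""
    scores = {name: chi_square_statistic(values, labels) for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 1 = reached ESRD within five years, 0 = did not.
esrd = [1, 1, 1, 1, 0, 0, 0, 0]
toy_features = {
    "hypertension": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly associated with the label
    "hematuria": [1, 1, 1, 0, 1, 0, 0, 0],     # partially associated
    "sex": [1, 0, 1, 0, 1, 0, 1, 0],           # independent of the label
}
selected = top_k_features(toy_features, esrd, k=2)
```

On this toy table the statistics are 8.0, 2.0, and 0.0 respectively, so `selected` keeps the two associated features and drops the independent one. A real analysis would also convert each statistic to a p-value using the appropriate degrees of freedom.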


Summary

Introduction

Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease worldwide and a major cause of end-stage renal disease (ESRD). Traditional medical practice relies entirely on physicians' diagnosis and treatment of patients. This makes it difficult to distinguish between diseases with similar symptoms and to discover hidden diseases, which leads to misdiagnosis that may delay treatment or endanger the patient's life. A novel method is proposed for predicting the probability that pediatric patients with IgAN reach ESRD within five years. The eXtreme Gradient Boosting (XGBoost) algorithm was adopted to predict whether IgAN in pediatric patients would progress to ESRD within five years, using a new dataset instead of traditional clinical pathology. The performance of XGBoost was compared with that of random forest (RF), CatBoost, support vector machines (SVM), k-nearest neighbor (KNN), and extreme learning machine (ELM).
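A comparison of this kind is usually summarized by the area under the ROC curve (AUC). The sketch below computes AUC with the rank-based (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The model names and scores are hypothetical placeholders, not outputs of the models in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of reaching ESRD from two placeholder models.
y_true = [1, 1, 1, 0, 0, 0]
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    "model_b": [0.6, 0.4, 0.5, 0.7, 0.3, 0.2],
}
auc_by_model = {name: roc_auc(y_true, s) for name, s in model_scores.items()}
best_model = max(auc_by_model, key=auc_by_model.get)
```

Here `model_a` ranks 8 of 9 positive/negative pairs correctly and `model_b` ranks 6 of 9, so `best_model` is `"model_a"`. The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based implementation is the usual choice for large datasets.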

Dataset
Feature Selection
Performance Evaluation
Findings
System Implementation
