Abstract
Celiac disease is a common systemic immune-mediated disease caused by an abnormal immune response to gluten, a group of proteins found in grains such as wheat, barley, and rye. The only effective treatment is a lifelong gluten-free diet. The disease occurs worldwide, with an estimated prevalence of 1% in the general population. Celiac disease is highly heritable, and its pathogenesis involves gluten antigens presented on the surface of HLA complexes, mainly haplotypes DQ2 and DQ8. However, although the genetic predisposition conferred by these haplotypes is necessary for celiac disease, it is not sufficient to explain the overall predisposition to the disease. Diagnosis usually begins with serological tests and small bowel biopsy, but non-standardized serological tests and inadequate biopsies make the diagnosis difficult. In addition, celiac disease presents with a wide range of symptoms, which makes early diagnosis essential for preventing long-term complications. For this reason, our goal in this study was to train several machine learning models and test their performance in predicting celiac disease from common features and symptoms. The study was conducted on 50 samples from suspected celiac disease cases, with an average age of 32 years; 70% of the samples were positive for the disease and the remaining 30% were negative. The models were trained using 10-fold cross-validation. Finally, using a metaclassifier that takes the majority vote of all five models (K-Nearest Neighbor, Support Vector Machine, Naive Bayes, Decision Tree, and Random Forest), we achieved an accuracy of 0.8, a recall of 0.88, a precision of 1, and an F-measure of 0.88.
The most important features were also identified in order to optimize prediction performance. The five most important were age, gluten sensitivity, chronic diarrhea, abdominal pain, and lactose intolerance.
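The majority-vote step and the reported metrics can be illustrated with a minimal sketch. It is not the study's implementation: the function names `majority_vote` and `binary_metrics`, the toy vote matrix, and the labels are all hypothetical, chosen only to show how five base-model predictions might be combined and scored.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most base models for one sample."""
    return Counter(predictions).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy data (not from the study): each row holds the votes of
# five base models for one sample, 1 = celiac positive, 0 = negative.
votes = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
y_true = [1, 1, 0, 1, 0]

y_pred = [majority_vote(row) for row in votes]
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

With an odd number of base models and two classes, a vote can never tie, which is one practical reason to combine five classifiers rather than four.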