Abstract

Requiring only a few relevant characteristics from patients when diagnosing bacterial vaginosis is highly useful for physicians, as it makes collecting these data less time-consuming. The result would be a dataset in which patients can be diagnosed more accurately from a subset of informative or relevant features than from the entire set of features. As such, this is a feature selection (FS) problem. In this work, decision tree and Relief algorithms were used as feature selectors. Experiments were conducted on a real dataset for bacterial vaginosis with 396 instances and 252 features/attributes. The dataset was obtained from universities located in Baltimore and Atlanta. The FS algorithms produced feature rankings, from which the top fifteen features formed a new dataset that was used as input for both support vector machine (SVM) and logistic regression (LR) classifiers. For performance evaluation, averages over 30 runs of 10-fold cross-validation were reported, with balanced accuracy, sensitivity, and specificity as performance measures. Performance using all of the features was compared with performance using only the top fifteen. The top-ranked attributes were similar to those reported in the literature. This study is part of ongoing research that is investigating a range of feature selection and classification methods.
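To make the ranking step concrete, the following is a minimal sketch of Relief-style feature ranking followed by top-k selection. It assumes the basic binary-class form of Relief (one nearest hit and one nearest miss per instance, every instance visited once); the abstract does not say which Relief variant or parameters the authors used. The function names `relief_weights` and `top_k` and the toy data are hypothetical, chosen only for illustration.

```python
def relief_weights(X, y):
    """Basic Relief for binary labels (illustrative sketch, not the authors' code).

    Every instance is visited once instead of being randomly sampled,
    so the result is deterministic.
    """
    n, d = len(X), len(X[0])

    # Per-feature range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        # Reward features that differ on the nearest miss,
        # penalize features that differ on the nearest hit.
        for j in range(d):
            w[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return w


def top_k(weights, k):
    """Indices of the k highest-weighted features, best first."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:k]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [1.0, 0.5]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
selected = top_k(weights, 1)  # feature 0 ranks first on this toy data
```

In the study's setting, the same selection would use `k = 15` on the 252-feature dataset, and the reduced columns would then be passed to the SVM and LR classifiers.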

Highlights

  • Bacterial vaginosis (BV) is a disease affecting millions of women around the world and involves several serious health conditions [1].

  • We investigated the logistic regression algorithm under the same three scenarios

  • Feature rankings were calculated with the decision tree and Relief feature selection algorithms, and a comparison was made against the performance of models created using the entire feature set


Summary

Introduction

Bacterial vaginosis (BV) is a disease affecting millions of women around the world and involves several serious health conditions [1]. It is the most common vaginal disease in women of reproductive age, and it is associated with preterm delivery, chorioamnionitis, post-abortion infection, pelvic inflammatory disease, and sexually transmitted diseases, such as human papillomavirus (HPV) [2]. This disease can be detected by two clinical procedures: the Amsel criteria and the Nugent score. Another procedure to detect BV is real-time or quantitative polymerase chain reaction (qPCR), which consists of the extraction, isolation, and amplification of the DNA of microorganisms present in the vaginal tract [3]. From this initial small feature set, the physician forms a differential diagnosis and decides what features

