Abstract
This paper introduces an approach to improving classification accuracy that uses Bag-of-Words (BoW) as a feature selection method together with machine learning algorithms. Because it can process large datasets quickly and produce accurate results, the approach is suited to medical applications. As the literature review shows, researchers have proposed a variety of ensemble approaches to obtain good results. This study proposes a novel algorithm for analyzing medical kidney test reports: BoW selects the features, which are then analyzed by Boosting four machine learning classification algorithms, namely Sequential Minimal Optimization (SMO), k-Nearest Neighbors (k-NN), Random Forests (RF) and Naive Bayes (NB). With the help of specialists in urology, the proposed algorithm is tested against multiple datasets of different kidney tests. The proposed Boosting algorithms achieve higher accuracy than their counterparts SMO, k-NN, RF and NB when each of those is applied on its own.
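To make the Bag-of-Words step concrete, the following minimal sketch turns short report texts into word-count vectors. It is an illustration under stated assumptions, not the paper's implementation: the function names (`tokenize`, `build_vocabulary`, `bow_vector`) and the two sample report strings are hypothetical, and the paper's actual preprocessing and feature selection criteria are not specified in this abstract.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def bow_vector(text, vocabulary):
    """Count how often each vocabulary token occurs in one document."""
    counts = Counter(tokenize(text))
    # Dict iteration follows insertion order, which matches the column index.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical report snippets, invented purely for illustration.
reports = ["Creatinine high, urea high", "Creatinine normal"]
vocab = build_vocabulary(reports)
vectors = [bow_vector(r, vocab) for r in reports]
```

Each report becomes a fixed-length vector with one column per vocabulary word, which is the form a classifier such as k-NN or NB can consume.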
Highlights
For many years, technological advancements have played an important role in developing human healthcare systems
The UCI, Medya Diagnostic Center (MDC) and ARYO datasets are each split into a 70% training set and a 30% testing set and then classified using four machine learning classification algorithms: Sequential Minimal Optimization (SMO), k-Nearest Neighbors (k-NN), Random Forests (RF) and Naïve Bayes (NB)
The SMO, k-NN, RF and NB algorithms are also tested separately against the UCI, MDC and ARYO datasets without employing BoW as the feature selection algorithm
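The 70%/30% evaluation protocol described in the highlights can be sketched as follows. This is a hedged illustration only: it uses a plain k-NN classifier (one of the four named algorithms) without Boosting, the helper names (`split_70_30`, `knn_predict`, `accuracy`) are hypothetical, and the data is synthetic rather than drawn from the UCI, MDC or ARYO datasets.

```python
import random
from collections import Counter


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and split it 70% train / 30% test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k=3):
    """Predict the majority label among the k nearest training rows."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Synthetic, well-separated two-class data (an assumption for illustration).
data = [((i * 0.1, i * 0.1), 0) for i in range(10)]
data += [((10 + i * 0.1, 10 + i * 0.1), 1) for i in range(10)]
train_rows, test_rows = split_70_30(data)
score = accuracy(train_rows, test_rows)
```

Fixing the shuffle seed keeps the split reproducible, so the same train and test rows can be reused when comparing a classifier with and without a feature selection step.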
Summary
For many years, technological advancements have played an important role in developing human healthcare systems. Developing electronic medical systems that automatically process clinical diagnosis reports is a significant achievement, because the manual process is slow and makes diagnosis harder for physicians. About half a century ago, health care scientists launched the first so-called “Hospital Automation System”, which used classification algorithms (Khachidze et al., 2016). After several years of development, there is still no algorithm with a 100% accuracy rate. For this reason, scientists continue to conduct research to find and develop new algorithms that outperform the current ones. The accuracy of the output is affected by several factors, such as data size, feature selection technique and classification algorithm.