Abstract

The use of computers as diagnostic aids in medicine is becoming a reality in the clinical arena; a major factor in this trend is the successful application of machine learning techniques. Three fundamentally different approaches to machine learning have been identified, which we call Exemplar, Hyper-plane, and Hyper-rectangle based methods. Part of this thesis is devoted to a novel hyper-rectangle based algorithm called the Multiscale Classifier (MSC), which is implemented as an inductive decision tree. The MSC can be applied to any N-dimensional classification problem, successively splitting feature space in half and using logic minimisation to control tree growth. Pruning techniques are then used to produce decision trees that are sensitive to the misclassification cost of examples. Such techniques are shown to produce different operational modes of classification, which may be visualised using the Receiver Operating Characteristic (ROC) curve. The MSC has several significant advantages over existing hyper-rectangle based approaches: learning is incremental; the tree is non-binary; and backtracking of decisions is possible. A feature extraction technique based on scale-space analysis is proposed and applied to texture measures extracted from images of cervical cell nuclei. Specifically, we model, as a function of scale, features derived from a Grey Level Co-occurrence Matrix (GLCM). On this data set the proposed technique was found to offer an improvement in performance over conventional feature extraction techniques. Methodologies for the evaluation of a number of machine learning algorithms (Bayesian, C4.5, K-NN, Perceptron, Multi-layer Perceptron, and the MSC) are explored using six real world medical diagnostic data sets. The performance of each algorithm is evaluated in terms of overall accuracy, sensitivity, specificity, area under the ROC curve (AUC), χ² test statistic, training time, and interpretability.
For each data set, an Analysis of Variance (ANOVA) is used to test the statistical significance of any differences between the cross-validated estimates of the accuracy and AUC performance measures. The benefits of AUC over accuracy as a performance measure are discussed in terms of increased statistical sensitivity, independence from a decision threshold, and invariance to prior class probabilities. It was found that the exemplar and hyper-plane based methods had marginally higher accuracies when compared to the hyper-rectangle based methods. However, the hyper-rectangle based methods are often more interpretable and less computationally intensive. The MSC was found to compare favourably with the other learning algorithms and has been established as a useful additional tool for machine learning in medical diagnostics.
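
Two of the properties claimed for AUC above, independence from a decision threshold and invariance to prior class probabilities, can be illustrated with a minimal sketch. It is not taken from the thesis: the scores are made-up values, the function names `auc` and `accuracy` are hypothetical, and AUC is computed in its rank form, as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The invariance shown assumes that only the class ratio changes while the score distribution within each class stays fixed.

```python
def auc(pos_scores, neg_scores):
    """Rank form of AUC: fraction of (positive, negative) pairs in which
    the positive example scores higher; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def accuracy(pos_scores, neg_scores, threshold):
    """Overall accuracy when scores >= threshold are labelled positive."""
    correct = sum(s >= threshold for s in pos_scores)
    correct += sum(s < threshold for s in neg_scores)
    return correct / (len(pos_scores) + len(neg_scores))


# Illustrative (made-up) classifier scores for each class.
pos = [0.9, 0.8, 0.6, 0.4]
neg = [0.7, 0.5, 0.3, 0.2]

# Same per-class score distribution, but negatives are now five times as common.
neg_shifted = neg * 5

print("AUC, balanced classes:      ", auc(pos, neg))
print("AUC, 5x more negatives:     ", auc(pos, neg_shifted))
print("Accuracy @0.50, balanced:   ", accuracy(pos, neg, 0.5))
print("Accuracy @0.75, balanced:   ", accuracy(pos, neg, 0.75))
print("Accuracy @0.50, 5x negatives:", accuracy(pos, neg_shifted, 0.5))
```

In this sketch the AUC is the same for both class ratios and requires no threshold, whereas accuracy changes both with the threshold chosen and with the class ratio.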
