Abstract
Statistical classification problems are common in veterinary epidemiology. Traditionally, parametric techniques such as logistic regression or discriminant analysis are used to analyse data sets that contain several classes of observations. However, characteristics such as high dimensionality, multicollinearity and non-homogeneity can make a data set unsuitable for parametric techniques. In this article, classification tree algorithms (ID3, C4.5, CHAID, CART) and artificial neural networks are suggested as non-parametric alternatives. Their application is illustrated using a field data set of pig farms with three levels of respiratory disease prevalence. The performance of the non-parametric classification algorithms is compared with that of multinomial logistic regression. None of the algorithms was significantly better than the others; the proportions of correctly classified farms ranged from 84% to 96%. However, the data set was small (86 observations), which created technical problems when using the artificial neural networks and multinomial logistic regression. The choice of statistical technique should therefore be based on the objectives of the study and the data set under consideration. Classification trees are well suited to exploratory data analysis, easy to apply and worth considering.