Abstract

Principal component analysis (PCA) is a variable reduction method used on over-parameterized data sets, which have a vast number of variables and a limited number of observations, such as Dairy Herd Improvement (DHI) data, to select the subsets of variables that describe the largest amount of variance. Cluster analysis (CA) segregates objects, in this case dairy herds, into groups based on similarity in multiple characteristics simultaneously. This project aimed to apply PCA to identify the subset of most meaningful DHI variables and to apply CA to identify groupings of dairy herds with similar performance characteristics. Year 2011 DHI data were obtained for 557 Upper Midwest herds with a test-day mean of ≥200 cows (assumed to be mostly freestall housed) that remained on test for the entire year. The PCA reduced an initial list of 22 variables to 16. The average distance method of CA grouped farms based on the best goodness of fit, determined by the minimum cophenetic distance. Six clusters provided the best fit. Descriptive statistics for the 16 variables were computed per group. Based on group means, groups 1, 2, and 6 demonstrated the best performance in most variables, including energy-corrected milk, linear somatic cell score (log of somatic cell count), dry period intramammary infection cure rate, new intramammary infection risk, risk of subclinical intramammary infection at first test, age at first calving, days in milk, and Transition Cow Index. Groups 3, 4, and 5 demonstrated the worst mean performance in most of the PCA-selected variables, including days in milk, age at first calving, risk of subclinical intramammary infection at first test, and dry period intramammary infection cure rate. Groups 4 and 5 also had the worst mean herd performance in energy-corrected milk, Transition Cow Index, linear somatic cell score, and new intramammary infection risk. Further investigation will be conducted to reveal patterns of management associated with herd categorization. PCA and CA should be used when describing the multivariate performance of dairy herds and whenever working with over-parameterized data sets, such as DHI databases.
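As an illustration of the clustering step only, the following is a minimal sketch of average-distance (average linkage) agglomerative clustering followed by per-group means. It is not the software or data used in the study: the herd values are synthetic, the two variables (energy-corrected milk and linear somatic cell score) are picked from the abstract purely as examples, and the number of clusters is fixed at three for brevity rather than chosen by a cophenetic criterion. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def average_linkage(points, n_clusters):
    """Merge the two clusters with the smallest mean pairwise Euclidean
    distance until only n_clusters remain; returns lists of row indices."""
    clusters = [[i] for i in range(len(points))]

    def mean_distance(a, b):
        return statistics.fmean(
            math.dist(points[i], points[j]) for i in a for j in b
        )

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: mean_distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Synthetic herds (assumption): three well-separated performance profiles,
# ten herds each, described by [energy-corrected milk, linear SCS].
rng = random.Random(0)
profiles = [(30.0, 2.0), (38.0, 3.0), (46.0, 4.0)]
herds = [
    [ecm + rng.uniform(-1.0, 1.0), scs + rng.uniform(-0.1, 0.1)]
    for ecm, scs in profiles
    for _ in range(10)
]

# Standardize first so that no variable dominates the distance by its units.
clusters = average_linkage(standardize(herds), n_clusters=3)

# Descriptive statistics per group, on the original (unscaled) values.
for number, members in enumerate(clusters, start=1):
    ecm_mean = statistics.fmean(herds[i][0] for i in members)
    scs_mean = statistics.fmean(herds[i][1] for i in members)
    print(f"group {number}: n={len(members)} ECM={ecm_mean:.1f} SCS={scs_mean:.2f}")
```

Standardizing before clustering is a design choice in this sketch, made because the two variables are on different scales; the abstract does not state how the variables were scaled.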
