Abstract

Millions of feed composition records generated annually by testing laboratories are valuable assets that can benefit the animal nutrition community. However, it is challenging to manage, handle, and process feed composition data that originate from multiple sources, lack standardized feed names, and contain outliers. Efficient methods to consolidate and screen such data are needed to develop feed composition databases with accurate means and standard deviations (SD). Given the animal science community's interest in data management and the importance of feed composition tables to the animal industry, the objective was to develop a set of procedures for constructing accurate feed composition tables from large data sets. A published statistical procedure designed to screen feed composition data was adopted, modified, and programmed to run in Python and SAS. The 2.76 million records received from 4 commercial feed testing laboratories were used to develop the procedures and to construct tables summarizing feed composition. Briefly, feed names and nutrients were standardized across laboratories, and erroneous and duplicated records were removed. Histogram, univariate, and principal component analyses were used to identify and remove outliers whose key nutrients fell outside the mean ± 3.5 SD. Clustering procedures identified subgroups of feeds within the large data set. Aside from the clustering step, which was programmed in Python to execute automatically in SAS, all steps were programmed and conducted automatically in Python, followed by a manual evaluation of the resulting mean Pearson correlation matrices of the clusters. The input data sets contained 42, 94, 162, and 270 feeds from the 4 laboratories and comprised 25 to 30 nutrients. The final database included 174 feeds and 1.48 million records. 
The developed procedures effectively classified by-products (e.g., distillers grains and solubles as low or high fat), forages (e.g., legume or grass-legume mixture by maturity), and oilseeds versus meal (e.g., soybeans as whole raw seeds vs. soybean meal expellers or solvent extracted) into distinct sub-populations. Results from these analyses suggest that the procedure can provide a robust tool to construct and update large feed data sets. This approach can also be used by commercial laboratories, feed manufacturers, animal producers, and other professionals to process feed composition data sets and update feed libraries.
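The mean ± 3.5 SD outlier screen described above could be sketched in Python as follows. This is a minimal illustration, not the authors' actual implementation: the pandas-based approach, the `screen_outliers` helper, and the column names (`feed_name`, `cp`) are all assumptions made for the example.

```python
import pandas as pd

def screen_outliers(df, nutrient_cols, group_col="feed_name", z=3.5):
    """Drop records whose value for any listed nutrient falls outside
    mean +/- z * SD within each feed group (illustrative helper)."""
    def keep_within_limits(group):
        means = group[nutrient_cols].mean()
        sds = group[nutrient_cols].std()
        lower, upper = means - z * sds, means + z * sds
        # Keep a record only if every screened nutrient is inside the limits.
        inside = ((group[nutrient_cols] >= lower) &
                  (group[nutrient_cols] <= upper)).all(axis=1)
        return group[inside]
    return df.groupby(group_col, group_keys=False).apply(keep_within_limits)

# Hypothetical data: 20 plausible crude-protein values plus one gross error.
records = pd.DataFrame({
    "feed_name": ["corn grain"] * 21,
    "cp": [8.0] * 20 + [50.0],  # % crude protein; 50.0 is an entry error
})
clean = screen_outliers(records, ["cp"])
```

In practice this screen would follow name standardization and de-duplication, and would run per nutrient within each standardized feed name, as the procedure outlined above requires.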
