Abstract
This paper addresses the problem of missing data in food composition databases (FCDBs). Data may be missing for entire foods or only for specific components. Most often, human experts solve the problem by subjectively borrowing data from other FCDBs for estimation or imputation. This approach is time-consuming and may lead to wrong decisions, because the value of a given component in a given food may vary from database to database due to differences in analytical methods. To ease missing-data borrowing and increase the quality of missing-data selection, we propose a new computer-based methodology, named MIGHT (Missing Nutrient Value Imputation UsinG Null Hypothesis Testing), that enables optimal selection of missing data from different FCDBs. An evaluation on a subset of European FCDBs, available through EuroFIR and compliant with the Food data structure and format standard BS EN 16104 published in 2012, shows that, in more than 80% of selected cases, MIGHT gives more accurate results than the techniques currently applied for missing value imputation in FCDBs. MIGHT deals with missing data in FCDBs by introducing rules for missing data imputation, based on the idea that proper statistical analysis can decrease the error of data borrowing.
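As a rough illustration of how a null hypothesis test can drive an imputation rule, the following minimal Python sketch tests whether the values borrowed from several FCDBs look normally distributed, then imputes the mean if normality is not rejected and the median otherwise. This is an assumption made for illustration only, not the MIGHT procedure itself: the abstract does not specify the test or the rules, so the Jarque-Bera test, the mean/median rule, the `alpha` threshold, and the function names are all hypothetical. The test is also asymptotic, so its p-values are unreliable for the small samples that are typical when borrowing from a handful of databases.

```python
import math
import statistics


def jarque_bera_p_value(values):
    """Asymptotic p-value of the Jarque-Bera normality test (illustrative choice)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population versions, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 1.0  # all values identical, so there is nothing to reject
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under H0 (normality) the statistic is asymptotically chi-square with
    # 2 degrees of freedom, whose survival function is exp(-x / 2).
    return math.exp(-jb / 2)


def impute_missing_value(borrowed_values, alpha=0.05):
    """Hypothetical rule: mean if normality is not rejected, median otherwise."""
    p_value = jarque_bera_p_value(borrowed_values)
    if p_value >= alpha:
        return statistics.fmean(borrowed_values), "mean"
    return statistics.median(borrowed_values), "median"


# Made-up values of one component for the same food in several FCDBs.
value, rule = impute_missing_value([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"imputed value: {value} (rule: {rule})")
```

The design point is only that the choice of the imputed value is made by a statistical test on the borrowed values rather than by subjective expert judgment. Any other test or rule set could be substituted.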
Highlights
Food chemistry studies the chemical properties and interactions of food components
Focusing on quality improvement of food composition databases, we present a methodology that can improve the quality of existing FCDBs
MIGHT deals with the incomplete coverage of foods or nutrients, which leads to missing data, by introducing rules for borrowing data for imputation of missing values from other
Summary
Food chemistry studies the chemical properties and interactions of food components. Components called functional chemicals play an important role in food production and preservation, and they can be effectively applied in the treatment and prevention of diseases. Food composition data (FCD) are presented as a detailed set of information about the chemical components of foods, providing values for energy, nutrients and other bioactive components of foods (basic data elements), as well as food classifiers and descriptors (metadata). This type of data is available in