Abstract
An objective dietary assessment system can help users understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, most previous studies focused on model-based or stereo-based approaches, which rely on manual intervention or require users to capture multiple frames from different viewing angles, which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A dedicated neural network is designed to take a depth image from one viewing angle and predict the depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud with the ground truth volume of the object items.
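The first step of the pipeline described above, turning a single depth image into a 3D point cloud, follows the standard pinhole back-projection model. A minimal sketch is given below; the intrinsics (`fx`, `fy`, `cx`, `cy`) and the flat toy depth map are illustrative assumptions, not values from the paper:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) into camera-frame 3D points.

    Standard pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with zero depth (no measurement) are discarded.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x[valid], y[valid], depth[valid]], axis=-1)  # (N, 3)

# Toy example: a flat 4x4 depth map, every pixel 1 m from the camera
depth = np.ones((4, 4))
pts = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3)
```

The synthesized opposite-view depth image would be back-projected the same way, transformed into the first camera's frame, and then merged with these points via ICP alignment.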
Highlights
In nutritional epidemiology, detailed food information is required to help dietitians evaluate the eating behavior of participants
Following previous work on point cloud completion [20,31,32], the holdout method, a simple form of cross-validation, was used to evaluate the model: 70% of the rendered depth images (20 k per object item) were used to train the neural network, while 10% (2.85 k per object item) and 20% (5.71 k per object item) of the images with unseen viewing angles were selected as the validation and testing datasets, respectively
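The 70/10/20 holdout split described in this highlight can be sketched as follows. The split proportions come from the text; the shuffling, random seed, and toy dataset size of 10 000 images are assumptions for illustration:

```python
import numpy as np

def holdout_split(n_images, train=0.7, val=0.1, seed=0):
    """Shuffle image indices and split them 70/10/20 into
    train/validation/test sets (holdout method)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_images)
    n_train = int(n_images * train)
    n_val = int(n_images * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Toy example with 10 000 rendered depth images
tr, va, te = holdout_split(10_000)
print(len(tr), len(va), len(te))  # 7000 1000 2000
```

In the paper's setting the split is applied per object item, so each item contributes rendered views to all three sets while the test views come from unseen viewing angles.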
An integrated approach based on depth sensing and deep learning view synthesis is proposed to enable accurate food volume estimation from a single depth image taken at any convenient angle
Summary
In nutritional epidemiology, detailed food information is required to help dietitians evaluate the eating behavior of participants. 24-hour dietary recall (24HR), a dietary assessment method, is commonly used to capture information on the food eaten by participants. A complete dietary assessment procedure can be divided into four main parts: food identification, portion size estimation, nutrient intake calculation, and dietary analysis. When using 24HR, the volume or portion size of food items relies heavily on participants' subjective judgement, which undoubtedly leads to inaccurate and biased dietary assessment results. It is therefore essential to develop objective dietary assessment techniques to address the problems of inaccurate and subjective measurement. A variety of computer vision-based techniques have been proposed to tackle the problem of quantifying food portions. Food volume measurement techniques can be divided into two main categories: model-based and stereo-based. Sun et al. [3] proposed a virtual reality (VR) approach which utilizes pre-built 3D food models with known volumes for users
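Once a completed point cloud is available, a common way to turn it into a portion-size estimate is voxelization: quantize the points onto a grid and count occupied cells. The sketch below is one such estimator, assuming a hypothetical voxel size; the paper does not specify its exact volume computation, so this is illustrative rather than the authors' method:

```python
import numpy as np

def voxel_volume(points, voxel_size=0.005):
    """Estimate object volume by counting occupied voxels.

    Each 3D point (metres) is quantized to a voxel index; the number of
    unique occupied voxels times the volume of one voxel approximates
    the object volume. Accuracy depends on the chosen voxel size.
    """
    voxels = np.unique(np.floor(points / voxel_size).astype(int), axis=0)
    return len(voxels) * voxel_size ** 3

# Toy check: points at the centres of a 10 cm cube on a 1 cm grid
g = np.arange(0, 0.10, 0.01) + 0.005
xx, yy, zz = np.meshgrid(g, g, g)
pts = np.stack([xx.ravel(), yy.ravel(), zz.ravel()], axis=-1)
print(round(voxel_volume(pts, voxel_size=0.01), 6))  # 0.001 m^3, i.e. 1 litre
```

Smaller voxels capture shape detail better but demand a denser point cloud, which is exactly why completing the occluded side of the food item via view synthesis matters for volume accuracy.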