Abstract
In a research environment characterized by the five V's of big data (volume, velocity, variety, value, and veracity), the need for tools that quickly screen large numbers of publications for relevant work is a growing concern, and the data-rich food industry is no exception. Here, a combination of latent Dirichlet allocation and food keyword searches was employed to analyze and filter a dataset of 6102 publications about cold denaturation. Applied through the Python toolkit developed in this work, the approach yielded 22 topics that provide background and insight on the direction of research in this field, and it identified the publications in this dataset that are most pertinent to the food industry with precision and recall of 0.419 and 0.949, respectively. Precision is the fraction of publications retained by the filter that are actually relevant, and recall is the fraction of relevant publications that the filter retained, so a high recall means few relevant papers were missed by the screening method. Lastly, gaps in the literature are identified from keyword trends to improve the knowledge base of cold denaturation as it relates to the food industry. This approach is generalizable to any similarly organized dataset, and the code is available upon request.

Practical Application: A common problem in research is that an expert in one field finds it difficult to learn about another, because they may lack the vocabulary and background needed to read cutting-edge literature from a new discipline. The Python toolkit developed in this research can be applied by any researcher who is new to a field to identify the key literature, the topics they should familiarize themselves with, and the current trends in the field. Using this structure, researchers can greatly speed up how they identify new research areas and find new projects.
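The precision and recall figures quoted above can be illustrated with a minimal sketch. It is not the authors' toolkit: the function names, the keyword list, the toy titles, and the hand-assigned relevance labels are all hypothetical, and the latent Dirichlet allocation step is omitted. It only shows how a keyword screen might be scored against a manually labelled set.

```python
def keyword_filter(docs, keywords):
    """Return the ids of documents whose text contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return {doc_id for doc_id, text in docs.items()
            if any(k in text.lower() for k in lowered)}


def precision_recall(selected, relevant):
    """Score a screened set against a hand-labelled set of relevant documents."""
    true_positives = len(selected & relevant)
    # Precision: share of retained documents that are actually relevant.
    precision = true_positives / len(selected) if selected else 0.0
    # Recall: share of relevant documents that the screen retained.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus (hypothetical titles, not taken from the 6102-publication dataset).
docs = {
    "p1": "Cold denaturation of milk proteins during frozen storage",
    "p2": "Cold denaturation of a model enzyme, with remarks on food applications",
    "p3": "Food emulsions stabilized by whey protein at low temperature",
    "p4": "Freeze-induced unfolding of myofibrillar proteins in fish fillets",
    "p5": "Thermodynamics of protein folding under pressure",
}
food_keywords = ["milk", "food", "whey"]   # hypothetical keyword list
relevant = {"p1", "p3", "p4"}              # hypothetical manual labels

selected = keyword_filter(docs, food_keywords)
precision, recall = precision_recall(selected, relevant)
print(f"selected={sorted(selected)} precision={precision:.3f} recall={recall:.3f}")
```

In this toy case the screen keeps one paper that is not relevant (p2) and misses one that is (p4), so both scores fall below 1. A screen tuned like the one reported in the abstract trades lower precision for high recall, which suits a first-pass filter where missing a relevant paper costs more than reading an irrelevant one.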