Abstract
The traditional approach to curating and disseminating information about agricultural injuries relies heavily on manual input and review, making it a labor-intensive process. While the unstructured nature of the material has traditionally required human reviewers, the recent proliferation of Large Language Models (LLMs) has introduced the potential for automation. This study investigates the feasibility and implications of replacing a human reviewer with an LLM when analyzing information about agricultural injuries from news articles and investigation reports. Multiple language models were tested for accuracy in extracting relevant incident and victim information: OpenAI's ChatGPT 3.5, ChatGPT-4, and an open-source fine-tuned version of Llama 2. To measure accuracy, each LLM was prompted to extract relevant data, such as the involvement of drugs or alcohol, the time of day, or other information about the victim(s), from a set of randomly selected online news articles already cataloged by human reviewers. Results showed that the fine-tuned Llama 2 was the most proficient model, with an average accuracy of 93% and some categories reaching 100%. ChatGPT-4 also performed well, at around 90% accuracy. Additionally, we found that the fine-tuned Llama 2 model was somewhat proficient at coding injuries using the Occupational Injury and Illness Classification System (OIICS), achieving 48% accuracy when predicting the first digit. Although none of the models are perfectly accurate, the methodology and results demonstrate that LLMs are promising for streamlining workflows, reducing the human and financial resources required and increasing the efficiency of data analysis.
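The extraction step described above can be sketched in outline. This is an illustrative sketch only: the paper does not publish its prompts, so the field names, prompt wording, and JSON response format below are assumptions, and the actual API call to a model is omitted.

```python
import json

# Hypothetical field names; the study's real extraction categories
# (e.g., drug/alcohol involvement, time of day) inspire these keys.
FIELDS = ["victim_age", "time_of_day", "drugs_or_alcohol_involved"]

def build_extraction_prompt(article_text: str) -> str:
    """Compose a prompt asking an LLM to return the fields as JSON."""
    return (
        "Extract the following fields from the news article below. "
        "Respond with a JSON object using exactly these keys: "
        + ", ".join(FIELDS)
        + ". Use null for any field not mentioned.\n\n"
        + article_text
    )

def parse_extraction(model_reply: str) -> dict:
    """Parse the model's JSON reply, keeping only the expected keys."""
    data = json.loads(model_reply)
    return {key: data.get(key) for key in FIELDS}

# Example with a mocked model reply (no API call is made here):
reply = (
    '{"victim_age": 54, "time_of_day": "afternoon", '
    '"drugs_or_alcohol_involved": null}'
)
print(parse_extraction(reply))
```

Parsed fields like these can then be compared against the human-cataloged records to compute per-category accuracy, as the study does.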