Abstract

Background: In public health research, there is a need to close the gap between care delivery and cohort identification. Dedicated tagging staff must currently spend considerable effort assigning clinical codes after reading patient summaries. Machine learning automation can facilitate the classification of these clinical narratives, but the limited availability of electronic medical records remains a bottleneck. Veterinary medical records represent a largely untapped data source that could be used to benefit both human and non-human patients. Very few approaches utilizing veterinary data sources currently exist.

Methods: In this retrospective cross-sectional and chart review study, we trained separate long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human records and 89,591 veterinary records, tested the models' efficacy in a standard train-test split setup, and probed the portability of these models across species domains. We trained versions of our models first using the free-text clinical narratives, and then using only the clinically relevant terms extracted by MetaMap Lite, a natural language processing tool intended for this purpose.

Findings: We show that our LSTM approach classifies veterinary records across top-level codes (F1 score = 0·83) and identifies top-level neoplasia records in the veterinary records (F1 score = 0·93). The model trained with veterinary data can be ported over to identify neoplasia records in the human records (F1 score = 0·70).

Interpretation: Our findings suggest that free-text clinical narratives can be used to learn classification models that allow the rapid identification of patient cohorts. Ultimately, this effort can lead to new insights that can address emerging public health concerns. Digitization of health information will continue to be a reality in both human and veterinary data; our approach serves as a first proof of concept for how these two domains can learn from, and inform, one another.

Funding Statement: Stanford University and The Chan Zuckerberg Biohub Investigator Award.

Declaration of Interests: CDB is Principal and Chairman of CDB Consulting LTD. He has advised Imprimed, Embark Vet and Etalon DX as a member of their respective Scientific Advisory Boards, and is a Director of Etalon DX. The remaining authors declare no conflicts of interest.

Ethics Approval Statement: This research was reviewed and approved by Stanford’s Institutional Review Board (IRB), which provided a nonhuman subject determination under eProtocol 46979. Consent was not required.
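
For readers less familiar with the F1 score reported in the Findings, the following is a minimal sketch of how a binary F1 score (for example, neoplasia versus not neoplasia) could be computed from true and predicted labels. The function name, the label encoding, and the toy labels are assumptions made for illustration only; they are not the study's data or code, and the abstract does not state which averaging scheme was used for the multi-class top-level result.

```python
def binary_f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for one positive class.

    Hypothetical helper for illustration; `positive` marks the class of
    interest (e.g. a record tagged as neoplasia).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (invented): 1 = neoplasia, 0 = any other top-level code.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = binary_f1(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.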
