Abstract
Adverse drug reactions/events (ADR/ADE) have a major impact on patient health and health care costs. While most ADRs are not reported via formal channels, they are often documented in a variety of unstructured conversations such as social media posts or customer support call transcripts. In this paper, we propose a natural language processing (NLP) solution that detects ADRs in such unstructured free-text conversations and improves on previous work in three ways. First, a new Named Entity Recognition (NER) model obtains state-of-the-art accuracy for ADR and Drug entity extraction on the ADE, CADEC, and SMM4H benchmark datasets (F1 scores of 91.75%, 78.76%, and 83.41%, respectively). Second, two new Relation Extraction (RE) models, one based on BioBERT and the other using crafted features over a Fully Connected Neural Network (FCNN), perform on par with existing state-of-the-art models and outperform them when trained with a supplementary clinician-annotated RE dataset. Third, a new text classification model obtains new state-of-the-art accuracy on the CADEC dataset (86.69% F1 score). The complete solution is implemented as a unified NLP pipeline in a production-grade library built on top of Apache Spark, making it natively scalable for processing millions of records on commodity clusters.

Keywords: NLP, NER, Relation Extraction, Pharmacovigilance, Sparknlp