Information Extraction System Research Articles

The entire scientific and academic community has been mobilized to gain a better understanding of the COVID-19 disease and its impact on humanity. Most research related to COVID-19 needs to analyze large amounts of data in very little time. This urgency has made Big Data Analysis, and related questions around the privacy and security of the data, an extremely important part of research in the COVID-19 era. The White House OSTP has, for example, released a large dataset of papers related to COVID research from which the research community can extract knowledge and information. We show an example system with a machine learning-based knowledge extractor which draws out key medical information from COVID-19 related academic research papers. We represent this knowledge in a Knowledge Graph that uses the Unified Medical Language System (UMLS). However, publicly available studies rely on datasets that might contain sensitive data. Extracting information from academic papers can potentially leak sensitive data, and protecting the security and privacy of this data is equally important. In this paper, we address the key challenges around the privacy and security of such information extraction and analysis systems. Regulations such as HIPAA have updated their guidelines for securely accessing data, specifically data related to COVID-19. In the US, healthcare providers must also comply with the Office for Civil Rights (OCR) rules to protect data integrity in matters like plasma donation, media access to health care data, telehealth communications, etc. Privacy policies are typically short and unstructured HTML or PDF documents. We have created a framework to extract relevant knowledge from the health centers’ policy documents and also represent these as a knowledge graph.
Our framework helps to understand the extent to which individual provider policies comply with regulations and to define access control policies that enforce the regulation rules on data in the knowledge graph extracted from COVID-related papers. Along with being compliant, privacy policies must also be transparent and easily understood by the clients. We analyze the relative readability of healthcare privacy policies and discuss the impact. In this paper, we develop a framework for access control decisions that uses policy compliance information to securely retrieve COVID data. We show how policy compliance information can be used to restrict access to COVID-19 data and information extracted from research papers.
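The idea of using policy compliance information to restrict access can be illustrated with a minimal sketch. This is not the authors' framework: the `Provider` and `GraphNode` classes, the `can_access` function, and the rule labels are all hypothetical names chosen for illustration, and the sketch assumes that compliance can be reduced to a set of satisfied rule labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """A healthcare provider and the rules its privacy policy was found to satisfy."""
    name: str
    compliant_rules: frozenset  # hypothetical rule labels, e.g. {"telehealth"}


@dataclass(frozen=True)
class GraphNode:
    """A knowledge-graph node and the rules a requester must satisfy to read it."""
    node_id: str
    required_rules: frozenset


def can_access(provider: Provider, node: GraphNode) -> bool:
    """Grant access only if every rule the node requires is satisfied by the provider."""
    return node.required_rules <= provider.compliant_rules


# Illustrative data only; the labels do not correspond to actual regulation clauses.
clinic = Provider("clinic-a", frozenset({"telehealth", "media_access"}))
open_node = GraphNode("n1", frozenset({"telehealth"}))
strict_node = GraphNode("n2", frozenset({"telehealth", "plasma_donation"}))

print(can_access(clinic, open_node))    # True
print(can_access(clinic, strict_node))  # False
```

A subset check keeps the decision monotonic: a provider that satisfies more rules never loses access it already had.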

Background: Genealogical information, such as that found in family trees, is imperative for biomedical research such as disease heritability and risk prediction. Researchers have used policyholder and dependent information in medical claims data and emergency contacts in electronic health records (EHRs) to infer family relationships at a large scale. We have previously demonstrated that online obituaries can be a novel data source for building more complete and accurate family trees.

Objective: Aiming at supplementing EHR data with family relationships for biomedical research, we built an end-to-end information extraction system using a multitask-based artificial neural network model to construct genealogical knowledge graphs (GKGs) from online obituaries. GKGs are enriched family trees with detailed information including age, gender, death and birth dates, and residence.

Methods: Built on a predefined family relationship map consisting of 4 types of entities (eg, people’s name, residence, birth date, and death date) and 71 types of relationships, we curated a corpus containing 1700 online obituaries from the metropolitan area of Minneapolis and St Paul in Minnesota. We also adopted data augmentation techniques to generate additional synthetic data to alleviate the issue of data scarcity for rare family relationships. A multitask-based artificial neural network model was then built to simultaneously detect names, extract relationships between them, and assign attributes (eg, birth dates and death dates, residence, age, and gender) to each individual. Finally, we assembled related GKGs into larger ones by identifying people appearing in multiple obituaries.

Results: Our system achieved satisfactory precision (94.79%), recall (91.45%), and F-1 measures (93.09%) on 10-fold cross-validation. We also constructed 12,407 GKGs, with the largest one made up of 4 generations and 30 people.

Conclusions: In this work, we discussed the meaning of GKGs for biomedical research, presented a new version of a corpus with a predefined family relationship map and augmented training data, and proposed a multitask deep neural system to construct and assemble GKGs. The results show that our system can extract family relationships from obituaries and demonstrate the potential of enriching EHR data for more genetic research. We share the source code and system with the entire scientific community on GitHub, without the corpus for privacy protection.
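As a quick consistency check on the reported metrics, the F-1 measure can be recomputed as the harmonic mean of precision and recall. The helper name `f1_score` below is our own. The abstract does not say how the scores were aggregated across the 10 folds, so treating the reported precision and recall as a single pair is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
print(round(f1_score(94.79, 91.45), 2))  # 93.09
```

The recomputed value agrees with the reported F-1 of 93.09%.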

Articles published on Information Extraction System

An annotated corpus of clinical trial publications supporting schema-based relational information extraction

Automated medical chart review for breast cancer outcomes research: a novel natural language processing extraction system

NLP-Based Query-Answering System for Information Extraction from Building Information Models

Natural Language Processing-Assisted Literature Retrieval and Analysis for Combination Therapy in Cancer.

A spatio-temporal emotional framework for knowledge extraction and mining in digital humanities

Design and implementation of information extraction system for scientific literature using fine-tuned deep learning models

Hand Pronation-Supination Movement as a Proxy for Remotely Monitoring Gait and Posture Stability in Parkinson's Disease.

Why does the president tweet this? Discovering reasons and contexts for politicians’ tweets from news articles

Arabic open information extraction system using dependency parsing

A Unified Framework of Medical Information Annotation and Extraction for Chinese Clinical Text

Real-World Actionable Information Extraction System (Aies) for Social Media

From Tokenization to Self-Supervision: Building a High-Performance Information Extraction System for Chemical Reactions in Patents.

Information extraction from scanned invoice images using text analysis and layout features

A precision‐preferred comprehensive information extraction system for clinical articles in traditional Chinese Medicine

Profile generation from web sources: an information extraction system

A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers.

Construction of Genealogical Knowledge Graphs From Obituaries: Multitask Neural Network Extraction System.

Extracting Meta Statements from the Blogosphere

MT-clinical BERT: scaling clinical information extraction with multitask learning.

Natural language processing for automated annotation of medication mentions in primary care visit conversations.
