Semi-structured Text Research Articles

Although electronic health records (EHR) provide useful insights into disease patterns and patient treatment optimisation, their reliance on unstructured data presents a difficulty. Echocardiography reports, which provide extensive pathology information for cardiovascular patients, are particularly challenging to extract and analyse because of their narrative structure. Although natural language processing (NLP) has been used successfully in a variety of medical fields, it is not commonly used in echocardiography analysis. This study aimed to develop an NLP-based approach for extracting and categorising data from echocardiography reports by accurately converting continuous (e.g., LVOT VTI, AV VTI and TR Vmax) and discrete (e.g., regurgitation severity) outcomes from a semi-structured narrative format into a structured and categorised format, allowing for future research or clinical use. A total of 135,062 Trans-Thoracic Echocardiogram (TTE) reports were derived from 146,967 baseline echocardiogram reports and split into three cohorts: Training and Validation (n = 1,075), Test Dataset (n = 98) and Application Dataset (n = 133,889). The NLP system was developed and iteratively refined using medical expert knowledge. The system was used to curate a moderate-fidelity database from extractions of 133,889 reports. A hold-out validation set of 98 reports was blindly annotated and extracted by two clinicians for comparison with the NLP extraction. Agreement, discrimination, accuracy and calibration of outcome measure extractions were evaluated. Continuous outcomes including LVOT VTI, AV VTI and TR Vmax exhibited perfect inter-rater reliability as measured by intra-class correlation (ICC = 1.00, p < 0.05) alongside high R² values, demonstrating close alignment between the NLP system and clinicians.
A good level (ICC = 0.75-0.9, p < 0.05) of inter-rater reliability was observed for outcomes such as LVOT Diam, Lateral MAPSE, Peak E Velocity, Lateral E' Velocity, PV Vmax, Sinuses of Valsalva and Ascending Aorta diameters. Furthermore, the accuracy rate for discrete outcome measures was 91.38% in the confusion matrix analysis, indicating effective performance. The NLP-based technique performed well in extracting and categorising data from echocardiography reports, and the system demonstrated a high degree of agreement and concordance with clinician extractions. This study contributes to the effective use of semi-structured data by providing a useful tool for converting semi-structured text into a structured echo report that can be used for data management. Additional validation and implementation in healthcare settings can improve data availability and support research and clinical decision-making.
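To make the extraction task described in this abstract concrete, the following is a minimal Python sketch of rule-based extraction from a semi-structured report. It is not the system from the study: the regular expressions, the label spellings, the units, the `SAMPLE_REPORT` text and the helper names `extract_measures` and `accuracy` are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for three continuous measures. The label spellings
# and units are assumptions, not the rules used in the study.
CONTINUOUS_PATTERNS = {
    "LVOT VTI": re.compile(r"\bLVOT\s+VTI\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE),
    "AV VTI": re.compile(r"\bAV\s+VTI\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE),
    "TR Vmax": re.compile(r"\bTR\s+Vmax\s*[:=]?\s*(\d+(?:\.\d+)?)\s*m/s", re.IGNORECASE),
}

# Hypothetical pattern for a discrete measure (regurgitation severity per valve).
SEVERITY_PATTERN = re.compile(
    r"\b(trivial|mild|moderate|severe)\s+"
    r"(mitral|aortic|tricuspid|pulmonary)\s+regurgitation\b",
    re.IGNORECASE,
)


def extract_measures(report: str) -> dict:
    """Return continuous values as floats (None if absent) and severities per valve."""
    result = {}
    for name, pattern in CONTINUOUS_PATTERNS.items():
        match = pattern.search(report)
        result[name] = float(match.group(1)) if match else None
    result["regurgitation"] = {
        valve.lower(): severity.lower()
        for severity, valve in SEVERITY_PATTERN.findall(report)
    }
    return result


def accuracy(predicted: list, reference: list) -> float:
    """Fraction of discrete labels that match the reference annotation."""
    if not reference or len(predicted) != len(reference):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


# Invented example text, not taken from any real report.
SAMPLE_REPORT = (
    "LVOT VTI: 21.5 cm. AV VTI 45 cm. TR Vmax = 2.8 m/s. "
    "Mild mitral regurgitation. Moderate tricuspid regurgitation."
)

if __name__ == "__main__":
    print(extract_measures(SAMPLE_REPORT))
```

A real pipeline would need many more label variants, unit handling and negation rules. The `accuracy` helper only illustrates how discrete extractions could be compared with clinician labels.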

Background: This paper proposes Cyrus, a new transparency evaluation framework for Open Knowledge Extraction (OKE) systems. Cyrus is based on state-of-the-art transparency models and linked data quality assessment dimensions, and it brings together a comprehensive view of transparency dimensions for OKE systems. The Cyrus framework is used to evaluate the transparency of three linked datasets, which are built from the same corpus by three state-of-the-art OKE systems. The evaluation is performed automatically using a combination of three state-of-the-art FAIRness (Findability, Accessibility, Interoperability, Reusability) assessment tools and a linked data quality evaluation framework called Luzzu. This evaluation covers the six Cyrus data transparency dimensions for which existing assessment tools could be identified. OKE systems extract structured knowledge from unstructured or semi-structured text in the form of linked data. These systems are fundamental components of advanced knowledge services. However, due to the lack of a transparency framework for OKE, most OKE systems are not transparent, meaning that their processes and outcomes are not understandable and interpretable. A comprehensive framework sheds light on different aspects of transparency, allows the transparency of different systems to be compared by supporting the development of transparency scores, and gives insight into a system's transparency weaknesses and ways to improve them. Automatic transparency evaluation helps with scalability and facilitates transparency assessment. The transparency problem has been identified as critical by the European Union Trustworthy Artificial Intelligence (AI) guidelines. In this paper, Cyrus provides the first comprehensive view of transparency dimensions for OKE systems by merging the perspectives of the FAccT (Fairness, Accountability, and Transparency), FAIR, and linked data quality research communities.
Results: In Cyrus, data transparency includes ten dimensions, which are grouped into two categories. In this paper, six of these dimensions (provenance, interpretability, understandability, licensing, availability and interlinking) have been evaluated automatically for three state-of-the-art OKE systems, using state-of-the-art metrics and tools. Covid-on-the-Web is identified as having the highest mean transparency.
Conclusions: This is the first research to study the transparency of OKE systems that provides a comprehensive set of transparency dimensions spanning ethics, trustworthy AI, and data quality approaches to transparency. It also demonstrates, for the first time, how to perform automated transparency evaluation that combines existing FAIRness and linked data quality assessment tools. We show that state-of-the-art OKE systems vary in the transparency of the linked data they generate and that these differences can be automatically quantified, leading to potential applications in trustworthy AI, compliance, data protection, data governance, and future OKE system design and testing.
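As a rough illustration of how per-dimension results could be turned into a comparable transparency score, the sketch below averages six dimension scores per system and ranks the systems. The unweighted mean, the 0 to 1 scale, the system names and all score values are assumptions for illustration only. They are not the Cyrus metrics or results.

```python
from statistics import mean

# The six dimensions named in the abstract as automatically evaluated.
DIMENSIONS = (
    "provenance",
    "interpretability",
    "understandability",
    "licensing",
    "availability",
    "interlinking",
)


def mean_transparency(scores: dict) -> float:
    """Unweighted mean over the six dimensions (an assumed aggregation)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    return mean(scores[d] for d in DIMENSIONS)


def rank_systems(all_scores: dict) -> list:
    """Return (system, mean score) pairs, highest mean first."""
    ranked = [(name, mean_transparency(s)) for name, s in all_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Invented scores on an assumed 0-1 scale for two hypothetical systems.
EXAMPLE_SCORES = {
    "system_a": {d: 0.5 for d in DIMENSIONS},
    "system_b": {
        "provenance": 1.0,
        "interpretability": 0.5,
        "understandability": 0.5,
        "licensing": 1.0,
        "availability": 1.0,
        "interlinking": 0.5,
    },
}

if __name__ == "__main__":
    for name, score in rank_systems(EXAMPLE_SCORES):
        print(f"{name}: {score:.2f}")
```

A weighted mean or a per-category breakdown would be equally plausible designs. The abstract reports only that systems were compared by mean transparency.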

Articles published on Semi-structured Text

An Intelligent Framework of Equipment Fault Diagnosis Based on Knowledge Graph

Using Large Language Models to Detect Depression From User-Generated Diary Text Data as a Novel Approach in Digital Mental Health Screening: Instrument Validation Study.

Interactive Table Synthesis With Natural Language.

A real-world data analysis of electronic health records to investigate the associations of predominant negative symptoms with healthcare resource utilisation, costs and treatment patterns among patients with schizophrenia

Knowledge graph-derived feed efficiency analysis via pig gut microbiota

A hybrid deep semantic mining method considering fuzzy expressions for the automatic recognition of construction safety hazard information

FedFSA: Hybrid and federated framework for functional status ascertainment across institutions

Extracting named entities from Russian-language documents with different expressiveness of structure

Development and Evaluation of a Natural Language Processing System for Curating a Trans-Thoracic Echocardiogram (TTE) Database.

Applying Natural Language Processing to ClinicalTrials.gov: mRNA cancer vaccine case study.

MMiKG: a knowledge graph-based platform for path mining of microbiota-mental diseases interactions.

Toward AI-supported evaluation for safety control measures against near-miss events in pharmaceutical products

Automatic transparency evaluation for open knowledge extraction systems

A Narrative Inquiry into the Growth of School Counselors with Career Transition Experience

A survey on Relation Extraction

The Medium Is the Definer: Daily Journalism as a Tool for Forming Community: A Case Study—The Ultra-Orthodox Community in Israel

Programming techniques for improving rule readability for rule-based information extraction natural language processing pipelines of unstructured and semi-structured medical texts

Exploring the development of Islamic fintech ecosystem in Indonesia: a text analytics

Application of recommendation system in educational Field

Topic Modeling and Summarization of Texts in the Field of Cybersecurity
