Privacy Regulations Research Articles

Abstract A vast amount of clinical data is still stored in unstructured text. Automatic extraction of medical information from these data poses several challenges: high costs of clinical expertise, restricted computational resources, strict privacy regulations, and limited interpretability of model predictions. Recent domain adaptation and prompting methods using lightweight masked language models have shown promising results with minimal training data and allow for the application of well-established interpretability methods. We are the first to present a systematic evaluation of advanced domain-adaptation and prompting methods in a lower-resource medical domain task, performing multi-class section classification on German doctor’s letters. We evaluate a variety of models, model sizes, (further-)pretraining and task settings, and conduct extensive class-wise evaluations supported by Shapley values to validate the quality of small-scale training data and to ensure interpretability of model predictions. We show that in few-shot learning scenarios, a lightweight, domain-adapted pretrained language model, prompted with just 20 shots per section class, outperforms a traditional classification model, increasing accuracy from $48.6\%$ to $79.1\%$. By using Shapley values for model selection and training data optimization, we could further increase accuracy up to $84.3\%$. Our analyses reveal that pretraining of masked language models on general-language data is important to support successful domain transfer to medical language, so that further-pretraining of general-language models on domain-specific documents can outperform models pretrained on domain-specific data only. Our evaluations show that prompting based on general-language pretrained masked language models, combined with further-pretraining on medical-domain data, achieves significant improvements in accuracy over traditional models with minimal training data. Further performance improvements and interpretability of results can be achieved using interpretability methods such as Shapley values. Our findings highlight the feasibility of deploying powerful machine learning methods in clinical settings and can serve as a process-oriented guideline for lower-resource languages and domains such as clinical information extraction projects.


Abstract Consistency issues limit the sharing of horticultural data across multiple systems, making it difficult for users to analyze data effectively across those systems with artificial intelligence technology. Introducing data governance principles can help standardize and unify data practices, making it easier for analysts to locate, comprehend, transfer and integrate data from diverse sources to enable data-driven horticulture. Implementing data governance and principles specific to horticulture can assist in standardizing the layout and format of data structures from different sources. This study proposes a new governance framework, Horti-IoT, based on the Data Management Body of Knowledge and several structured frameworks for Internet of Things (IoT) governance, that will lead to data-driven horticulture. This study is empirical in nature. Dutch horticulture stakeholders are involved in this initiative, providing the data, knowledge, and experiences needed for this study. The data stream from various sources, including camera images, sap flow sensors, climate sensors and manually measured growth data. The key findings following the implementation of the Horti-IoT framework’s principles are reduced workload for data analysts, more efficient plant monitoring, time savings in pre-processing, enhanced water resource management, fewer contacts with system administrators, and compliance with the General Data Protection Regulation. The newly proposed Horti-IoT framework, compatible with Dutch horticulture, is presented. The data were obtained from the Lab greenhouse at the World Horti Centre in the Netherlands, in the framework of the Regionale SIA RAAK MKB call (March 2022-September 2024) subsidy funds for the project titled ‘Gewasgroei Goed Gemeten’ (GeGoGe). This project is a collaboration between three educational institutions (Inholland University of Applied Science, the Hague University of Applied Science, and Lentiz Vocational School) and stakeholders.


Articles published on Privacy Regulations

Fractals as Pre-Training Datasets for Anomaly Detection and Localization

Data Privacy Concerns and their Impact on Consumer Trust in Digital Marketing

Integrating Blockchain and Homomorphic Encryption to Enhance Security and Privacy in Project Management and Combat Counterfeit Goods in Global Supply Chain Operations

Recent Advancements in Federated Learning: State of the Art, Fundamentals, Principles, IoT Applications and Future Trends

Federated learning with tensor networks: a quantum AI framework for healthcare

Multidimensional Impact Analysis of Public Policies on the European Economy: Taking EU Member States as an Example

Evaluating the Role of Data Privacy Regulations in Secure Software Development Life Cycles (SDLC)

International Community in the Global Digital Economy: A Case Study on the African Digital Trade Framework

Leveraging Federated Learning for Privacy-Preserving Analysis of Multi-Institutional Electronic Health Records in Rare Disease Research

Synthetic data generation with hybrid quantum-classical models for the financial sector

Clinical information extraction for lower-resource languages and domains with few-shot learning using pretrained language models and prompting

How does the consent model in Australia’s Privacy Act 1988 (Cth) undermine our human right to privacy?

Horti-IoT: a new data framework for Dutch horticulture

A Review on Robust Credit Card Fraud Detection System Leveraging Big Data and Machine Learning

Speaking in Private: Privacy Expectations Depend on Communication Modality

Efficient Credit Card Fraud Detection System Using Big Data and Machine Learning

The potential of federated learning for self-configuring medical object detection in heterogeneous data distributions

Lobbying global venues: Sitting in or speaking out?

Conversation-Related Advertising and Electronic Eavesdropping: Mapping Perceptions of Phones Listening for Advertising in the United States, the Netherlands, and Poland

Johnny Still Can't Opt-out: Assessing the IAB CCPA Compliance Framework
