Domain Expert Knowledge Research Articles

The Biodiversity Digital Twin (BioDT) project (2022-2025) aims to create prototypes that integrate various data sets, models, and expert domain knowledge enabling prediction capabilities and decision-making support for critical issues in biodiversity dynamics. While digital twin concepts have been applied in industries for continuous monitoring of physical phenomena, their application in biodiversity and environmental sciences presents novel challenges (Bauer et al. 2021, de Koning et al. 2023). In addition, successfully developing digital twins for biodiversity requires addressing interoperability challenges in data standards. BioDT is developing prototype digital twins based on use cases that span various data complexities, from point occurrence data to bioacoustics, covering nationwide forest states to specific communities and individual species. The project relies on FAIR principles (Findable, Accessible, Interoperable, and Reusable) and FAIR enabling resources like standards and vocabularies (Schultes et al. 2020) to enable the exchange, sharing, and reuse of biodiversity information, fostering collaboration among participating research infrastructures (DiSSCo, eLTER, GBIF, and LifeWatch) and data providers. It also involves creating a harmonised abstraction layer using Persistent Identifiers (PID) and FAIR Digital Object (FDO) records, alongside semantic mapping and crosswalk techniques to provide machine-actionable metadata (Schultes and Wittenburg 2019, Schwardmann 2020). Governance and engagement with research infrastructure stakeholders play crucial roles in this regard, with a focus on aligning technical and data standards discussions. In addition to data, models and workflows are key elements in BioDT. Models in the BioDT context are formal representations of problems or processes, implemented through equations, algorithms, or a combination of both, which can be executed by machine entities. 
The current twin prototypes are considering both statistical and mechanistic models, introducing significant variations in (1) data requirements, (2) modelling approaches and philosophy, and (3) model output. The BioDT consortium will develop guidelines and protocols for how to describe these models, what metadata to include, and how they will interact with the diverse datasets. While discussions on this topic exist within the broader context of biodiversity and ecological sciences (Jeltsch et al. 2013, Fer et al. 2020), the BioDT project is strongly committed to finding a solution within its scope. In the twinning context, data and models need to be executed within a computing infrastructure and also need to adhere to FAIR principles. Software within BioDT includes a suite of tools that facilitate data acquisition, storage, processing, and analysis. While some of these tools already exist, the challenge lies in integrating them within the digital twinning framework. One approach to achieving integration is through workflow representation, encompassing standardised procedures and protocols that guide the acquisition, packaging, processing, and analysis of data. The project is exploring Research Object Crate (RO-Crate) implementation for this (Soiland-Reyes et al. 2022). Implementing workflows can ensure reproducibility, scalability, and transparency in research practices, enabling scientists to validate and replicate findings. The BioDT project offers a novel and transformative approach to biodiversity research and application. By leveraging collaborative research infrastructures and adhering to data standards, BioDT aims to harness the power of data, software, supercomputers, models, and expertise to provide new insights. 
The foundation provided by the data standards, including those of Biodiversity Information Standards (TDWG), is crucial in realising the full potential of digital twins, facilitating the seamless integration of diverse data sources and combinations with models.
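As a rough illustration of the RO-Crate packaging mentioned above, the following minimal sketch builds an `ro-crate-metadata.json` document with the Python standard library. The layout follows the public RO-Crate 1.1 conventions (a metadata descriptor plus a root `Dataset` entity). The helper name `build_ro_crate_metadata`, the file paths, and every descriptive value are hypothetical placeholders, not BioDT artefacts or the project's actual implementation.

```python
import json


def build_ro_crate_metadata(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 metadata document as a dict.

    `files` is a list of (relative_path, description) tuples. All values
    used in this sketch are illustrative assumptions.
    """
    graph = [
        {
            # Metadata descriptor: points at the root dataset of the crate.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root data entity describing the packaged workflow inputs/outputs.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path, _ in files],
        },
    ]
    # One File entity per packaged file.
    for path, file_description in files:
        graph.append(
            {"@id": path, "@type": "File", "description": file_description}
        )
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Hypothetical example content, for illustration only.
crate = build_ro_crate_metadata(
    name="Example species occurrence run",
    description="Placeholder crate bundling input data and a model script.",
    date_published="2024-01-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[
        ("data/occurrences.csv", "Point occurrence records (placeholder)"),
        ("model/run_model.py", "Model entry point (placeholder)"),
    ],
)
print(json.dumps(crate, indent=2))
```

Keeping the metadata as plain JSON-LD means the same description can be read by people and by machines, which is the property the FAIR discussion above relies on.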


Refinery industrial processes are highly complex, with nonlinear dynamics resulting from varying feedstock characteristics and from changes in product prioritization. Along these processes, key properties of intermediate compounds must be monitored and controlled, since they directly affect the quality of the end products these manufacturers sell. However, most of these properties can only be measured through time-consuming and expensive laboratory analysis, which cannot be performed at the high frequency needed to monitor them properly. For this reason, developing soft sensors is the most common way to obtain high-frequency estimates of these measurements, helping advanced control systems establish the correct setpoints for temperatures, pressures, and other sensors along the refining process and thereby control the quality of end products. Since labeled data is scarce, most academic research has focused on semi-supervised learning strategies to develop machine learning (ML) models as soft sensors. Our research takes a different direction. We propose a framework that leverages the knowledge of domain experts and employs data augmentation techniques to build an enhanced, fully labeled dataset that can be fed to any supervised ML algorithm to generate a quality soft sensor. We applied our framework together with Automated ML to train a model that predicts a specific key property associated with the production of Naphtha compounds in a refinery: the ASTM 95% distillation temperature of the Heavy Naphtha. Although our framework is model agnostic, we opted to use Automated ML for the optimization strategy, since it applies a diverse set of models to the dataset, reducing the bias of relying on a single optimization algorithm.
We evaluated the proposed framework on a case study carried out in an industrial refinery in Brazil, where the previous production model for estimating the ASTM 95% distillation temperature of the Heavy Naphtha was based entirely on physicochemical knowledge of the process. By adopting our framework with Automated ML, we improved the R² score by 120%. The resulting ML model is currently operating in real time inside the refinery, leading to significant economic gains.
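To make the reported metric concrete, the sketch below computes the coefficient of determination (R²) from its standard definition and shows what a 120% gain would mean if the figure is read as a relative change, which the abstract does not state explicitly. The function names and the numeric scores are illustrative assumptions, not values from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(old_score, new_score):
    """Percentage change of new_score relative to old_score."""
    return (new_score - old_score) / abs(old_score) * 100.0


# Made-up scores: under the relative reading, a 120% improvement
# means the new score is 2.2 times the old one.
baseline_r2 = 0.30
improved_r2 = 0.66
gain_percent = relative_improvement(baseline_r2, improved_r2)
print(f"Relative R² improvement: {gain_percent:.1f}%")
```

Because R² is bounded above by 1, a relative gain of this size is only possible when the baseline score is below roughly 0.45.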


Articles published on Domain Expert Knowledge

Paying attention to astronomical transients: introducing the time-series transformer for photometric classification

Trace encoding in process mining: A survey and benchmarking

Data Standards and Interoperability Challenges for Biodiversity Digital Twin: A novel and transformative approach to biodiversity research and application

A label machine for mechanical systems: Discovering operating states with unsupervised learning from load time series

Real-Time Energy Management for Marine Applications Using Markov Approximation

Evaluating the Use of Graph Neural Networks and Transfer Learning for Oral Bioavailability Prediction.

Breast cancer classification using deep learned features boosted with handcrafted features

A data fusion approach with mobile phone data for updating travel survey-based mode split estimates

Core–shell clustering approach for detection and analysis of coastal upwelling

A framework for enhancing industrial soft sensor learning models

Fault detection and diagnostics in the context of sparse multimodal data and expert knowledge assistance: Application to hydrogenerators

Towards improving prediction accuracy and user-level explainability using deep learning and knowledge graphs: A study on cassava disease

Knowledge-Graph- and GCN-Based Domain Chinese Long Text Classification Method

A Study on Factors Associated with Child Sexual Abuse and Recognizing the Severity: Special Reference to Galle District

Beyond Low-Pass Filtering: Graph Convolutional Networks With Automatic Filtering

Comparison of two artificial intelligence-augmented ECG approaches: Machine learning and deep learning.

RoSGAS: Adaptive Social Bot Detection with Reinforced Self-supervised GNN Architecture Search

Collaboration between instructional designers and subject matter experts in digital transformation projects

Systematic reviews as a metaknowledge tool: caveats and a review of available options

Wearable-Based Intelligent Emotion Monitoring in Older Adults during Daily Life Activities
