Maintain Model Accuracy Research Articles

Abstract The reproducibility of results obtained from RNA data across labs is a major hurdle in cancer research. Differences in library preparation methods and gene expression quantification platforms prevent the application of trained models to new data across labs. SpinAdapt is a novel unsupervised domain adaptation algorithm that enables the transfer of existing molecular models across labs and technological platforms, without requiring re-training or calibration of existing models for future prospective data. Furthermore, SpinAdapt calculates data corrections from summary statistics (independent latent space representations) rather than requiring full data access. This allows molecular models to be transferred across sequencing platforms and between labs without loss of data ownership or compromise of data privacy. To evaluate SpinAdapt, we performed two sets of experiments. A) We transferred molecular tumor subtype classifiers across four pairs of publicly available cancer datasets (bladder, breast, colorectal, pancreatic), covering 4,076 samples across 18 different tumor subtypes and three technological platforms (RNASeq, Affymetrix U133plus2, and HE1ST). For each pair of datasets, we trained a subtype classifier on one dataset (target) according to well-accepted subtyping annotations (Zea Tan et al. 2019; Prat et al. 2012; Guinney et al. 2015; Bailey et al. 2016), and then evaluated the classifier accuracy on the other dataset (source). For each tumor subtype, we quantified the classification performance using the mean AUC score across random subsets of the source dataset, where each subset was corrected using SpinAdapt. We aggregated performance across all subtypes and report the average mean AUC score for each cancer type (bladder 0.95, breast 0.98, colorectal 0.98, pancreatic 0.96), demonstrating high accuracy on all diagnostic tasks.
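The evaluation protocol of experiment A (mean AUC over random subsets of the source dataset) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names `auc` and `mean_auc_over_subsets`, the subset size, the number of subsets, and the seed are all assumptions, labels are taken to be binary (one subtype versus the rest), and the SpinAdapt correction that the abstract applies to each subset is not reproduced here.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_over_subsets(scores, labels, n_subsets=10, subset_size=20, seed=0):
    """Average the AUC over random subsets drawn without replacement.

    Hypothetical helper: subsets that contain a single class are skipped,
    which is a simplification chosen for this sketch.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_subsets):
        idx = rng.sample(range(len(scores)), subset_size)
        sub_labels = [labels[i] for i in idx]
        if len(set(sub_labels)) < 2:
            continue  # AUC is undefined without both classes
        aucs.append(auc([scores[i] for i in idx], sub_labels))
    if not aucs:
        raise ValueError("no subset contained both classes")
    return sum(aucs) / len(aucs)
```

In a multi-subtype setting, one would presumably call `mean_auc_over_subsets` once per subtype and then average the results per cancer type, as the abstract describes.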
B) To demonstrate the transferability of prognostic models, we trained five Cox survival models on five target cancer datasets respectively (breast, lung, colorectal, liver, pancreatic) ranging from 186 to 2,919 RNASeq samples. We used SpinAdapt to adapt five source cancer datasets to the target datasets, ranging from 226 to 1,038 samples across different platforms (RNASeq, Affymetrix U133Plus2 and HG-U133A, Illumina HumanHT-12v4). For every cancer type, we trained a Cox model on the target dataset, and measured its performance by predicting survival risk on the corresponding adapted source dataset. We show high survival prediction accuracy for all datasets (Log-rank P-values and c-index): lung [1e-6, 01.661], breast [5e-5, 0.626], liver [1e-4, 0.708], pancreatic [2e-4, 0.629], colorectal [9e-4, 0.661]. SpinAdapt transferred diagnostic and prognostic models over 14 cancer datasets covering 7,146 samples across six different cancer types and various platforms (RNASeq, microarray), while maintaining model accuracy and statistical significance.

Citation Format: Talal Ahmed, Stephane Wenric, Mark Carty, Rafael Pelossof. Transferring diagnostic and prognostic molecular models across technological platforms [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 242.
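The c-index reported for experiment B measures how often a model ranks patients' risks in the same order as their observed survival. A minimal sketch of Harrell's concordance index in plain Python is given below. The function name `concordance_index` and its argument layout are assumptions made for illustration, pairs with equal survival times are simply skipped (a simplification), and this is not the authors' evaluation code.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means shorter expected survival)

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter time than subject j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk died earlier: correct order
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a valid c-index always lies between 0 and 1.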

In this study, we established a more efficient approach to fractured shale reservoir modeling, with an emphasis on simplifying and automating the workflow for assisted history matching and uncertainty quantification. The improvement is especially notable for history matching, since the fracture geometry and properties can be set directly as parameters to be history matched. The resulting approach not only significantly reduces computational time while maintaining model accuracy, but also provides an automatic method for modifying the fracture-related parameters, a laborious process in the traditional workflow.

In the forward reservoir model, we implemented the Embedded Discrete Fracture Model (Embedded DFM) approach for fractures with arbitrary strike and dip angle and extended it to a multiple porosity/permeability setting. The fractures are naturally discretized by the boundaries of the parent matrix grid blocks. Control volumes of fracture segments are generated according to the specific geometry of each segment. Three types of non-neighbor connections are then generated: the connection between a fracture segment and its parent matrix grid block, the connection between two intersecting fracture segments from different fractures, and the connection between two neighboring fracture segments from the same fracture. For each non-neighbor connection, the transmissibility can be calculated honoring the physics of the flow.

In our approach, the Embedded Discrete Fracture Multiple-Porosity Model, the matrix is subdivided into three porosity types, namely organic matrix (kerogen), inorganic matrix and natural fractures, with the necessary physics included for each porosity type. The macro fractures are explicitly represented with Embedded DFM. The proposed model provides a coherent method for characterizing the organic matrix, inorganic matrix, micro fractures and hydraulic fractures of shale reservoirs. It offers a computationally efficient approach for modeling the severe heterogeneity due to hydraulic and natural fractures. Compared with traditional discrete fracture models, fewer grid blocks and lower levels of refinement are required. Compared with the multiple porosity method, the proposed model has desirable accuracy for the simulation of reservoirs with large-scale fractures.

In the history matching and uncertainty quantification stage, because the traditional Markov Chain Monte Carlo (MCMC) method is inefficient when applied to reservoir history matching, a two-stage MCMC algorithm is employed to evaluate the uncertainty of all the parameters. Since no upscaling of the fracture-related parameters is required, the reservoir model can be generated by a pre-processor based on the proposed parameters, which maintains the adequacy of a Gaussian distribution assumption. Therefore, the workflow can be completely automated. By incorporating Embedded DFM and multiple porosity/permeability approaches, the improved model facilitates the history matching of fractured shale reservoirs by cutting the total number of grid blocks, reducing the complexity of the gridding process, and improving the accuracy of fluid transport within and among different porosity types.
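The abstract does not spell out its two-stage MCMC scheme. One common form is a delayed-acceptance Metropolis-Hastings sampler, in which a cheap approximate posterior screens each proposal before the expensive model is run. The sketch below illustrates that idea for a single scalar parameter with a symmetric random-walk proposal. Everything in it is an assumption made for illustration: the function name `two_stage_mcmc`, the callables `log_post_fine` and `log_post_coarse` (stand-ins for a full reservoir simulation and a cheaper surrogate), and the step size. It is not the authors' implementation.

```python
import math
import random


def two_stage_mcmc(log_post_fine, log_post_coarse, x0, n_steps, step=0.5, seed=0):
    """Delayed-acceptance Metropolis-Hastings with a symmetric proposal.

    Stage 1 accepts or rejects a proposal using only the cheap coarse
    posterior. Stage 2 runs the expensive fine posterior for survivors and
    corrects for the stage-1 approximation, so the chain targets the fine
    posterior.
    Returns the chain and the number of fine-model evaluations.
    """
    rng = random.Random(seed)
    x = x0
    lf_x, lc_x = log_post_fine(x), log_post_coarse(x)
    chain = [x]
    fine_calls = 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        lc_y = log_post_coarse(y)
        # Stage 1: screen with the coarse model (1 - random() lies in (0, 1]).
        if math.log(1.0 - rng.random()) < lc_y - lc_x:
            lf_y = log_post_fine(y)
            fine_calls += 1
            # Stage 2: fine-model ratio divided by the coarse-model ratio.
            if math.log(1.0 - rng.random()) < (lf_y - lf_x) - (lc_y - lc_x):
                x, lf_x, lc_x = y, lf_y, lc_y
        chain.append(x)
    return chain, fine_calls
```

Proposals rejected at stage 1 never trigger a fine-model evaluation, which is where the computational saving would come from when the fine model is a full reservoir simulation.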

Articles published on Maintain Model Accuracy

1xN Pattern for Pruning Convolutional Neural Networks.

Abstract 242: Transferring diagnostic and prognostic molecular models across technological platforms

Prediction of regional wildfire activity in the probabilistic Bayesian framework of Firelihood.

On Multi-Event Co-Calibration of Dynamic Model Parameters Using Soft Actor-Critic

Hierarchical Fuzzy Neural Networks With Privacy Preservation for Heterogeneous Big Data

An Improved NO Prediction Model for Large Eddy Simulation of Turbulent Combustion

Model Optimization for the Prediction of Red Wine Phenolic Compounds Using Ultraviolet-Visible Spectra.

Integrating deep learning models and multiparametric programming

Adaptive selective catalytic reduction model development using typical operating data in coal-fired power plants

Application of A Simple Landsat-MODIS Fusion Model to Estimate Evapotranspiration over A Heterogeneous Sparse Vegetation Region

Fast and accurate district heating and cooling energy demand and load calculations using reduced-order modelling

Variable selection using Gaussian process regression-based metrics for high-dimensional model approximation with limited data

Solution of the Nonlinear High-Fidelity Generalized Method of Cells Micromechanics Relations via Order-Reduction Techniques

On the current state of flotation modelling for process control

An efficient method for fractured shale reservoir history matching: The embedded discrete fracture multi-continuum approach

StochasticNet in StochasticNet

Partial turbulence simulation method for predicting peak wind loads on small structures and building appurtenances

Multiscale multiphysics and multidomain models--flexibility and rigidity.

Reduced-order discrete element method modeling

Use of upscaled elevation and surface roughness data in two-dimensional surface water models
