Robust Data Model Research Articles

Abstract: The objective of the National Cancer Institute's Proteomic Data Commons (PDC) is to make cancer-related proteomic datasets accessible to the public. The PDC provides the cancer research community with a unified data repository that enables data sharing across cancer proteomic studies and also enables multi-omic integration in support of precision medicine. As a domain-specific repository within the Cancer Research Data Commons (CRDC), the vision for the PDC is to provide researchers the ability to find and analyze proteomic data across a wide variety of tumor types. Currently, the PDC houses data, supported by a large collection of metadata attributes, for nearly 40 datasets from over 12 cancer types produced by several large-scale cancer research programs, each with cohort sizes greater than 100 patients. The PDC facilitates the analysis of proteomic, genomic, and imaging data derived from the same tumor. Most of the datasets in the PDC also have corresponding genomic and imaging data available in the Genomic Data Commons and The Cancer Imaging Archive, respectively. Researchers can discover which genomic variants are detectable at the protein level or better understand associations between gene expression, copy number variation, and protein abundance. The resource is currently available to the public in beta phase (https://pdc.esacinc.com) and will be officially launched on the cancer.gov domain in March 2020. The PDC data portal is supported by a robust and extensible data model and provides user-friendly exploration, visualization, and data analysis. This allows researchers to search for and visualize expression of proteins (through their mapped genes) across all studies, analyze protein abundance for all cases in a study through heatmaps, build and explore pan-cancer cohorts using highly curated clinical metadata, and comprehensively view a study without needing to download the data.
The PDC provides quick access to mapping of peptide identities and quantities on the human genome as well as protein databases containing patient/tumor-specific variants and novel splicing events. It also enables fast, accurate, and convenient proteomic validation of novel genomic alterations through the PepQuery algorithm. Through a highly versatile application programming interface (API), the PDC allows users to interact with data programmatically and facilitates integration with data from other resources in their scripts for multi-omic analysis. Big data interoperability is critical for progress in precision medicine. The PDC is designed to interoperate with other resources including the CRDC nodes, allowing users to analyze PDC data with the tools and pipelines available on the NCI cloud resources. It further allows users to use their own tools to co-analyze genomic and proteomic data available from a common sample on the Amazon Web Services (AWS) platform or on a local system. The presentation will provide an overview of the PDC and its available datasets, as well as a discussion of how it facilitates multi-omic data analyses.
Citation Format: Ratna Rajesh Thangudu, Paul A. Rudnick, Michael Holck, Deepak Singhal, Michael J. MacCoss, Nathan J. Edwards, Karen A. Ketchum, Christopher R. Kinsinger, Erika Kim, Anand Basu. Proteomic Data Commons: A resource for proteogenomic analysis [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr LB-242.


Background: Both air pollution and airborne pollen can cause respiratory health problems. Since both are often jointly present in ambient air, it is important to control for one while estimating the effect of the other when considering pollution-abating policies. To date only a limited number of studies have considered the health effects of both irritants jointly for a general population, and for a sufficiently long time period to allow for variation in seasonal concentrations of both components. The primary goal of this study is to determine the causal impact of fine particulate matter (PM2.5) on hospital visits and related treatment costs, while controlling for potentially confounding pollen effects. Our study area is the metropolitan hub of Reno/Sparks in Northern Nevada.
Methods: Taking advantage of a rare sample of daily pollen counts over a prolonged period of time (2009–2015), we model the effects of PM2.5 and pollen on respiratory-related hospital admissions for the population at large, plus specific age groups. Pollen data are provided by a local allergy clinic. Data on PM2.5 and other air pollutants are obtained from the U.S. Environmental Protection Agency's air quality data web site. We collect daily meteorological data from the National Centers for Environmental Information's data repository. Data on hospital admissions are given by the Nevada Center for Surveys, Evaluations, and Statistics. Our econometric approach centers on a fully robust count data (Poisson) model, estimated via Quasi-Maximum Likelihood.
Results: We find that for our sample PM2.5 effects are largely robust to the inclusion of both pollen counts and temporal indicators. In contrast, pollen effects vanish when time fixed effects are added, pointing to their correlation with unobserved temporal confounders. At the same time, model fit improves with the inclusion of temporal indicators.
Based on our preferred specification, we find a significant PM2.5 effect of approximately 0.5% additional hospital visits per day due to a one μg/m³ increase in PM2.5. This translates into expected additional treatment costs of $2700 per day for the same unit change in PM2.5. These figures can mount quickly when more pronounced and/or longer episodes of particulate matter pollution are considered, perhaps due to wildfire smoke. For instance, the expected increase in patients and costs due to a month-long 10-unit jump in PM2.5 over the long-run annual average would amount to an extra 70 patients and approximately $680,000 in additional treatment costs.
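To illustrate how a per-unit percentage effect from a Poisson count model scales to a larger change in the regressor, here is a minimal Python sketch. It assumes the usual exponential (log-link) conditional mean, which the abstract does not state explicitly, and it backs out an illustrative coefficient from the reported figure of roughly 0.5% per one-unit increase. The names `percent_change` and `beta_pm25` are hypothetical, and the sketch does not attempt to reproduce the abstract's patient or cost totals, which depend on baseline values not given here.

```python
import math


def percent_change(beta: float, delta: float) -> float:
    """Percent change in the expected count for a `delta`-unit change in a
    regressor, assuming a mean of the form exp(x * beta)."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Assumption: coefficient implied by the reported ~0.5% effect per unit of PM2.5.
beta_pm25 = math.log(1.005)

one_unit = percent_change(beta_pm25, 1.0)    # effect of a 1-unit increase
ten_units = percent_change(beta_pm25, 10.0)  # effect of a 10-unit increase

print(f"1-unit increase:  {one_unit:.2f}% more expected visits")
print(f"10-unit increase: {ten_units:.2f}% more expected visits")
```

Under this assumption the effect compounds multiplicatively, so a 10-unit increase yields slightly more than ten times the one-unit percentage effect.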



Articles published on Robust Data Model

Digitalization and Dynamic Criticality Analysis for Railway Asset Management

Edge–cloud collaborative estimation lithium-ion battery SOH based on MEWOA-VMD and Transformer

The Digital Divide, Wealth, and Inequality: An Examination of Socio-Economic Determinants of Collaborative Environmental Governance in Thailand through Provincial-Level Panel Data Analysis

An information entropy-based fuzzy stochastic configuration network for robust data modeling

Enhancing municipal solid waste leachate treatment efficiency: AI-based prediction of electrocoagulation/flocculation recovery using iron electrodes

Exploring the Influence of the Digital Economy on Energy, Economic, and Environmental Resilience: A Multinational Study across Varied Carbon Emission Groups

Redefining governance: a critical analysis of sustainability transformation in e-governance

Improvement of pasture biomass modelling using high-resolution satellite imagery and machine learning

The Morais Dictionary: Following Best Practices in a Retro-digitized Dictionary Project

Prediction of X-ray fluorescence copper grade using regularized stochastic configuration networks

Flexible, robust and minimal-overhead Event Data Model for track reconstruction in ACTS

A robust transfer deep stochastic configuration network for industrial data modeling

Tropical cyclone frequency: turning paleoclimate into projections

Robust Dynamic Space-Time Panel Data Models Using ε-Contamination: an Application to Crop Yields and Climate Change

Effects of Urban Producer Service Industry Agglomeration on Export Technological Complexity of Manufacturing in China

Abstract LB-242: Proteomic Data Commons: A resource for proteogenomic analysis

Respiratory illness, hospital visits, and health costs: Is it air pollution or pollen?

Spectral and spatial kernel water quality mapping

A Novel Hybrid Machine Learning Algorithm for Limited and Big Data Modeling With Application in Industry 4.0

The Political Economy of Capital Flight: Governance Quality and Capital Flight in the East Africa Community
