Heterogeneous Data Resources Research Articles

Hierarchical mixed effects models have proven powerful for predicting the genomic merit of livestock and plants from high-density single-nucleotide polymorphism (SNP) marker panels, and their use is increasingly advocated for genomic prediction in human health. Two particularly popular approaches, BayesA and BayesB, specify all SNP-associated effects as independent of each other. BayesB extends BayesA by allowing a large proportion of SNP markers to have null effects. We further extend these two models to specify SNP effects as spatially correlated, owing to the chromosomally proximal effects of causal variants. These two models, which we dub ante-BayesA and ante-BayesB, respectively, are based on a first-order nonstationary antedependence specification between SNP effects. In a simulation study involving 20 replicate data sets, each analyzed at six different SNP marker densities with average LD levels ranging from r² = 0.15 to 0.31, the antedependence methods had significantly (P < 0.01) higher accuracies than their corresponding classical counterparts at higher LD levels (r² > 0.24), with differences exceeding 3%. A cross-validation study was also conducted on the heterogeneous stock mice data resource (http://mus.well.ox.ac.uk/mouse/HS/) using 6-week body weights as the phenotype. The antedependence methods increased cross-validation prediction accuracies by up to 3.6% compared to their classical counterparts (P < 0.001). Finally, we applied our method to other benchmark data sets and demonstrated that the antedependence methods were more accurate than their classical counterparts for genomic predictions, even for individuals several generations beyond the training data.
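
To make the phrase "first-order nonstationary antedependence" concrete, the sketch below simulates SNP effects along a chromosome under the usual first-order recursion, in which each effect depends on its immediate neighbor: g_j = t_j * g_(j-1) + delta_j. The abstract does not give the model equations, so the recursion, the function name, and the parameters `t` and `delta_sd` are assumptions made for illustration, not the authors' implementation. "Nonstationary" is represented here by letting both the coefficient and the innovation standard deviation differ from marker to marker.

```python
import random


def simulate_antedependent_effects(n_snps, t, delta_sd, seed=0):
    """Draw SNP effects with g_j = t[j] * g_{j-1} + delta_j (assumed form).

    t[j]        -- hypothetical antedependence coefficient linking marker j
                   to marker j-1 (t[0] is ignored: the first marker has no
                   predecessor)
    delta_sd[j] -- hypothetical marker-specific innovation std. deviation
    """
    rng = random.Random(seed)
    effects = []
    prev = 0.0
    for j in range(n_snps):
        delta = rng.gauss(0.0, delta_sd[j])
        carried = t[j] * prev if j > 0 else 0.0
        g = carried + delta
        effects.append(g)
        prev = g
    return effects


# Neighboring effects are correlated when the coefficients are nonzero.
correlated = simulate_antedependent_effects(
    n_snps=6, t=[0.0, 0.8, 0.8, 0.2, 0.8, 0.8], delta_sd=[1.0] * 6
)

# With every coefficient set to zero the effects are independent draws,
# which corresponds to the independence assumption of BayesA and BayesB.
independent = simulate_antedependent_effects(
    n_snps=6, t=[0.0] * 6, delta_sd=[1.0] * 6
)
```

Setting all coefficients to zero recovers independent effects, which is why the antedependence models can be read as extensions of their classical counterparts rather than replacements.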

As increasing volumes and varieties of data become available online, the challenges of accessing and using heterogeneous data resources are growing. We have developed a mediator-based data integration system called Cartel for biological oceanography data. A mediation approach is appropriate where a single central warehouse is not desirable, such as when the needed data sources change frequently over time, or when there are advantages to holding heterogeneous data in their native formats. Through Cartel, data sources of various types can be registered with the system, and users can query against simplified virtual schemas without needing to know the underlying schema and computational capabilities of each data source. The system can operate on a variety of relational and geospatial data formats, and can perform joins between formats. We tested the performance of the Cartel mediator in two biological oceanography application areas, and found that the system was able to support the variety of data types needed in a typical ecology study, but that response times were unacceptably slow when very large databases (i.e. Ocean Biogeographic Information System and the World Ocean Atlas) were used. Indexing and caching are currently being added to the system to improve response times. The mediator is an open-source product, and was developed to be a generic, extensible component available to projects developing oceanography data systems.
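
The mediation idea described above (register sources, query a simplified virtual schema, join across formats) can be sketched in a few lines. This is a minimal illustration of the general mediator pattern and not Cartel's actual API: the `Mediator` class, its methods, the field names, and the sample rows are all hypothetical.

```python
class Mediator:
    """Toy mediator: sources keep their native field names, callers use virtual ones."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch, field_map):
        # fetch() returns native rows as dicts; field_map maps each
        # virtual field name to the source's native field name.
        self._sources[name] = (fetch, field_map)

    def query(self, name, fields):
        # Translate virtual field names to native ones at query time.
        fetch, field_map = self._sources[name]
        return [{f: row[field_map[f]] for f in fields} for row in fetch()]

    def join(self, left, right, key, fields_left, fields_right):
        # Hash join on a shared virtual key, independent of native formats.
        index = {}
        for r in self.query(right, [key] + fields_right):
            index.setdefault(r[key], []).append(r)
        joined = []
        for l in self.query(left, [key] + fields_left):
            for r in index.get(l[key], []):
                joined.append({**l, **r})
        return joined


def fetch_catch_table():
    # Stands in for a relational table (hypothetical data).
    return [{"stn_id": 1, "sp": "cod", "n": 12}, {"stn_id": 2, "sp": "hake", "n": 3}]


def fetch_station_layer():
    # Stands in for a geospatial layer (hypothetical data).
    return [{"ID": 1, "lat": 44.5, "lon": -63.2}, {"ID": 2, "lat": 45.1, "lon": -61.0}]


mediator = Mediator()
mediator.register("catch", fetch_catch_table,
                  {"station": "stn_id", "species": "sp", "count": "n"})
mediator.register("stations", fetch_station_layer,
                  {"station": "ID", "latitude": "lat", "longitude": "lon"})

rows = mediator.join("catch", "stations", "station",
                     ["species", "count"], ["latitude", "longitude"])
```

Because this toy version fetches every row from each source on every query, it also hints at why response times suffer on very large databases and why indexing and caching are natural next steps.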

Articles published on Heterogeneous Data Resources

Hypertext configurations: Genres in networked digital media

Retracted: Semantic Information Integration with Linked Data Mashups Approaches

STSM: An Infrastructure for Unifying Steel Knowledge and Discovering New Knowledge

Workshop report: Identifying opportunities for global integration of toxicogenomics databases, 26-27 June 2013, Research Triangle Park, NC, USA.

An Efficient and Scalable Approach for Ontology Instance Matching

Visualizing Information Science Knowledge by Modelling Domain Ontology (OIS)

Querying Uncertain Data in Geospatial Object-relational Databases Using SQL and Fuzzy Sets

Study on Data Warehouse Based Equipment Support Data Management

Research on Heterogeneous Data resource Management Model in Cloud Environment

Hybrid Ground Data Model for Interacting Simulations in Mechanized Tunneling

Metadata-based Information Resource Integration for Research Management

Applying knowledge-anchored hypothesis discovery methods to advance clinical and translational research: the OAMiner project

A Bayesian Antedependence Model for Whole Genome Prediction

Multi Dimension Knowledge Mining in Heterogeneous Data Resources

ConsensusPathDB: toward a more complete picture of cell biology

Achieving Interoperation of Grid Data Resources via Workflow Level Integration

Bringing together an ocean of information: An extensible data integration framework for biological oceanography

SDS: A Scalable Data Services System in Data Grid
