Abstract

The Neuroimaging Data Model (NIDM) is a series of specifications for describing all aspects of the neuroimaging data lifecycle from raw data to analyses and provenance. NIDM uses community-driven terminologies along with unambiguous data dictionaries within a Resource Description Framework (RDF) document to describe data and metadata for integration and query. Data from different studies, using locally defined variable names, can be retrieved by linking them to higher-order concepts from established ontologies and terminologies. Through these capabilities, NIDM documents are expected to improve reproducibility and facilitate data discovery and reuse. PyNIDM is a Python toolbox supporting the creation, manipulation, and querying of NIDM documents. Using the query tools available in PyNIDM, users are able interrogate datasets to find studies that have collected variables measuring similar phenotypic properties. This, in turn, facilitates the transformation and combination of data across multiple studies. The focus of this manuscript is the linear regression tool which is a part of the PyNIDM toolbox and works directly on NIDM documents. It provides a high-level statistical analysis that aids researchers in gaining more insight into the data that they are considering combining across studies. This saves researchers valuable time and effort while showing potential relationships between variables. The linear regression tool operates through a command-line interface integrated with the other tools (pynidm linear-regression) and provides the user with the opportunity to specify variables of interest using the rich query techniques available for NIDM documents and then conduct a linear regression with optional contrast and regularization.

Highlights

  • The Neuroimaging Data Model (NIDM) (Keator et al 2013; NIDM Working Group; Maumet et al 2016) (Neuroimaging Data Model, RRID:SCR_013667) was started by an international team of volunteers to create specifications for describing all aspects of the neuroimaging data lifecycle

  • These graphs can be serialized into a variety of text-based formats (NIDM documents), and with the capabilities of the semantic web, can be used to link datasets together through annotations with terms from formal terminologies, complete data dictionaries of study variables, and linkage of study variables to broader concepts

  • Implementation and use cases The linear regression tool, nidm_linreg, uses the PyNIDM query functionality to aggregate data in NIDM documents serialized using the standard Terse Resource Description Framework (RDF) Triple Language (TURTLE) (“RDF 1.1 Turtle”), a common semantic-web serialization format that is both structured for ease of use with computers and relatively easy for humans to read

Read more

Summary

Introduction

Background The Neuroimaging Data Model (NIDM) (Keator et al 2013; NIDM Working Group; Maumet et al 2016) (Neuroimaging Data Model, RRID:SCR_013667) was started by an international team of volunteers to create specifications for describing all aspects of the neuroimaging data lifecycle. Using sematic web techniques (“Semantic Web - W3C”), these specifications were envisioned to capture information on all aspects of the neuroimaging data lifecycle, producing graphs linking each result’s artifact with the workflow that produced it and the data used in the computation These graphs can be serialized into a variety of text-based formats (NIDM documents), and with the capabilities of the semantic web, can be used to link datasets together through annotations with terms from formal terminologies, complete data dictionaries of study variables, and linkage of study variables to broader concepts. The algorithm has the ability to query for specific variables or across similar variables from different studies using concept annotations on the variables It provides the user with the ability to construct arbitrary linear models on those data, supporting interactions between variables, contrasts of learned parameter sets, and L1 and L2 regularization (Nagpal 2017).

Methods
Conclusions
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.