Abstract

Objectives
In several disciplines, such as biomedicine and the social sciences, analysing individual-level data or co-analysing data from different studies requires those data to be pooled and shared. However, sharing and combining sensitive individual-level data is often prohibited by ethico-legal constraints and hindered by other barriers, such as the need to maintain control over the data and very large sample sizes. Graphical display of microdata is often forbidden as well, because it can reveal sensitive, identifying information. For example, a standard scatterplot is disclosive because it shows the exact values of two measurements for every individual.

Approach
DataSHIELD (www.datashield.ac.uk) is a novel approach that allows the analysis of sensitive individual-level data, and the simultaneous co-analysis of such data from several studies, without physically pooling the data.

Results
DataSHIELD provides a range of functions that support data analysis with different statistical techniques. Among them are graphical functions that illustrate the statistical properties of variables and the relationships between them. We give an overview of the graphical functions in DataSHIELD (ds.histogram, ds.heatmapPlot, ds.contourPlot) and demonstrate several new functions, including ds.scatterPlot and ds.boxPlot, which build on computational approaches such as the k-Nearest Neighbours algorithm to ensure privacy-protected analysis.

Conclusion
The DataSHIELD graphical functions represent the relationships between variables in a way that preserves their statistical properties while protecting data privacy. These graphical approaches can be used or extended in areas where confidentiality and information sensitivity matter, for example longitudinal data and survival analysis, epidemiological studies, and geospatial analysis, among others.
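To make the k-Nearest Neighbours idea mentioned in the Results more concrete, the following is a minimal Python sketch of one way such masking could work: each point is replaced by the centroid of itself and its k-1 nearest neighbours, so that no plotted coordinate equals an individual's exact pair of measurements. The function name `knn_centroids`, the default `k=3`, and the use of unscaled Euclidean distance are assumptions made for illustration; this is not DataSHIELD code and does not claim to reproduce how ds.scatterPlot is implemented.

```python
import math


def knn_centroids(points, k=3):
    """Return one masked point per input point.

    Each masked point is the centroid of the k input points closest to the
    original one (the point itself is at distance 0, so it counts as one of
    the k). Hypothetical helper for illustration only.
    """
    if k < 2 or k > len(points):
        raise ValueError("k must be between 2 and the number of points")
    masked = []
    for p in points:
        # Take the k points closest to p in Euclidean distance.
        nearest = sorted(points, key=lambda q: math.dist(p, q))[:k]
        cx = sum(q[0] for q in nearest) / k
        cy = sum(q[1] for q in nearest) / k
        masked.append((cx, cy))
    return masked


if __name__ == "__main__":
    # Toy data: two small clusters of three individuals each.
    data = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    for original, hidden in zip(data, knn_centroids(data, k=3)):
        print(original, "->", hidden)
```

In this toy example the masked points keep the overall shape of the data (two clusters) while hiding the exact individual values. A larger k would typically give stronger masking at the cost of more distortion of the plotted relationship.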



Demetris1*, Gaye, Amadou1, Isaeva, Julia2, Burton, Thomas1, Wilson, Rebecca1, Turner, Andrew1, and Burton, Paul1
