How should we measure proportionality on relative gene expression data?

Ionas Erb, Cedric Notredame

doi:10.1007/s12064-015-0220-8

Open Access

https://doi.org/10.1007/s12064-015-0220-8

Abstract

Correlation is ubiquitously used in gene expression analysis although its validity as an objective criterion is often questionable. If no normalization reflecting the original mRNA counts in the cells is available, correlation between genes becomes spurious. Yet the need for normalization can be bypassed using a relative analysis approach called log-ratio analysis. This approach can be used to identify proportional gene pairs, i.e. a subset of pairs whose correlation can be inferred correctly from unnormalized data due to their vanishing log-ratio variance. To interpret the size of non-zero log-ratio variances, a proposal for a scaling with respect to the variance of one member of the gene pair was recently made by Lovell et al. Here we derive analytically how spurious proportionality is introduced when using a scaling. We base our analysis on a symmetric proportionality coefficient (briefly mentioned in Lovell et al.) that has a number of advantages over their statistic. We show in detail how the choice of reference needed for the scaling determines which gene pairs are identified as proportional. We demonstrate that using an unchanged gene as a reference has huge advantages in terms of sensitivity. We also explore the link between proportionality and partial correlation and derive expressions for a partial proportionality coefficient. A brief data-analysis part puts the discussed concepts into practice.
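The scaling idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the scaled statistic of Lovell et al. has the form var(a - b) / var(a) and the symmetric coefficient the form 1 - var(a - b) / (var(a) + var(b)), where a and b are log-ratio-transformed expression values of two genes with respect to a chosen reference. These formulas come from the wider literature rather than from the text above. The function names (`alr`, `phi`, `rho`) and the simulated data, including the "unchanged" reference gene, are hypothetical.

```python
import numpy as np


def alr(x, ref):
    """Log-ratio transform of all columns with respect to a reference column."""
    return np.log(x) - np.log(x[:, [ref]])


def phi(a, b):
    """Asymmetric scaled log-ratio variance (assumed form): var(a - b) / var(a)."""
    return np.var(a - b, ddof=1) / np.var(a, ddof=1)


def rho(a, b):
    """Symmetric proportionality coefficient (assumed form)."""
    return 1.0 - np.var(a - b, ddof=1) / (np.var(a, ddof=1) + np.var(b, ddof=1))


rng = np.random.default_rng(1)
n_samples = 50

# Hypothetical absolute abundances (rows = samples, columns = genes).
reference = 100.0 * rng.lognormal(0.0, 0.01, n_samples)      # nearly unchanged gene
gene_a = rng.lognormal(5.0, 1.0, n_samples)
gene_b = 2.0 * gene_a * rng.lognormal(0.0, 0.05, n_samples)   # roughly proportional to gene_a
others = rng.lognormal(5.0, 1.0, (n_samples, 5))
absolute = np.column_stack([reference, gene_a, gene_b, others])

# Relative data: each sample is divided by its total, so absolute scale is lost.
relative = absolute / absolute.sum(axis=1, keepdims=True)

# Use the (approximately) unchanged gene in column 0 as the reference.
transformed = alr(relative, ref=0)
a, b = transformed[:, 1], transformed[:, 2]

rho_ab, rho_ba = rho(a, b), rho(b, a)
phi_ab, phi_ba = phi(a, b), phi(b, a)
print(f"rho(a,b)={rho_ab:.4f}  rho(b,a)={rho_ba:.4f}")
print(f"phi(a,b)={phi_ab:.4f}  phi(b,a)={phi_ba:.4f}")
```

In this sketch `rho` returns the same value for either ordering of the pair, whereas `phi` depends on which gene supplies the scaling variance. Swapping the reference column for a strongly varying gene is a quick way to see how the choice of reference changes which pairs look proportional.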

Highlights

The frequently compositional nature of biological data and its methodological implications (a.k.a. analysis of "closed" data) have not been widely acknowledged yet (Lovell et al. 2011).
We base our analysis on a symmetric proportionality coefficient that has a number of advantages over their statistic.
We show in detail how the choice of reference needed for the scaling determines which gene pairs are identified as proportional.

Summary

Introduction

The frequently compositional nature of biological data and its methodological implications (a.k.a. analysis of "closed" data) have not been widely acknowledged yet (Lovell et al. 2011). While correlations between the columns of our compositional matrix cannot be defined coherently, the covariance structure of a compositional data matrix can be summarized considering, for all pairs i, j (i < j), the (sample) variances of their log ratios $\log(x_i/x_j)$ (Aitchison 2003). These will be close to zero if genes i and j maintain an approximately proportional relationship $x_i \approx m x_j$ across observations for some real value m. In this contribution, we will interpret log-ratio transformations as an attempt to back transform relative data into absolute data. In good agreement with our analytical results, the approach taken by Lovell et al. leads to a much lower overlap of prediction between absolute and relative data compared with the application of an approximately unchanged reference.
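The behaviour of the log-ratio variance can be checked numerically. The following is a small sketch on simulated data, with all gene names and parameters chosen only for illustration: it builds hypothetical absolute abundances in which one gene pair is exactly proportional, converts them to relative data by dividing each sample by its total, and compares the sample variance of the log ratio before and after this closure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50

# Hypothetical absolute abundances across samples.
gene_a = rng.lognormal(mean=5.0, sigma=1.0, size=n_samples)
gene_b = 3.0 * gene_a                      # exactly proportional to gene_a (m = 1/3)
gene_c = rng.lognormal(mean=5.0, sigma=1.0, size=n_samples)  # unrelated to gene_a
absolute = np.column_stack([gene_a, gene_b, gene_c])

# Closure: each sample (row) is divided by its total, giving relative data.
relative = absolute / absolute.sum(axis=1, keepdims=True)


def log_ratio_variance(x, i, j):
    """Sample variance of log(x_i / x_j) across observations (rows)."""
    return np.var(np.log(x[:, i] / x[:, j]), ddof=1)


lrv_ab_abs = log_ratio_variance(absolute, 0, 1)
lrv_ab_rel = log_ratio_variance(relative, 0, 1)
lrv_ac_abs = log_ratio_variance(absolute, 0, 2)
lrv_ac_rel = log_ratio_variance(relative, 0, 2)

print("proportional pair  (absolute, relative):", lrv_ab_abs, lrv_ab_rel)
print("unrelated pair     (absolute, relative):", lrv_ac_abs, lrv_ac_rel)
```

Because the per-sample total cancels in the ratio, the log-ratio variance is the same on absolute and relative data, and it vanishes (up to floating-point error) for the proportional pair.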

Methods and results

Findings

Discussion

Journal: Theory in Biosciences	Publication Date: Jan 13, 2016
Citations: 92	License type: CC BY 4.0
