Gene expression profiling is an effective method for identifying predictive and prognostic biomarkers. However, measurements are prone to uncertainty and error arising from various pre-analytical variables. It is therefore necessary to systematically evaluate the effects of these variables on gene expression measurements and on the relative expression orderings (REOs) of gene pairs. A total of 18 datasets were collected, comprising over 800 paired samples. These paired samples were used to assess, at both the single-variable and multi-variable level, the impact of pre-analytical variables on gene expression measurements and REOs, including sampling methods, tumor sample heterogeneity, fixed time delays, preservation conditions, degradation levels, library preparation kits, amplification kits, RNA quantity, measuring platforms, and laboratory sites. Low-quality samples served as the case group, while the paired high-quality samples constituted the control group. In both single-variable and multi-variable analyses, comparing each case sample with its paired control sample revealed thousands of genes showing a twofold change in expression value. In contrast, on average, 82% and 76% of gene pairs kept a consistent REO pattern between paired samples in single-variable and multi-variable analyses, respectively. Notably, this rate steadily increased after excluding the gene pairs with the closest expression levels. Statistical analyses showed a higher proportion of differentially expressed genes (DEGs) than of reversed gene pairs between the case and control groups in both single-variable and multi-variable analyses. Furthermore, the proportion of reversed gene pairs among all gene pairs involving DEGs remained below 20% in the majority of comparisons. Our research demonstrates that REOs are more robust than gene expression measurements under the influence of pre-analytical variables. These findings indicate the potential of REO-based approaches in transcriptomics research and their applicability to biomarker studies.
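
The paired-sample comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy expression values, and the `min_diff` cutoff (a simple absolute-difference threshold standing in for "excluding gene pairs with the closest expression levels") are all assumptions made for illustration. The sketch counts the genes that change at least twofold between a case sample and its paired control, and the fraction of gene pairs whose ordering is unchanged.

```python
from itertools import combinations


def fold_change_genes(case, control, fold=2.0):
    """Genes whose expression differs by at least `fold` between paired samples.

    `case` and `control` map gene name -> expression value.
    Assumes strictly positive, non-log-transformed values (illustrative only).
    """
    genes = sorted(set(case) & set(control))
    changed = []
    for g in genes:
        ratio = case[g] / control[g]
        if ratio >= fold or ratio <= 1.0 / fold:
            changed.append(g)
    return changed


def reo_consistency(case, control, min_diff=0.0):
    """Fraction of gene pairs with the same relative ordering in both samples.

    Pairs whose absolute expression difference is <= `min_diff` in either
    sample are skipped; this cutoff is a hypothetical stand-in for removing
    the pairs with the closest expression levels.
    """
    genes = sorted(set(case) & set(control))
    kept = consistent = 0
    for a, b in combinations(genes, 2):
        d_case = case[a] - case[b]
        d_ctrl = control[a] - control[b]
        if abs(d_case) <= min_diff or abs(d_ctrl) <= min_diff:
            continue  # tie or near-tie: ordering is not informative
        kept += 1
        if (d_case > 0) == (d_ctrl > 0):
            consistent += 1
    return consistent / kept if kept else float("nan")


# Toy paired samples (made-up values, not taken from the study)
control = {"A": 10.0, "B": 4.0, "C": 3.5, "D": 1.0}
case = {"A": 25.0, "B": 3.0, "C": 3.6, "D": 0.4}

print(fold_change_genes(case, control))          # genes with >= 2-fold change
print(reo_consistency(case, control))            # all informative pairs
print(reo_consistency(case, control, min_diff=1.0))  # closest pairs excluded
```

In this toy example two of the four genes change at least twofold, yet only the closely expressed pair B/C reverses its ordering, and that pair drops out once near-ties are excluded.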