Assessing stationary distributions derived from chromatin contact maps

Mark R Segal, Kipper Fletez-Brant

doi:10.1186/s12859-020-3424-y

Abstract

Background: The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture-related alterations, such as translocations and gene fusions, are often cancer drivers. Thus, eliciting chromatin conformation is important, yet challenging due to compaction, dynamics and scale. However, a variety of recent assays, in particular Hi-C, have generated new details of chromatin structure, spawning a number of novel biological findings. Many findings have resulted from analyses at the level of native contact data as generated by the assays. Alternatively, reconstruction-based approaches often proceed by first converting contact frequencies into distances, then generating a three-dimensional (3D) chromatin configuration that best recapitulates these distances. Subsequent analyses can enrich contact-level analyses via superposition of genomic attributes on the reconstruction. But such advantages depend on the accuracy of the reconstruction which, absent gold standards, is inherently difficult to assess. Attempts at accuracy evaluation have relied on simulation and/or FISH imaging that typically features a handful of low-resolution probes. While newly advanced multiplexed FISH imaging offers possibilities for refined 3D reconstruction accuracy evaluation, availability of such data is limited due to assay complexity, and its resolution is appreciably lower than that of the reconstructions being assessed. Accordingly, there is demand for new methods of reconstruction accuracy appraisal.

Results: Here we explore the potential of recently proposed stationary distributions, hereafter StatDns, derived from Hi-C contact matrices, to serve as a basis for reconstruction accuracy assessment. Current usage of such StatDns has focussed on the identification of highly interactive regions (HIRs): computationally defined regions of the genome purportedly involved in numerous long-range intra-chromosomal contacts. Consistent identification of HIRs would be informative with respect to inferred 3D architecture since the corresponding regions of the reconstruction would have an elevated number of k nearest neighbors (kNNs). More generally, we anticipate a monotone decreasing relationship between StatDn values and kNN distances. After initially evaluating the reproducibility of StatDns across replicate Hi-C data sets, we use this implied StatDn-kNN relationship to gauge the utility of StatDns for reconstruction validation, making recourse to both real and simulated examples.

Conclusions: Our analyses demonstrate that, as constructed, StatDns do not provide a suitable measure for assessing the accuracy of 3D genome reconstructions. Whether this is attributable to specific choices surrounding normalization in defining StatDns or to the logic underlying their very formulation remains to be determined.

Highlights

The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture-related alterations, such as translocations and gene fusions, are often cancer drivers
In seeking to devise a more broadly applicable means for reconstruction accuracy assessment, we were drawn to the recently proposed (Sobhy et al. [30], hereafter SKLLS) stationary distribution (hereafter StatDn) of a Hi-C matrix and associated highly interactive regions (HIRs): computationally defined regions of the genome purportedly involved in numerous long-range intra-chromosomal contacts
Consistent identification of HIRs would be informative with respect to inferred three-dimensional (3D) architecture since the corresponding regions of the reconstruction would have an elevated number of k nearest neighbors compared with non-highly interacting regions
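
To make the anticipated StatDn-kNN relationship concrete, the following is a minimal sketch and not the SKLLS procedure: it assumes plain row normalization of the contact matrix into a Markov transition matrix (the normalization actually used by SKLLS is not specified here), obtains the stationary distribution by power iteration, and compares it with mean kNN distances in a 3D configuration via a rank correlation. All function names, the toy contact model and the choice k = 5 are hypothetical.

```python
import numpy as np

def stationary_distribution(contacts, n_iter=1000, tol=1e-12):
    """Stationary distribution of the row-normalized contact matrix.

    Assumption: plain row normalization; other normalizations will
    generally give a different distribution.
    """
    C = np.asarray(contacts, dtype=float)
    row_sums = C.sum(axis=1)
    if np.any(row_sums <= 0):
        raise ValueError("every locus needs at least one contact")
    P = C / row_sums[:, None]              # each row of P sums to 1
    pi = np.full(C.shape[0], 1.0 / C.shape[0])
    for _ in range(n_iter):                # power iteration: pi <- pi P
        nxt = pi @ P
        converged = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if converged:
            break
    return pi / pi.sum()

def mean_knn_distance(coords, k):
    """Mean Euclidean distance from each point to its k nearest neighbors."""
    X = np.asarray(coords, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbor
    return np.sort(D, axis=1)[:, :k].mean(axis=1)

def spearman(x, y):
    """Spearman rank correlation (no tie correction; fine for continuous data)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rx, ry)[0, 1]

# Toy data (hypothetical): random 3D loci, contacts decaying with distance.
rng = np.random.default_rng(0)
coords = rng.normal(size=(30, 3))
diff = coords[:, None, :] - coords[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=2))
C = 1.0 / (1.0 + D) ** 2                   # symmetric, strictly positive

pi = stationary_distribution(C)
knn = mean_knn_distance(coords, k=5)
rho = spearman(pi, knn)                    # a monotone decreasing relationship implies rho < 0
print(f"Spearman correlation between StatDn and mean 5-NN distance: {rho:.3f}")
```

One consequence of the assumed normalization is worth noting: for a symmetric contact matrix, the stationary distribution of the row-normalized chain is simply proportional to the row sums, so under this particular choice the StatDn carries no information beyond total contact counts per locus.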

Summary

Introduction

The spatial configuration of chromosomes is essential to various cellular processes, notably gene regulation, while architecture-related alterations, such as translocations and gene fusions, are often cancer drivers. The emergence of the suite of chromatin conformation capture assays, in particular Hi-C, generated new details of chromatin structure and spawned a number of subsequent biological findings [2, 9, 10, 18, 23]. Many of these findings have resulted directly from analyses of interaction or contact-level data generated by Hi-C assays. A less common Hi-C analysis paradigm proceeds by first converting these contact frequencies into distances (this transformation often invoking inverse power laws [2, 13, 29, 35, 41]), and then generating a putative three-dimensional (3D) reconstruction of the associated chromatin configuration via variants of multi-dimensional scaling (MDS). Such 3D reconstruction has been shown to enrich analyses based solely on the underlying contact map, with the gains deriving, in part, from superposing genomic features on the reconstruction. Examples include identifying co-localized genomic landmarks such as early replication origins [6, 37], expression gradients and co-localization of virulence genes in the malaria parasite Plasmodium falciparum [2], the impact of spatial organization on double strand break repair [14], and elucidation of ‘3D hotspots’ corresponding to overlaid ChIP-Seq transcription factor maxima, revealing novel regulatory interactions [7].
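
The contact-to-distance conversion and MDS step described above can be sketched as follows. This is an illustrative stand-in under stated assumptions, not any of the cited methods: it uses the inverse power law d_ij = c_ij^(-alpha) with a hypothetical exponent alpha, and classical (Torgerson) MDS rather than the MDS variants used in practice. Function names are hypothetical.

```python
import numpy as np

def contacts_to_distances(contacts, alpha=1.0):
    """Inverse power-law conversion d_ij = c_ij ** (-alpha).

    Assumption: alpha is a free, user-chosen exponent. Zero contacts
    give no finite distance and are returned as NaN.
    """
    C = np.asarray(contacts, dtype=float)
    D = np.full(C.shape, np.nan)
    positive = C > 0
    D[positive] = C[positive] ** (-alpha)
    np.fill_diagonal(D, 0.0)               # a locus is at distance 0 from itself
    return D

def classical_mds(D, n_dims=3):
    """Classical (Torgerson) MDS: coordinates from a complete distance matrix."""
    D = np.asarray(D, dtype=float)
    if np.isnan(D).any():
        raise ValueError("classical MDS needs a complete distance matrix")
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    B = -0.5 * J @ (D ** 2) @ J            # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:n_dims]
    top = np.clip(evals[order], 0.0, None) # guard against tiny negative eigenvalues
    return evecs[:, order] * np.sqrt(top)

# Toy usage (hypothetical contact counts for four loci).
toy_contacts = np.array([
    [0.0, 16.0, 4.0, 1.0],
    [16.0, 0.0, 16.0, 4.0],
    [4.0, 16.0, 0.0, 16.0],
    [1.0, 4.0, 16.0, 0.0],
])
toy_distances = contacts_to_distances(toy_contacts, alpha=0.5)
toy_coords = classical_mds(toy_distances, n_dims=3)
print(toy_coords.shape)
```

Classical MDS recovers a configuration exactly (up to rotation, reflection and translation) only when the input distances are truly Euclidean in the target dimension; distances derived from noisy contact counts generally are not, which is one reason reconstruction accuracy is hard to assess.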

Methods

Results

Conclusion

Journal: BMC Bioinformatics
Publication Date: Feb 24, 2020
License type: open-access
