Analysing the Protein-DNA Binding Sites in Arabidopsis thaliana from ChIP-seq Experiments

Ginés Almagro-Hernández, Juana-María Vivo, Manuel Franco, Jesualdo Tomás Fernández-Breis

doi:10.3390/math9243239

Open Access

https://doi.org/10.3390/math9243239

Abstract

Computational genomics aims to support the discovery of how the functionality of the genome of the organism under study is affected both by its own sequence and structure and by the network of interactions between this genome and different biological or physical factors. In this work, we focus on the analysis of ChIP-seq data, for which many methods have been proposed in recent years. However, to the best of our knowledge, those methods lack an appropriate mathematical formalism. We have developed a method based on multivariate models for the analysis of the set of peaks obtained from a ChIP-seq experiment. This method can be used to characterize an individual experiment and to compare different experiments regardless of where and when they were conducted. The method is based on a multivariate hypergeometric distribution, which fits the complexity of the biological data and is better suited to dealing with the uncertainty generated in this type of experiment than the dichotomous models used by state-of-the-art methods. We have validated this method with Arabidopsis thaliana datasets obtained from the Remap2020 database, obtaining results in accordance with the original study of these samples. Our work shows a novel way of analyzing ChIP-seq data.
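For orientation, the standard multivariate hypergeometric distribution named in the abstract can be written as follows. The notation is an assumption made here for illustration and is not taken from the paper: N candidate sites in the genome are partitioned into k annotation classes of sizes K_1, ..., K_k, and n peaks are drawn without replacement, of which x_i fall in class i.

```latex
% Assumed notation: N = \sum_i K_i candidate sites, n = \sum_i x_i observed peaks.
P(X_1 = x_1, \dots, X_k = x_k)
  = \frac{\prod_{i=1}^{k} \binom{K_i}{x_i}}{\binom{N}{n}},
\qquad
\mathbb{E}[X_i] = n \frac{K_i}{N},
\qquad
\operatorname{Var}(X_i) = n \, \frac{K_i}{N} \left(1 - \frac{K_i}{N}\right) \frac{N - n}{N - 1}.
```

With k = 2 this reduces to the ordinary (dichotomous) hypergeometric distribution, which is the sense in which the multivariate form generalizes a two-class model.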

Highlights

Computational genomics consists of the use of a wide range of mathematical tools, implemented in specific software, in order to solve challenges such as how the functionality of the genome of the organism under study is affected both by its own sequence and structure and by the network of interactions between this genome and different biological or physical factors. One of the main types of experiments included in this field is the so-called chromatin immunoprecipitation (ChIP) experiment [1], which aims to identify and localize in vivo all the binding sites of a given DNA-binding protein throughout the genome of an organism, tissue, or cell line subjected to a specific biological condition (e.g. “wild type” or “stress”).
ChIP-seq experiments [3] consist of a first ChIP phase in which the immunoprecipitated fragments of the DNA molecule to which the protein under study has been attached are enriched over the immunoprecipitated fragments corresponding to the rest of the genome.
Four Arabidopsis thaliana ChIP-seq datasets correspond to the GSE112951 experiment carried out by Nassrallah et al. [35], which analyzed the influence of the light-mediated development protein (DET1) on the pattern of monoubiquitination of histone H2B (H2Bub).

Summary

Introduction

Computational genomics consists of the use of a wide range of mathematical tools, implemented in specific software, in order to solve challenges such as how the functionality of the genome of the organism under study is affected both by its own sequence and structure and by the network of interactions between this genome and different biological or physical factors (proteins, metabolites, molecular complexes, electromagnetic radiation, etc.). One of the main types of experiments included in this field is the so-called chromatin immunoprecipitation (ChIP) experiment [1], which aims to identify and localize in vivo all the binding sites of a given DNA-binding protein throughout the genome of an organism, tissue, or cell line subjected to a specific biological condition (e.g. “wild type” or “stress”). ChIP-seq experiments [3] consist of a first ChIP phase in which the immunoprecipitated fragments of the DNA molecule (between 150 and 1000 nucleotides long) to which the protein under study (hereafter referred to as the target protein) has been attached are enriched over the immunoprecipitated fragments corresponding to the rest of the genome. This is followed by a phase of identification of these fragments in two steps.

In all four samples, the intergenic class had the lowest observed number of monoubiquitinated H2Bub sites compared to the expected ones. The same patterns could be observed for the background model 8 dm, with very similar Z-scores; those corresponding to the enhancer class fell within the range [−16.0, −15.4] in all four samples, indicating that this class was one of the least relevant for the study of the target protein.
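
The comparison of observed and expected site counts through Z-scores can be sketched as below. This is a minimal illustration assuming the textbook mean and variance of a multivariate hypergeometric marginal; it is not the authors' implementation. The function name `class_z_scores`, the class labels, and all counts are hypothetical, invented only to show the arithmetic.

```python
import math


def class_z_scores(population, observed):
    """Per-class expected counts and Z-scores under a multivariate
    hypergeometric background (illustrative sketch, not the paper's code).

    population: dict mapping class name -> number of candidate sites in
                the background (K_i); hypothetical input.
    observed:   dict mapping class name -> number of peaks assigned to
                that class (x_i); hypothetical input.
    Returns a dict mapping class name -> (expected count, Z-score).
    """
    total_sites = sum(population.values())   # N
    total_peaks = sum(observed.values())     # n
    if total_sites < 2 or total_peaks > total_sites:
        raise ValueError("need N >= 2 and n <= N")

    # Finite-population correction for sampling without replacement.
    fpc = (total_sites - total_peaks) / (total_sites - 1)

    scores = {}
    for name, class_size in population.items():
        p = class_size / total_sites
        expected = total_peaks * p
        variance = total_peaks * p * (1.0 - p) * fpc
        count = observed.get(name, 0)
        z = (count - expected) / math.sqrt(variance) if variance > 0 else float("nan")
        scores[name] = (expected, z)
    return scores


# Made-up numbers, chosen only to make the arithmetic easy to follow.
background = {"promoter": 2000, "gene_body": 5000, "intergenic": 3000}
peaks = {"promoter": 300, "gene_body": 600, "intergenic": 100}

for name, (expected, z) in class_z_scores(background, peaks).items():
    print(f"{name}: expected={expected:.1f}, z={z:.2f}")
```

A strongly negative Z-score, such as the one this toy input gives for the intergenic class, means that far fewer sites were observed in that class than the background model predicts, which is how the depletion reported above would be read.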

Journal: Mathematics	Publication Date: Dec 14, 2021
Citations: 2	License type: CC BY 4.0


