Abstract

Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation (PAR-CLIP) is a biochemical method for detecting interaction sites of proteins with mRNA. The method induces T-to-C substitutions in the sequenced cDNA, which help to locate binding sites on the mRNA. However, T-to-C substitutions can also arise from other sources, such as mismatches or SNPs. Only a few statistical procedures exist for detecting binding sites in PAR-CLIP data. Most of them do not account for substitutions other than those induced by PAR-CLIP and therefore also report positions whose high T-to-C substitution rates have other causes, e.g. SNPs, as binding sites. Moreover, none of these procedures allows the inclusion of additional information, e.g. the type of mRNA region, that is relevant for the biology of microRNA-binding sites. We have developed BayMAP, a procedure based on a fully Bayesian hierarchical model that takes other sources of substitutions into account. Furthermore, this model enables the incorporation of additional information into the analysis of PAR-CLIP data. This incorporation permits not only a better detection of binding sites, but also a better understanding of the data and of the biology of binding sites. In applications to simulated PAR-CLIP data, BayMAP distinguishes binding sites from noise better than existing methods. Additionally, it yields good estimates of the influence of the additional information. We further demonstrate BayMAP's usability on real datasets, even when the data are noisy. BayMAP is freely available as an R package at http://stat.math.uni-duesseldorf.de/baymap. Supplementary data are available at Bioinformatics online.
