Abstract
Identifying the A- and P-site locations on ribosome-protected mRNA fragments from Ribo-Seq experiments is a fundamental step in the quantitative analysis of transcriptome-wide translation properties at the codon level. Many analyses of Ribo-Seq data have utilized heuristic approaches applied to a narrow range of fragment sizes to identify the A-site. In this study, we use Integer Programming to identify the A-site by maximizing an objective function that reflects the fact that the ribosome’s A-site on ribosome-protected fragments must reside between the second and stop codons of an mRNA. This identifies the A-site location as a function of the fragment’s size and its 5′ end reading frame in Ribo-Seq data generated from S. cerevisiae and mouse embryonic stem cells. The correctness of the identified A-site locations is demonstrated by showing that this method, as compared to others, yields the largest ribosome density at established stalling sites. By providing greater accuracy and utilization of a wider range of fragment sizes, our approach increases the signal-to-noise ratio of underlying biological signals associated with translation elongation at the codon length scale.
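The core constraint described above, that the A-site of any ribosome-protected fragment must lie between the second codon and the stop codon, can be illustrated with a small sketch. This is not the authors' Integer Programming formulation: it simply enumerates candidate offsets for a single group of reads (one fragment size and one 5′ end reading frame) and keeps the offset that places the most reads inside the allowed region. The function names, the coordinate convention (0-based transcript positions, with `cds_stop` marking the first nucleotide of the stop codon), and the toy inputs are all assumptions made for illustration.

```python
def a_site_in_bounds(five_prime_pos, offset, cds_start, cds_stop):
    """Check whether the inferred A-site codon is biologically allowed.

    Assumed convention (hypothetical): positions are 0-based transcript
    coordinates, cds_start is the first nucleotide of the start codon and
    cds_stop is the first nucleotide of the stop codon.
    """
    a_site = five_prime_pos + offset
    # The A-site may occupy any codon from the second codon up to and
    # including the stop codon, but never the start codon itself.
    return cds_start + 3 <= a_site <= cds_stop


def best_offset(reads, genes, candidate_offsets):
    """Pick the offset that maximizes the number of in-bounds A-sites.

    reads: list of (gene_id, five_prime_pos) for ONE fragment size and frame.
    genes: dict mapping gene_id -> (cds_start, cds_stop).
    Returns (best offset, dict of offset -> in-bounds read count).
    """
    scores = {offset: 0 for offset in candidate_offsets}
    for offset in candidate_offsets:
        for gene_id, pos in reads:
            cds_start, cds_stop = genes[gene_id]
            if a_site_in_bounds(pos, offset, cds_start, cds_stop):
                scores[offset] += 1
    # Ties are resolved in favor of the first candidate listed.
    best = max(candidate_offsets, key=lambda o: scores[o])
    return best, scores


# Toy data: one gene whose CDS starts at 0 and whose stop codon starts at 30.
toy_genes = {"geneA": (0, 30)}
toy_reads = [("geneA", -12), ("geneA", -9), ("geneA", 0), ("geneA", 15)]
offset, counts = best_offset(toy_reads, toy_genes, [12, 15, 18])
print(offset, counts)
```

In this toy case only an offset of 15 nt keeps every read's A-site within the allowed region; shifting by one codon in either direction pushes one read onto the start codon or past the stop codon. In the study itself, the offset is determined separately for each combination of fragment size and reading frame.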
Highlights
Because of the importance of this assignment problem, a number of methods for identifying the A- and P-sites have been created[2,5,6,7,8,9,10,11,12,13].
In the analysis of Ribo-Seq data, mRNA fragments are initially aligned onto the reference transcriptome and their location is reported with respect to their 5′ end.
Using tRNA abundances previously estimated from RNA-Seq experiments on S. cerevisiae[16], we find that our Integer Programming method yields the largest anti-correlation compared to the eleven other methods considered (Supplementary Table S8), further supporting the accuracy of our method.
Summary
Because of the importance of this assignment problem, a number of methods for identifying the A- and P-sites have been created[2,5,6,7,8,9,10,11,12,13]. Many of these approaches utilize the biological fact that only the P-site is permitted to occupy the start codon during translation initiation and only the A-site is permitted to occupy the stop codon during termination. We apply our method to S. cerevisiae and mESC Ribo-Seq datasets and show that, compared to other methods, our approach has greater accuracy and statistical power in identifying A- and P-site locations and assigning read density.