Abstract

BackgroundMethods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes. The recovery, quantification and interpretation of such events in high-throughput sequencing datasets demands specialized bioinformatics approaches.ResultsHere, we present Digital Unmasking of Nucleotide conversions in K-mers (DUNK), a data analysis pipeline enabling the quantification of nucleotide conversions in high-throughput sequencing datasets. We demonstrate using experimentally generated and simulated datasets that DUNK allows constant mapping rates irrespective of nucleotide-conversion rates, promotes the recovery of multimapping reads and employs Single Nucleotide Polymorphism (SNP) masking to uncouple true SNPs from nucleotide conversions to facilitate a robust and sensitive quantification of nucleotide-conversions. As a first application, we implement this strategy as SLAM-DUNK for the analysis of SLAMseq profiles, in which 4-thiouridine-labeled transcripts are detected based on T > C conversions. SLAM-DUNK provides both raw counts of nucleotide-conversion containing reads as well as a base-content and read coverage normalized approach for estimating the fractions of labeled transcripts as readout.ConclusionBeyond providing a readily accessible tool for analyzing SLAMseq and related time-resolved RNA sequencing methods (TimeLapse-seq, TUC-seq), DUNK establishes a broadly applicable strategy for quantifying nucleotide conversions.

Highlights

  • Methods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes

  • Digital unmasking of nucleotide-conversions in k-mers Digital Unmasking of Nucleotide conversions in K-mers (DUNK) addresses the challenges of distinguishing nucleotideconversions from sequencing error and genuine Single Nucleotide Polymorphism (SNP) in high-throughput sequencing datasets by executing four main steps (Fig. 1): First, a nucleotide conversion-aware read mapping algorithm facilitates the alignment of reads (k-mers) with elevated numbers of mismatches (Fig. 1a)

  • The high-quality nucleotide-conversion signal is deconvoluted from sequencing error and used to compute conversion frequencies for all 3′ intervals taking into account read coverage and base content of the interval (Fig. 1d)

Read more

Summary

Introduction

Methods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes. The recovery, quantification and interpretation of such events in high-throughput sequencing datasets demands specialized bioinformatics approaches. Mismatches in reads yielded from standard sequencing protocols such as genome sequencing and RNA-Seq originate either from genetic variations or sequencing errors and are typically ignored by standard mapping approaches. Beyond these standard applications, a growing number of profiling techniques harnesses nucleotide conversions to monitor naturally occurring or experimentally introduced DNA or RNA modifications. The resulting T > C conversion can be read out by highthroughput sequencing

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call