Abstract
Background: Noise and artifacts may arise at several steps of the next-generation sequencing (NGS) process. Recently, an NGS library preparation method called SMART (Switching Mechanism At the 5′ end of the RNA Transcript) was introduced to prepare ChIP-seq (chromatin immunoprecipitation and deep sequencing) libraries from small amounts of DNA, using the DNA SMART ChIP-seq Kit. The protocol adds Ts to the 3′ end of DNA templates; this T tail is subsequently recognized and used by SMART poly(dA) primers for reverse transcription, followed by addition of PCR primers and sequencing adapters. The poly(dA) primers, however, can also anneal to poly(T) sequences in a genome and amplify DNA fragments that are not enriched in the immunoprecipitated DNA templates. This off-target amplification results in false signals in the ChIP-seq data.
Results: Here, we show that the off-target ChIP-seq reads derived from false amplification of poly(T/A) genomic sequences have unique and strand-specific features. Accordingly, we developed a tool, SMARTcleaner, that exploits these features to remove SMART ChIP-seq artifacts. Application of SMARTcleaner to several SMART ChIP-seq datasets demonstrates that it removes reads from off-target amplification effectively, leading to significantly improved ChIP-seq peaks and results.
Conclusions: SMARTcleaner can identify and remove the false signals in SMART-based ChIP-seq libraries, improving peak calling as well as downstream data analysis and interpretation.
Highlights
Noise and artifacts may arise at several steps of the next-generation sequencing (NGS) process
After amplification, sequencing, and read mapping, ChIP-seq reads that arise from false priming and amplification at poly(T/A) genomic sequences accumulate next to the poly(T/A) sites in a clear strand-specific manner, because the poly(dA) primers anneal only to the DNA strand containing poly(T)
We examined the reads in a human ChIP-seq sample (Additional file 1: Table S1, Dataset 1, SRR3229031) that was prepared using the Clontech DNA SMART ChIP-seq kit and sequenced in paired-end (PE) mode [14]
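The strand-specific accumulation described in the highlights can be explored with a small sketch. The code below is not SMARTcleaner's implementation; it is a minimal illustration, under stated assumptions, of how one might locate poly(T) or poly(A) runs in a reference sequence and tally nearby read starts by side and strand. The function names, the minimum run length of 12, and the 200 bp window are all hypothetical choices made for illustration, and the sketch makes no claim about which side or strand the artifact reads fall on.

```python
import re
from collections import Counter


def find_homopolymer_runs(seq, min_len=12):
    """Return (start, end, base) for every run of A or T that is at least
    min_len long. Coordinates are 0-based with an exclusive end.
    min_len=12 is an illustrative threshold, not a value from the paper."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.end(), m.group()[0])
            for m in pattern.finditer(seq.upper())]


def strand_counts_near(run, reads, window=200):
    """Tally read 5' positions that fall within `window` bases of a run.

    `reads` is an iterable of (position, strand) tuples with strand '+' or '-'.
    Keys of the returned Counter are (side, strand), where side is
    'upstream' or 'downstream' relative to the reference forward strand.
    The window size is an assumption made for this sketch."""
    start, end, _base = run
    counts = Counter()
    for pos, strand in reads:
        if start - window <= pos < start:
            counts[("upstream", strand)] += 1
        elif end <= pos < end + window:
            counts[("downstream", strand)] += 1
    return counts


# Toy example: a 15-base poly(T) run flanked by non-homopolymer sequence.
toy_seq = "ACGC" * 5 + "T" * 15 + "GCGA" * 5
toy_reads = [(10, "+"), (15, "+"), (40, "-"), (300, "+")]
toy_runs = find_homopolymer_runs(toy_seq)
toy_counts = strand_counts_near(toy_runs[0], toy_reads, window=20)
```

With real data, the (position, strand) tuples would come from an alignment file, and a strong imbalance between strands on one side of a poly(T/A) run would be the kind of signature the highlights describe.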
Summary
Noise and artifacts may arise at several steps of the next-generation sequencing (NGS) process. As powerful as NGS technology is, its application to limited amounts of biological material, for example DNA or RNA isolated from a very small number of cells, remains a challenge. This is primarily due to the low efficiency of ligating targeted DNA/RNA fragments to the NGS sequencing adapters, which leads to a loss of sequencing reads for low-copy DNA/RNA molecules present in a sample [7]. Ligation requires double-stranded DNA (dsDNA) inputs and may produce cross- and self-ligation adapter byproducts [8]. To overcome these limitations, SMART, a template-switching method, was developed and used initially for transcriptome analyses, such as CAGE, RNA-seq (including small RNA-seq), and single-cell RNA-seq [9,10,11,12]. By using single-step adapter addition, the SMART technology achieves the much-needed sensitivity to accurately amplify picogram quantities of nucleic acids.