Abstract

The application of next-generation sequencing (NGS) technology to the diagnosis of human pathogens is hindered by the fact that pathogenic sequences, especially viral ones, are often scarce in human clinical specimens. This disproportion makes deep sequencing and extensive bioinformatics analysis necessary. Here we report a method we call “Preferential Amplification of Pathogenic Sequences (PATHseq)” that can be used to greatly enrich pathogenic sequences. Using a computer program, we designed 8-, 9-, and 10-mer oligonucleotides called “non-human primers” that do not match the most abundant human transcripts but instead selectively match transcripts of human pathogens. Instead of using random primers in the construction of cDNA libraries, the PATHseq method uses these short non-human primers, which in turn preferentially amplify non-human, presumably pathogenic, sequences. Using this method, we were able to enrich pathogenic sequences up to 200-fold in the final sequencing library. This method does not require prior knowledge of the pathogen or any assumption about the infection; it therefore provides a fast and sequence-independent approach for detecting and identifying human viruses and other pathogens. The PATHseq method, coupled with NGS technology, can be broadly used to identify known human pathogens and to discover new ones.
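The primer-design idea in the abstract can be illustrated with a minimal sketch. This is not the authors' program: the function names, the exact-match criterion, and the toy sequences are all assumptions made for illustration. It collects every k-mer site found in pathogen transcripts, discards those that also occur in the abundant human transcripts, and returns the reverse complements as candidate primers.

```python
# Hypothetical sketch of "non-human primer" selection; not the authors' tool.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def non_human_primers(human_transcripts, pathogen_transcripts, k):
    """Candidate k-mer primers whose binding sites occur in pathogen
    transcripts but in none of the given human transcripts.

    Assumption: a primer "matches" a transcript when its reverse
    complement appears exactly in that transcript.
    """
    human_sites = set()
    for transcript in human_transcripts:
        human_sites |= kmers(transcript, k)

    pathogen_sites = set()
    for transcript in pathogen_transcripts:
        pathogen_sites |= kmers(transcript, k)

    # Keep sites unique to the pathogen set, then convert sites to primers.
    selected_sites = pathogen_sites - human_sites
    return sorted(reverse_complement(site) for site in selected_sites)


# Toy sequences (made up) with k=4; the abstract describes k = 8, 9, 10.
primers = non_human_primers(["ACGTACGT"], ["ACGTTTGA"], k=4)
print(primers)
```

In practice the human set would be the most abundant human transcripts and k would be 8, 9, or 10, as in the abstract; whether mismatches are tolerated and how primers are ranked are design choices the abstract does not specify.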

Highlights

  • [Table: most abundant human transcripts (columns: top 1,000 through top 10,000 in steps of 1,000, top 20,000, and all 86,248)]

  • The top 1,000 and 2,000 most abundant human transcripts comprise more than 65% and 72% of all human transcripts, respectively [16]; 3) by selectively excluding the amplification of these abundant human transcripts, we can preferentially amplify pathogenic transcripts in human clinical samples; 4) pathogenic transcripts can be further enriched through subtractive hybridization against a reference human transcription library

  • A recent report found that most protein-coding genes have one major transcript expressed at a significantly higher level than the others, and in human tissues these major transcripts contribute almost 85 percent of the total mRNA [18]


Summary

Introduction

... transcripts; 2) only about 3% of the human genome generates transcripts. The top 1,000 and 2,000 most abundant human transcripts comprise more than 65% and 72% of all human transcripts, respectively [16]; 3) by selectively excluding the amplification of these abundant human transcripts, we can preferentially amplify pathogenic transcripts in human clinical samples; 4) pathogenic transcripts can be further enriched through subtractive hybridization against a reference (normal) human transcription library (human transcriptome). The PATHseq technology, in combination with NGS technology, has the potential to provide comprehensive and unbiased detection of human pathogens responsible for any infectious disease.
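The "top 1,000" and "top 2,000" figures quoted above are cumulative shares of total transcript abundance. A minimal sketch of that calculation follows; the function name and the abundance values are hypothetical toy data, not the paper's measurements.

```python
# Hypothetical sketch: share of total abundance held by the top-N transcripts.
def cumulative_share(abundances, top_n):
    """Fraction of total abundance contributed by the top_n most
    abundant transcripts."""
    ranked = sorted(abundances, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("total abundance must be positive")
    return sum(ranked[:top_n]) / total


# Toy abundance values (made up), one entry per transcript.
toy_abundances = [50, 30, 10, 5, 3, 2]
print(cumulative_share(toy_abundances, 2))
```

Applied to real expression data, the same calculation with top_n set to 1,000 or 2,000 would give the kind of percentages cited in the text.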

