Abstract

Advances in sequencing technology have made sequence data inexpensive to generate. The resulting volume of raw sequence data poses many challenges for the analysis of these large-scale data banks. For example, in fungus-infected plant tissue it is important to distinguish material of plant origin from material of fungal origin. We determined the origin of transcripts sequenced from Library 895-M6 (poplar tissue infected by Marssonina brunnea) on the Illumina/Solexa GA IIx by combining three methods, based on (1) the taxonomic information of homologous sequences; (2) the reference genome sequence; and (3) the transcriptome sequences of the host and its pathogen obtained from Library 895 (poplar), Library M6 (M. brunnea), and Library 895-M6 (mixture of poplar and M. brunnea). By integrating the results of the three methods, we accurately identified the origin of 80,978 (99.5%) contigs in the mixed poplar and M. brunnea sample (Library 895-M6). These results demonstrate that combining the three approaches is an effective strategy for determining the origin of sequences in a mixed pool and provides a basis for further transcriptome analysis of the mixed sample.
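The abstract does not state how the three sets of results were integrated. As a purely illustrative sketch, one possible rule is a per-contig majority vote over the calls made by the three methods. The function name `integrate_calls`, the labels `"poplar"` and `"fungus"`, the contig IDs, and the voting rule itself are all assumptions introduced here, not the authors' procedure.

```python
from collections import Counter
from typing import Optional, Sequence


def integrate_calls(calls: Sequence[Optional[str]]) -> str:
    """Combine per-method origin calls for one contig.

    Hypothetical rule (an assumption, not taken from the paper):
    each method votes "poplar", "fungus", or None (no call);
    the label with strictly the most votes wins.
    """
    votes = Counter(c for c in calls if c is not None)
    if not votes:
        return "unassigned"   # no method produced a call
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "ambiguous"    # methods disagree with no majority
    return ranked[0][0]


# Hypothetical contigs with calls from (taxonomy, genome, transcriptome).
example_calls = {
    "contig_0001": ("poplar", "poplar", None),
    "contig_0002": ("fungus", "fungus", "fungus"),
    "contig_0003": ("poplar", "fungus", None),
    "contig_0004": (None, None, None),
}

origins = {cid: integrate_calls(c) for cid, c in example_calls.items()}
```

Under this assumed rule, a contig is assigned only when the methods that made a call agree by majority. Conflicting or missing calls are flagged rather than forced into one organism.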

