Abstract
RNA-seq is increasingly used for gene expression studies and is revolutionizing the fields of genomics and transcriptomics. However, the field of RNA-seq analysis is still evolving. We therefore designed this study with large numbers of reads and four biological replicates per condition so that we could alter these parameters and assess their impact on differential expression results. Bacillus thuringiensis strains ATCC10792 and CT43 were grown in two Luria broth (LB) medium lots on four dates, and transcriptomics data were generated using one lane of sequence output from an Illumina HiSeq2000 instrument for each of the 32 samples, which were then analyzed using DESeq2. Genome coverage across samples ranged from 87 to 465X, with medium lots and culture dates identified as major sources of variation. Significantly differentially expressed genes (5% FDR, two-fold change) were detected between cultures grown using different medium lots and between different dates. The highly differentially expressed iron acquisition and metabolism genes were a likely consequence of differing amounts of iron in the two medium lots. Indeed, in this study RNA-seq served as a tool for predictive biology, since we hypothesized and confirmed that the two LB medium lots had different iron contents (~two-fold difference). This study shows that noise in the data can be controlled and minimized with appropriate experimental design and with an appropriate number of replicates and reads for the system being studied. We outline parameters for an efficient and cost-effective microbial transcriptomics study.
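The significance criteria quoted in the abstract (5% FDR, two-fold change) can be illustrated with a short sketch. This is not the authors' DESeq2 workflow: DESeq2 reports adjusted p-values itself, and the function names, the input layout, and the toy values below are assumptions made purely for illustration. The sketch applies a Benjamini-Hochberg adjustment to raw p-values and keeps genes that pass both thresholds.

```python
from math import log2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, fdr=0.05, min_fold=2.0):
    """Select genes passing both the FDR and the fold-change threshold.

    results: list of (gene, fold_change, raw_pvalue) tuples, where
    fold_change is a positive ratio (hypothetical input layout).
    """
    padj = benjamini_hochberg([p for _, _, p in results])
    threshold = log2(min_fold)
    return [
        gene
        for (gene, fold_change, _), q in zip(results, padj)
        if q < fdr and abs(log2(fold_change)) >= threshold
    ]


# Toy values for illustration only; they are not data from the study.
toy_results = [
    ("geneA", 4.0, 0.001),   # strongly up, small p-value
    ("geneB", 1.2, 0.002),   # small p-value but below two-fold
    ("geneC", 0.25, 0.01),   # strongly down
    ("geneD", 3.0, 0.6),     # large change but not significant
]
print(significant_genes(toy_results))
```

Taking the absolute value of the log2 fold change treats up- and down-regulation symmetrically, so a ratio of 0.25 counts as a four-fold change in the same way as a ratio of 4.0.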
Highlights
Ever decreasing next-generation sequencing (NGS) costs, continued technical and analytical advances, along with diverse applications have made RNA-sequencing (RNA-seq) an ever increasing choice for transcriptome studies (Croucher and Thomson, 2010; Marguerat and Bahler, 2010; Williams et al, 2014)
Post-trimming and mapping results for strain ATCC10792 are provided in Table 1, and similar results were obtained for strain CT43 (Supplementary File 1)
The ribosomal RNA depletion strategy worked well for both strains: on average, only 0.07% of trimmed, mapped reads aligned to the 5S, 16S, and 23S rRNA genes (S.D. ± 0.05 and 0.06 for ATCC10792 and CT43, respectively)
Summary
Ever decreasing next-generation sequencing (NGS) costs, continued technical and analytical advances, along with diverse applications have made RNA-sequencing (RNA-seq) an ever increasing choice for transcriptome studies (Croucher and Thomson, 2010; Marguerat and Bahler, 2010; Williams et al, 2014). [...] isoforms, identification of specific SNPs and their locations, long and small RNAs, genome-guided and de novo transcript assemblies, and start site analyses (Martin and Wang, 2011; McGettigan, 2013; Mutz et al, 2013). It enables detection of weakly expressed genes and is not limited to previously sequenced genomes (Marguerat and Bahler, 2010). Raw reads are quality filtered and trimmed and most often aligned to a reference genome; the reads mapped to individual genes in the reference genome are then counted, and the counts are used to estimate differential gene expression with a range of statistical methods (Auer and Doerge, 2010; Marguerat and Bahler, 2010; Oshlack et al, 2010).
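The counting step described above can be sketched in a few lines. This is a toy model under stated assumptions, not the tool used in the study: the function name, the coordinate convention (1-based, inclusive), the gene coordinates, and the rule of discarding reads that overlap more than one gene are all hypothetical choices, and strand is ignored for brevity.

```python
def count_reads_per_gene(genes, read_alignments):
    """Count aligned reads per gene.

    genes: dict mapping gene name -> (start, end), 1-based inclusive.
    read_alignments: list of (start, end) positions of aligned reads.
    A read is counted only if it overlaps exactly one gene; reads that
    overlap no gene or several genes are skipped (an assumed policy).
    """
    counts = {name: 0 for name in genes}
    for read_start, read_end in read_alignments:
        hits = [
            name
            for name, (gene_start, gene_end) in genes.items()
            if read_start <= gene_end and read_end >= gene_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Hypothetical coordinates for illustration only.
toy_genes = {"geneA": (100, 500), "geneB": (600, 900)}
toy_reads = [
    (120, 220),  # inside geneA
    (480, 580),  # overlaps the end of geneA only
    (550, 650),  # overlaps the start of geneB only
    (510, 590),  # intergenic, not counted
    (495, 605),  # overlaps both genes, treated as ambiguous
]
print(count_reads_per_gene(toy_genes, toy_reads))
```

The resulting per-gene counts are the kind of table that count-based statistical methods take as input when estimating differential expression.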