Abstract

Background: The FANTOM5 consortium used Cap Analysis of Gene Expression (CAGE) tag sequencing to produce a comprehensive atlas of promoters and enhancers within the human and mouse genomes. We reasoned that the mapping of these regulatory elements to the pig genome could provide useful annotation and evidence to support assignment of orthology.

Results: For human transcription start sites (TSS) associated with annotated human-mouse orthologs, 17% mapped to the pig genome but not to the mouse, 10% mapped only to the mouse, and 55% mapped to both pig and mouse. Around 17% did not map to either species. The mapping percentages were lower where there was no clear orthology relationship, but in every case, mapping to pig was greater than to mouse, and the degree of homology was also greater. Combined mapping of mouse and human CAGE-defined promoters identified at least one putative conserved TSS for >16,000 protein-coding genes. About 54% of the predicted locations of regulatory elements in the pig genome were supported by CAGE and/or RNA-Seq analysis from pig macrophages.

Conclusions: Comparative mapping of promoters and enhancers from humans and mice can provide useful preliminary annotation of other animal genomes. The data also confirm extensive gain and loss of regulatory elements between species, and the likelihood that pigs provide a better model than mice for human gene regulation and function.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2111-2) contains supplementary material, which is available to authorized users.

Highlights

  • The Functional Annotation of Mammalian Genomes 5 (FANTOM5) consortium used Cap Analysis of Gene Expression (CAGE) tag sequencing to produce a comprehensive atlas of promoters and enhancers within the human and mouse genomes

  • Within the Functional Annotation of Mammalian Genomes 5 (FANTOM5) consortium, we have noted that single nucleotide variants (SNVs) that are associated with disease susceptibility are strongly enriched within the −300 to +100 window surrounding transcription start sites (TSS) [8], perhaps reflecting the fact that these regions must be in open chromatin and are flanked by a positioned nucleosome [11]



Summary

Results

Identification of conserved promoters in the pig

In a systematic comparison of CAGE-defined mouse promoters with the promoters of human and dog orthologous genes, the average level of conservation peaked around the TSS at around 85% in rat and 65% in human, and declined rapidly towards the genomic background at around −300 and +100 relative to the TSS [9].

These uniquely mapped CAGE reads (48,309) might correspond to new candidate TSS in human that were not identified in the FANTOM5 project, either because they were not expressed in the cells, tissues or time-points studied or because they failed the criteria for inclusion in the robust set of promoters [9].

A total of 22,209 Unigene sequences were expressed (>= 1 FPKM) in at least one of the macrophage libraries, of which 1,974 (9%) were expressed only in LPS-treated cells. This number reduces to 1,938 (9%) after removing those Unigene sequences that can be associated with an Ensembl ID, based on mapping all the unique sequences in the Snowball array probe sets [35] to the pig genome and overlapping those mapped locations with the latest available set of gene annotations (Ensembl API v77).
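The expression filter described above (>= 1 FPKM in at least one macrophage library, and the subset expressed only in LPS-treated cells) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the sequence identifiers, the library names (`ctrl`, `lps`), the FPKM values and the function names are all hypothetical, and the sketch assumes that "only expressed in LPS-treated cells" means reaching the threshold in an LPS library while staying below it in every untreated library.

```python
from typing import Dict, Iterable, Set

FPKM_THRESHOLD = 1.0  # the ">= 1 FPKM" cut-off quoted in the text

# Hypothetical toy data: sequence ID -> {library name -> FPKM}.
toy_fpkm: Dict[str, Dict[str, float]] = {
    "seq_a": {"ctrl": 5.0, "lps": 7.0},  # expressed in both conditions
    "seq_b": {"ctrl": 0.2, "lps": 3.1},  # expressed only after LPS
    "seq_c": {"ctrl": 0.0, "lps": 0.4},  # below threshold everywhere
    "seq_d": {"ctrl": 1.0, "lps": 0.0},  # expressed only in untreated cells
}


def expressed_in_any(
    fpkm: Dict[str, Dict[str, float]],
    libraries: Iterable[str],
    threshold: float = FPKM_THRESHOLD,
) -> Set[str]:
    """Return sequences reaching the threshold in at least one given library."""
    libs = list(libraries)
    return {
        seq
        for seq, values in fpkm.items()
        if any(values.get(lib, 0.0) >= threshold for lib in libs)
    }


def lps_only(
    fpkm: Dict[str, Dict[str, float]],
    lps_libraries: Iterable[str],
    control_libraries: Iterable[str],
    threshold: float = FPKM_THRESHOLD,
) -> Set[str]:
    """Return sequences expressed in an LPS library but in no control library."""
    in_lps = expressed_in_any(fpkm, lps_libraries, threshold)
    in_control = expressed_in_any(fpkm, control_libraries, threshold)
    return in_lps - in_control


expressed = expressed_in_any(toy_fpkm, ["ctrl", "lps"])
induced = lps_only(toy_fpkm, ["lps"], ["ctrl"])
fraction_induced = len(induced) / len(expressed)
```

On the real data the same ratio would be 1,974 / 22,209, which rounds to the 9% quoted in the text.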

