Abstract

The Eukaryotic Promoter Database (EPD), available online at https://epd.epfl.ch, provides accurate transcription start site (TSS) information for promoters of 15 model organisms, plus corresponding functional genomics data that can be viewed in a genome browser, queried or analyzed via web interfaces, or exported in standard formats (FASTA, BED, CSV) for subsequent analysis with other tools. Recent work has focused on improving the EPD promoter viewers, which use the UCSC Genome Browser as the visualization platform. Thousands of high-resolution tracks for CAGE, ChIP-seq and similar data have been generated and organized into public track hubs. Customized, reproducible promoter views, combining EPD-supplied tracks with native UCSC Genome Browser tracks, can be accessed from the organism summary pages or from individual promoter entries. Moreover, thanks to recent improvements and stabilization of ncRNA gene catalogs, we were able to release promoter collections for certain classes of ncRNAs from human and mouse. Furthermore, we developed automatic computational protocols to assign orphan TSS peaks to downstream genes based on paired-end (RAMPAGE) TSS mapping data, which enabled us to add nearly 9000 new entries to the human promoter collection. Since our last article in this journal, EPD has been extended to five more model organisms: rhesus monkey, rat, dog, chicken and Plasmodium falciparum.
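The abstract does not describe the protocol used to assign orphan TSS peaks, so the following is only a minimal sketch of the general idea, under stated assumptions: each paired-end read pair contributes a 5′ end (the candidate TSS) and a mate position, and a peak is assigned to the downstream gene whose annotated exons contain the most mates. The function name, the input layout, the `min_support` threshold and all example coordinates are hypothetical and are not taken from EPD.

```python
from collections import Counter, defaultdict


def is_downstream(strand, peak_start, peak_end, pos):
    """True if pos lies downstream of the peak in the direction of transcription."""
    return pos > peak_end if strand == "+" else pos < peak_start


def assign_orphan_peaks(read_pairs, peaks, exons, min_support=2):
    """Assign each orphan TSS peak to the gene whose exons contain most mates.

    read_pairs: iterable of (chrom, strand, five_prime_pos, mate_pos)
    peaks:      dict peak_id -> (chrom, strand, start, end), inclusive coordinates
    exons:      dict gene_id -> list of (chrom, strand, start, end)
    All three layouts are illustrative assumptions, not an EPD format.
    """
    votes = defaultdict(Counter)
    for chrom, strand, tss_pos, mate_pos in read_pairs:
        for peak_id, (p_chrom, p_strand, p_start, p_end) in peaks.items():
            # The 5' end of the pair must fall inside the peak, on the same strand.
            if (chrom, strand) != (p_chrom, p_strand):
                continue
            if not p_start <= tss_pos <= p_end:
                continue
            # The mate must lie downstream of the peak to link it to a gene.
            if not is_downstream(strand, p_start, p_end, mate_pos):
                continue
            for gene_id, gene_exons in exons.items():
                if any(
                    e_chrom == chrom and e_strand == strand and e_start <= mate_pos <= e_end
                    for e_chrom, e_strand, e_start, e_end in gene_exons
                ):
                    votes[peak_id][gene_id] += 1

    assignments = {}
    for peak_id, counter in votes.items():
        gene_id, support = counter.most_common(1)[0]
        # Require a minimum number of supporting read pairs (hypothetical threshold).
        if support >= min_support:
            assignments[peak_id] = gene_id
    return assignments


# Toy example with made-up coordinates.
peaks = {"peak1": ("chr1", "+", 1000, 1010)}
exons = {
    "GENE_A": [("chr1", "+", 5000, 5200)],
    "GENE_B": [("chr1", "+", 9000, 9100)],
}
read_pairs = [
    ("chr1", "+", 1003, 5050),
    ("chr1", "+", 1005, 5100),
    ("chr1", "+", 1007, 9050),
]
result = assign_orphan_peaks(read_pairs, peaks, exons)
```

In this toy input, two read pairs link the peak to GENE_A and one to GENE_B, so the peak is assigned to GENE_A. The nested loops are adequate for illustration only; a genome-scale version would need an interval index.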

Highlights

  • The Eukaryotic Promoter Database (EPD) was created in 1986 and first published as a table in a journal article [1]

  • The primary goal of EPD has always been to provide computational and bench biologists with accurate transcription start site (TSS) annotation based on all experimental evidence available at a given time

  • Until about 10 years ago, EPD was a manually curated database derived from experiments published in journal articles

Introduction

The Eukaryotic Promoter Database (EPD) was created in 1986 and first published as a table in a journal article [1]. We produce comprehensive, organism-specific promoter collections in a completely automatic fashion from high-throughput transcript mapping data and high-quality gene annotation resources. The current human promoter collection was derived from about 39 trillion (!) sequenced mRNA 5′ ends from ENCODE [3] and FANTOM5 [4], using GENCODE [5] as the gene annotation resource.
