Abstract

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and data files for download.

Highlights

  • Ensembl enables genome science by systematically integrating, harmonizing and presenting data in a consistent manner, both via a web interface and via application programming interfaces (APIs) [1,2,3]

  • To classify our genes for homology analysis across all phyla, we have developed a new approach using profile hidden Markov models (HMMs)

  • We have developed a prototype RESTful API at http://test-metadata.ensembl.org/ to aid in finding available resources in Ensembl

Introduction

Ensembl enables genome science by systematically integrating, harmonizing and presenting data in a consistent manner, both via a web interface (https://www.ensembl.org) and via application programming interfaces (APIs) [1,2,3]. We import primary data, such as assemblies and discovered variants, and annotate genes and transcripts [4], variants [5], regulatory regions [6] and comparative genomics features [7].
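As a rough illustration of programmatic access, the sketch below builds and issues a gene lookup against the RESTful web service. It rests on assumptions not stated in the text above: that the public server is https://rest.ensembl.org and that a `lookup/symbol/<species>/<symbol>` endpoint returns JSON. The helper names `build_lookup_url` and `fetch_json` are hypothetical and chosen only for this example.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the public Ensembl REST server; it is not named in the text above.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_lookup_url(species: str, symbol: str, server: str = ENSEMBL_REST) -> str:
    """Build the URL for a gene-symbol lookup (assumed lookup/symbol endpoint)."""
    # quote() percent-encodes characters that are not safe in a URL path.
    return f"{server}/lookup/symbol/{quote(species)}/{quote(symbol)}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a URL and decode the JSON body; raises on HTTP or network errors."""
    # The Content-Type header asks the service for a JSON response.
    request = Request(url, headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(build_lookup_url("homo_sapiens", "BRCA2"))` should, under these assumptions, return a dictionary describing the gene. Keeping URL construction separate from the network call makes the request easy to inspect or log before it is sent.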
