Abstract

Background

Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology.

In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated.

Results

galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at .

Conclusions

By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and additional information on the query EST is needed.
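
The "sieve" idea in the Results (using BLAST hits to decide which sequences enter the phylogenetic analyses) can be illustrated with a minimal Python sketch. It is not galaxieEST's own code, which is written in Perl and whose selection criteria are not given here. The sketch assumes BLAST+ tabular output (`-outfmt 6`, default columns); the function name `sieve_hits`, the E-value cutoff, the hit limit, and the example rows are all hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def sieve_hits(tabular_text, max_evalue=1e-5, max_hits=3):
    """Return up to max_hits distinct subject IDs, best bit score first.

    The thresholds are illustrative assumptions, not galaxieEST defaults.
    """
    best = {}  # subject ID -> best bit score seen for that subject
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue  # too weak to be worth aligning
        subject = hit["sseqid"]
        if subject not in best or bitscore > best[subject]:
            best[subject] = bitscore
    ranked = sorted(best, key=lambda s: best[s], reverse=True)
    return ranked[:max_hits]


# Made-up hits for one query EST; seqA appears twice (two local alignments).
EXAMPLE = "\n".join([
    "est1\tseqA\t98.0\t200\t4\t0\t1\t200\t10\t209\t1e-90\t350",
    "est1\tseqB\t85.0\t180\t27\t0\t5\t184\t1\t180\t2e-40\t160",
    "est1\tseqA\t80.0\t90\t18\t0\t210\t299\t300\t389\t1e-20\t90",
    "est1\tseqC\t60.0\t40\t16\t0\t1\t40\t1\t40\t0.5\t30",
    "est1\tseqD\t75.0\t100\t25\t0\t1\t100\t1\t100\t3e-10\t60",
    "est1\tseqE\t70.0\t80\t24\t0\t1\t80\t1\t80\t1e-6\t48",
])

selected = sieve_hits(EXAMPLE)
```

The selected subject IDs would then be fetched and aligned with the query before tree building; those later steps are outside the scope of this sketch.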

Highlights

  • Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories

  • By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove useful in cases where similarity searches return one or more pertinent, but not full, matches and additional information on the query EST is needed

  • Many newly sequenced ESTs remain poorly matched after traditional similarity searches against available databases. In these cases it can help to set up separate BLAST runs against different databases, such as EST-nucleotide, EST-EST, and EST-protein. galaxieEST automates these searches for any given query EST
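
As a rough illustration of running one query EST against several databases, the Python sketch below builds one command line per search type. It assumes the NCBI BLAST+ command-line programs (`blastn`, `blastx`) and their `-query`, `-db`, `-evalue`, `-outfmt`, and `-out` options; the database names, the pairing of programs with search types, and the output file names are hypothetical and do not describe galaxieEST's actual invocation.

```python
# Hypothetical mapping of search type -> (BLAST+ program, local database name).
# Replace the database names with locally formatted BLAST databases.
SEARCHES = {
    "EST-nucleotide": ("blastn", "nucleotide_db"),
    "EST-EST": ("blastn", "est_db"),
    "EST-protein": ("blastx", "protein_db"),  # translated query vs proteins
}


def build_commands(query_fasta, evalue="1e-5"):
    """Return one BLAST+ argument list per search type for a query file."""
    commands = {}
    for label, (program, database) in SEARCHES.items():
        commands[label] = [
            program,
            "-query", query_fasta,
            "-db", database,
            "-evalue", evalue,
            "-outfmt", "6",           # tabular output, one hit per line
            "-out", f"{label}.tsv",   # assumed output naming scheme
        ]
    return commands


commands = build_commands("query_est.fasta")
for label, argv in commands.items():
    print(label, "->", " ".join(argv))
```

Each argument list could be passed to `subprocess.run(argv, check=True)` on a machine where BLAST+ and the databases are installed; the sketch only constructs the commands so that it runs without them.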

Summary

Conclusions

The elucidation of sequence identity, origin, and properties is best done in an evolutionary context. The Perl-CGI package galaxieEST was written for that purpose: it is an attempt to employ automated phylogenetic analysis in the EST identification process. It is freely available as an on-line web service for fungal ESTs and for download / local installation for use with any organism group.

Project name: galaxieEST – addressing EST identity through automated phylogenetic analysis.

RHN wrote large parts of the Perl source and the MySQL interface. BR initiated the project and contributed advice on fungal ESTs and their processing. K-HL contributed ideas on sequence sampling and phylogenetic analysis. BMU was responsible for the web interface and the scientific aspects of the scripts.
