JESAM: CORBA software components to create and publish EST alignments and clusters.

J D Parsons, P Rodriguez-Tomé

doi:10.1093/bioinformatics/16.4.313

Abstract

Expressed Sequence Tags (ESTs) are cheap, easy and quick to obtain relative to full genomic sequencing and currently sample more eukaryotic genes than any other data source. They are particularly useful for developing Sequence Tag Sites (STSs for mapping), polymorphism discovery, disease gene hunting, mass spectrometer proteomics and, most ironically, for finding genes and predicting gene structure after the great effort of genomic sequencing. However, ESTs have many problems, and the public EST databases contain all the errors and high redundancy intrinsic to the submitted data. Derived database views, which reduce both errors and redundancy, are therefore often more effective starting points for research than the original raw submissions. Existing derived views such as EST cluster databases and consensus databases have never published supporting evidence or intermediary results, which makes the final published database difficult to trust, correct, and customize. These difficulties have led many groups to wastefully repeat the complex intermediary work of others in order to offer slightly different final views. A better approach might be to discover the most expensive common calculations used by all the approaches and then publish all intermediary results. Given a globally accessible database with a suitable component interface, like the JESAM software described in this paper, the creation of customized EST-derived databases could be achieved with minimum effort.

Databases of EST and full-length mRNA sequences for four model organisms have been self-compared by searching for overlaps consistent with contiguity. The sequence comparisons are performed in parallel using a PVM process farm, and previous results are stored to allow incremental updates with minimal effort. The overlap databases have been published with CORBA interfaces to enable flexible global access, as demonstrated by example Java applet browsers.
Simple cDNA supercluster databases built as alignment database clients are themselves published via CORBA interfaces browsable with prototypical applets. A comparison with the UniGene Mouse and Rat databases revealed undesirable features in both and showed the advantages of contrasting perspectives on complex data. The software is packaged as two Jar files available from http://corba.ebi.ac.uk/EST/jesam/jesam.html. One Jar contains all the Java source code, and the other contains all the C, C++ and IDL code. Links to working examples of the alignment and cluster viewers (if the remote firewall permits) can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.
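
To illustrate how a cluster database can be built as a client of a pairwise overlap database, the following is a minimal Java sketch. It is an assumption for illustration, not the JESAM code: the abstract does not specify the clustering rule, so the sketch assumes that any two sequences joined by a chain of overlaps belong to the same supercluster, and implements that with a union-find structure. The class name `OverlapClusters`, its methods, and the sequence identifiers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: groups sequence IDs by transitive overlap (not the JESAM code). */
final class OverlapClusters {
    // Maps each sequence ID to its parent in the union-find forest.
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of the set containing id, registering id if unseen. */
    String find(String id) {
        parent.putIfAbsent(id, id);
        String root = id;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        // Path compression: point every node on the walked path directly at the root.
        String cur = id;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    /** Records one pairwise overlap, merging the two clusters if they differ. */
    void addOverlap(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (!ra.equals(rb)) {
            parent.put(ra, rb);
        }
    }

    /** True if a and b are connected by a chain of recorded overlaps. */
    boolean sameCluster(String a, String b) {
        return find(a).equals(find(b));
    }

    /** Returns each cluster as a sorted set of member IDs, keyed by representative. */
    Map<String, Set<String>> clusters() {
        Map<String, Set<String>> out = new HashMap<>();
        // Iterate over a copy because find() rewrites parent links.
        List<String> ids = new ArrayList<>(parent.keySet());
        for (String id : ids) {
            out.computeIfAbsent(find(id), k -> new TreeSet<>()).add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        OverlapClusters c = new OverlapClusters();
        c.addOverlap("est1", "est2");   // hypothetical overlap records
        c.addOverlap("est2", "mrnaA");
        c.addOverlap("est3", "est4");
        System.out.println(c.clusters().values());
    }
}
```

Because each call to `addOverlap` only merges sets, overlaps found in a later database update can be added to an existing instance without recomputing earlier results, which is in the spirit of the incremental updates described above. A real client would read overlap records through the published CORBA interface rather than from hard-coded pairs.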

Journal: Bioinformatics	Publication Date: Apr 1, 2000
Citations: 38

