BRAGOMAP - a new Perl script for high-throughput Blast results analysis including GO and MapMan automatic annotations

Rafal Woycicki, Zbigniew Przybecki, Wojciech Gutman

doi:10.1038/npre.2009.3900.1

Abstract

Analysis of sequence similarity is the first and most important method used to infer the function of unknown nucleotide sequences. The search for homologs should be done carefully so as not to lose any important ones. With thousands of results from various long-read sequencing projects (e.g. differentially expressed tags, genomic polymorphisms or BAC ends), the ability to retrieve by hand the similarities of interest from hundreds of Blast results decreases rapidly. Reducing the number of retrieved sequences by applying a more stringent e-value threshold or displaying fewer results could lead to false deductions. Functional genomics, proteomics and metabolomics could give us answers about the role of nucleotide sequences. This creates the need to annotate as many of the homologies as possible with the proper molecular function, biological process and cellular component (as proposed by the widely accepted Gene Ontology Consortium annotations or the MapMan mappings from the Max Planck Institute).

To facilitate fast retrieval of interesting Blast homologies and sound deductions about the biological role of sequences in large sequencing projects, the new Perl script BRAGOMAP was written. The program makes use of several BioPerl modules as well as the power of regex text-mining in Perl itself.

The script makes it possible to find interesting sequence similarities by using keywords and awarding points for each one found. It collects all important information from the GenBank data and puts it in separate columns of a tab-delimited file for further use. If we were interested, for example, in flower differentiation genes, we could use keywords (flower, ovule, anther, etc.) and/or filter all the homologous sequences isolated from flower tissues at a particular developmental stage. We can also filter results by choosing similarities to genes or protein products of interest.

The script also retrieves all standard information from the Blast and GenBank files, such as Description, ACC no., E-value, Similarity positions, Query Length, Percent of Similarity, etc. Automatic GO and MapMan annotations are made by looking up genes, protein products and/or DB references in the appropriate mapping files. Here we present the usefulness of the script in analyzing sequence similarities and mapping annotations for 3855 BAC ends obtained from the HindIII BAC genomic library of cucumber (Cucumis sativus L., line B10).
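The keyword scoring described in the abstract can be illustrated with a short sketch. It is written in Python for brevity and is not BRAGOMAP's Perl code. The scoring rule (one point per keyword, matched case-insensitively as a whole word), the column names, the accessions and the hit descriptions are all assumptions made up for this example.

```python
import csv
import io
import re


def score_hit(description, keywords):
    """Return (points, matched_keywords) for one hit description.

    Assumed rule: one point per keyword that occurs in the description,
    matched case-insensitively as a whole word.
    """
    matched = [
        kw for kw in keywords
        if re.search(r"\b" + re.escape(kw) + r"\b", description, re.IGNORECASE)
    ]
    return len(matched), matched


def hits_to_tsv(hits, keywords):
    """Write one tab-delimited row per hit, with its keyword score."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Hypothetical column layout, chosen only for this illustration.
    writer.writerow(["accession", "evalue", "points", "keywords", "description"])
    for hit in hits:
        points, matched = score_hit(hit["description"], keywords)
        writer.writerow([hit["accession"], hit["evalue"], points,
                         ",".join(matched), hit["description"]])
    return out.getvalue()


KEYWORDS = ["flower", "ovule", "anther"]

# Made-up hits; accessions and descriptions are placeholders, not real records.
HITS = [
    {"accession": "HYP0001", "evalue": "1e-40",
     "description": "MADS-box protein expressed in flower and ovule primordia"},
    {"accession": "HYP0002", "evalue": "3e-12",
     "description": "ribosomal protein L3"},
]

print(hits_to_tsv(HITS, KEYWORDS))
```

Sorting or filtering the resulting rows on the points column would then bring the hits most relevant to the chosen keywords to the top, which is the kind of use the abstract describes.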

Highlights

If we were interested in flower differentiation genes, we could use keywords and/or filter all the homologous sequences isolated from flower tissues at a particular developmental stage
Automatic annotations are made by looking up genes, protein products and/or DB references in the appropriate mapping files (Table 3)
*** Number of all collected points *** 6
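The annotation lookup mentioned in the highlights can be sketched as a dictionary lookup against a mapping file. This is a Python illustration under stated assumptions, not the script's actual implementation. The two-column "identifier, tab, annotation" layout, the field names `gene`, `product` and `db_xrefs`, and every identifier below are hypothetical placeholders; real GO and MapMan mapping files may be laid out differently.

```python
def load_mapping(lines):
    """Parse 'identifier<TAB>annotation' lines into {identifier: [annotations]}.

    Assumed format: one pair per line; blank lines, '#' comment lines
    and lines without a tab are skipped.
    """
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        key, sep, annotation = line.partition("\t")
        if not sep:
            continue
        mapping.setdefault(key.strip().lower(), []).append(annotation.strip())
    return mapping


def annotate(hit, mapping):
    """Collect annotations by trying the gene, the product, then each DB reference."""
    candidates = [hit.get("gene"), hit.get("product"), *hit.get("db_xrefs", [])]
    found = []
    for candidate in candidates:
        if not candidate:
            continue
        for annotation in mapping.get(candidate.lower(), []):
            if annotation not in found:  # keep first-seen order, drop duplicates
                found.append(annotation)
    return found


# Placeholder mapping content; none of these identifiers are real.
MAPPING_LINES = [
    "# identifier\tannotation",
    "gene_a\tGO:PLACEHOLDER1",
    "hypothetical kinase\tMAPMAN:placeholder.bin",
    "DB:XREF1\tGO:PLACEHOLDER1",
    "DB:XREF1\tGO:PLACEHOLDER2",
]

mapping = load_mapping(MAPPING_LINES)
hit = {"gene": "GENE_A", "product": "hypothetical kinase", "db_xrefs": ["DB:XREF1"]}
print(annotate(hit, mapping))
```

Lower-casing the keys on both sides is a design choice made here so that differences in capitalisation between GenBank records and mapping files do not cause missed matches.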

Summary

Introduction

Analysis of sequence similarity is the first and most important method used to infer the function of unknown nucleotide sequences. With thousands of results from various long-read sequencing projects (differentially expressed tags, genomic polymorphisms or BAC ends), retrieving by hand the similarities of interest from hundreds of thousands of Blast results is practically impossible. Reducing the number of retrieved sequences by applying a more stringent e-value threshold or displaying fewer results would lead to false deductions.

Journal: Nature Precedings	Publication Date: Oct 26, 2009
License type: CC BY 3.0
