SeqX: a tool to detect, analyze and visualize residue co-locations in protein and nucleic acid structures

Jan C Biro, Gergely Fördös

doi:10.1186/1471-2105-6-170

Open Access


Abstract

Background: The interacting residues of protein and nucleic acid sequences are close to each other; that is, they are co-located. Structure databases (such as the Protein Data Bank, PDB, and the Nucleic Acid Data Bank, NDB) contain all the information about these co-locations; however, this complex information is not easy to explore. We developed a Java tool, called SeqX, for this purpose.

Results: SeqX detects, analyzes and visualizes residue co-locations in protein and nucleic acid structures. The user

a. selects a structure from PDB;
b. chooses an atom that is present in every residue of the nucleic acid and/or protein structure(s);
c. defines a distance from these atoms (3–15 Å).

SeqX then detects every residue located within the defined distance of the chosen "backbone" atom(s), provides a DotPlot-like visualization (Residues Contact Map), and calculates the frequency of every possible residue pair (Residue Contact Table) in the observed structure. It is possible to exclude the +/- 1 to 10 neighboring residues in the same polymeric chain from detection, which greatly improves the specificity of detection (by up to 60% when tested on dsDNA). Results obtained on protein structures correlated highly significantly with results reported in the literature (p < 0.0001, n = 210, four different subsets). The co-location frequency of physico-chemically compatible amino acids is significantly higher than the frequency calculated and expected for random protein sequences (p < 0.0001, n = 80).

Conclusion: The tool is simple and easy to use, and it provides quick and reliable visualization and analysis of residue co-locations in protein and nucleic acid structures.

Availability and requirements: SeqX, Java J2SE Runtime Environment 5.0 (available from [see Additional file 1]), and at least a 1 GHz processor with a minimum of 256 MB RAM. Source code is available from the authors.
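The detection steps listed in the Results can be illustrated with a minimal Java sketch. This is not the SeqX source code: the class and method names (`ContactSketch`, `findContacts`, `contactTable`), the toy coordinates, and the reading of "exclude +/- n neighbor residues" as skipping same-chain pairs whose sequence numbers differ by at most n are all assumptions made for illustration. Parsing of PDB files is left out; each residue is assumed to be already reduced to the coordinates of the chosen reference atom.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ContactSketch {

    /** One residue, reduced to its chosen reference atom (hypothetical type). */
    static final class Residue {
        final String chain;
        final int seqNum;
        final String name;
        final double x, y, z;

        Residue(String chain, int seqNum, String name, double x, double y, double z) {
            this.chain = chain;
            this.seqNum = seqNum;
            this.name = name;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Returns index pairs (i < j) whose reference atoms lie within cutoff (in angstroms). */
    static List<int[]> findContacts(List<Residue> residues, double cutoff, int excludeNeighbors) {
        List<int[]> contacts = new ArrayList<>();
        double cutoffSq = cutoff * cutoff;
        for (int i = 0; i < residues.size(); i++) {
            Residue a = residues.get(i);
            for (int j = i + 1; j < residues.size(); j++) {
                Residue b = residues.get(j);
                // Assumed exclusion rule: skip sequence neighbors in the same chain.
                if (a.chain.equals(b.chain) && Math.abs(a.seqNum - b.seqNum) <= excludeNeighbors) {
                    continue;
                }
                double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
                // Compare squared distances to avoid a square root per pair.
                if (dx * dx + dy * dy + dz * dz <= cutoffSq) {
                    contacts.add(new int[] {i, j});
                }
            }
        }
        return contacts;
    }

    /** Counts contacts per unordered residue-type pair, e.g. "ALA-ASP". */
    static Map<String, Integer> contactTable(List<Residue> residues, List<int[]> contacts) {
        Map<String, Integer> table = new TreeMap<>();
        for (int[] c : contacts) {
            String a = residues.get(c[0]).name;
            String b = residues.get(c[1]).name;
            String key = a.compareTo(b) <= 0 ? a + "-" + b : b + "-" + a;
            table.merge(key, 1, Integer::sum);
        }
        return table;
    }

    public static void main(String[] args) {
        // Toy coordinates, invented for this example only.
        List<Residue> residues = new ArrayList<>();
        residues.add(new Residue("A", 1, "ALA", 0.0, 0.0, 0.0));
        residues.add(new Residue("A", 2, "GLY", 3.8, 0.0, 0.0));
        residues.add(new Residue("A", 3, "LEU", 7.6, 0.0, 0.0));
        residues.add(new Residue("A", 10, "ASP", 0.0, 5.0, 0.0));

        List<int[]> contacts = findContacts(residues, 6.0, 1);
        System.out.println(contactTable(residues, contacts));
    }
}
```

In this toy input, the directly bonded neighbors (residues 1-2 and 2-3) lie within the 6 Å cutoff only because they are adjacent in the chain. With `excludeNeighbors` set to 1 they are skipped and only the sequence-distant ALA-ASP pair is kept, which is the kind of specificity gain the neighbor exclusion is meant to provide. The list of index pairs is also what a contact-map plot would draw as dots.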

Highlights

The interacting residues of protein and nucleic acid sequences are close to each other – they are co-located
The SeqX tool detects every residue that is located within the defined distance of the chosen "backbone" atom(s), provides a DotPlot-like visualization (Residues Contact Map), and calculates the frequency of every possible residue pair (Residue Contact Table) in the observed structure
Some scientists argue that macromolecular interactions are determined by long sequence domains involving many residues, while others have found some degree of specificity already at the single-residue level, i.e. some residue pairs are preferentially co-located on interacting interfaces

Summary

Introduction

The interacting residues of protein and nucleic acid sequences are close to each other; that is, they are co-located. Structure databases (such as the Protein Data Bank, PDB, and the Nucleic Acid Data Bank, NDB) contain all the information about these co-locations; however, this complex information is not easy to explore. Specific protein-protein and protein-nucleic acid interactions are the focus of many biochemical studies. Some scientists argue that macromolecular interactions are determined by long sequence domains involving many residues, while others have found some degree of specificity already at the single-residue level, i.e. some residue pairs are preferentially co-located on interacting interfaces. The existence of preferred residue pairs within, as well as between, macromolecular structures is supported by numerous statistical analyses of protein-RNA [1], regulatory protein-DNA [2], restriction enzyme-DNA cut site [3], and protein-protein [4-8] structures and interfaces. Although many studies have performed statistical analyses of residue co-location, we could not find a publicly available tool for this purpose. We found only a reference to a commercially available tool, the QUANTA modeling software [9,10].

Results

Discussion

Conclusion