Abstract

Background

The interacting residues of protein and nucleic acid sequences are close to each other – they are co-located. Structure databases (such as the Protein Data Bank, PDB, and the Nucleic Acid Data Bank, NDB) contain all the information about these co-locations; however, extracting it from such complex data is not easy. We developed a Java tool, called SeqX, for this purpose.

Results

The SeqX tool detects, analyzes and visualizes residue co-locations in protein and nucleic acid structures. The user

  a. selects a structure from PDB;
  b. chooses an atom that is present in every residue of the nucleic acid and/or protein structure(s);
  c. defines a distance from these atoms (3–15 Å).

The SeqX tool detects every residue that is located within the defined distance from the defined "backbone" atom(s), provides a DotPlot-like visualization (Residues Contact Map), and calculates the frequency of every possible residue pair (Residue Contact Table) in the observed structure. It is possible to exclude +/- 1 to 10 neighbor residues in the same polymeric chain from detection, which greatly improves the specificity of detections (up to 60% when tested on dsDNA). Results obtained on protein structures showed highly significant correlations with results obtained from the literature (p < 0.0001, n = 210, four different subsets). The co-location frequency of physico-chemically compatible amino acids is significantly higher than is calculated and expected in random protein sequences (p < 0.0001, n = 80).

Conclusion

The tool is simple and easy to use and provides quick and reliable visualization and analysis of residue co-locations in protein and nucleic acid structures.

Availability and requirements

SeqX, Java J2SE Runtime Environment 5.0 (available from [see Additional file 1]), at least a 1 GHz processor, and a minimum of 256 MB RAM. Source code is available from the authors.

Highlights

  • The interacting residues of protein and nucleic acid sequences are close to each other – they are co-located

  • The SeqX tool detects every residue that is located within the defined distance from the defined "backbone" atom(s), provides a DotPlot-like visualization (Residues Contact Map), and calculates the frequency of every possible residue pair (Residue Contact Table) in the observed structure

  • Some scientists argue that macromolecular interactions are determined by long sequence domains involving many residues, while others have found that there is some degree of specificity already at the single-residue level, i.e. some residue pairs are preferentially co-located on interacting interfaces
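
The detection step described in the abstract (one reference atom per residue, a distance cutoff, optional exclusion of sequence neighbors in the same chain) can be illustrated with a minimal Python sketch. This is not the SeqX source code, which is written in Java and is not reproduced here. The function names, the tuple layout `(chain, position, residue_name, (x, y, z))`, and the toy coordinates are all assumptions made for illustration; a real run would first read the chosen atom's coordinates from a PDB file.

```python
from collections import Counter
from itertools import combinations
from math import dist

# Assumed input layout (hypothetical, not the SeqX data model):
# (chain id, position in chain, residue name, xyz of the chosen reference atom)

def find_colocations(residues, cutoff=6.0, exclude_neighbors=0):
    """Return index pairs (i, j), i < j, whose reference atoms lie within
    `cutoff` angstroms. Pairs in the same chain whose positions differ by
    at most `exclude_neighbors` are skipped."""
    contacts = []
    for (i, a), (j, b) in combinations(enumerate(residues), 2):
        same_chain = a[0] == b[0]
        if same_chain and abs(a[1] - b[1]) <= exclude_neighbors:
            continue  # sequence neighbors are close by construction
        if dist(a[3], b[3]) <= cutoff:
            contacts.append((i, j))
    return contacts

def contact_table(residues, contacts):
    """Count how often each unordered residue-name pair is co-located."""
    table = Counter()
    for i, j in contacts:
        pair = tuple(sorted((residues[i][2], residues[j][2])))
        table[pair] += 1
    return table

def contact_map(n, contacts):
    """Return a symmetric, dot-plot-like text grid: '#' marks a contact."""
    grid = [["." for _ in range(n)] for _ in range(n)]
    for i, j in contacts:
        grid[i][j] = grid[j][i] = "#"
    return ["".join(row) for row in grid]

# Toy coordinates, invented for the example (not taken from any PDB entry).
toy = [
    ("A", 1, "LYS", (0.0, 0.0, 0.0)),
    ("A", 2, "GLY", (3.8, 0.0, 0.0)),
    ("A", 3, "ALA", (7.6, 0.0, 0.0)),
    ("A", 10, "GLU", (0.0, 5.0, 0.0)),
    ("B", 1, "ASP", (3.8, 4.0, 0.0)),
]

if __name__ == "__main__":
    pairs = find_colocations(toy, cutoff=6.0, exclude_neighbors=1)
    print("\n".join(contact_map(len(toy), pairs)))
    for pair, count in sorted(contact_table(toy, pairs).items()):
        print(pair, count)
```

Setting `exclude_neighbors=0` keeps adjacent residues of the same chain, which are nearly always within the cutoff simply because they are covalently linked; raising it removes those trivial contacts, which is the effect the abstract describes as improved specificity. The all-pairs loop is quadratic in the number of residues, which is acceptable for a sketch but would likely be replaced by a spatial grid or tree for large structures.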


Summary

Introduction

The interacting residues of protein and nucleic acid sequences are close to each other – they are co-located. Structure databases (such as the Protein Data Bank, PDB, and the Nucleic Acid Data Bank, NDB) contain all the information about these co-locations; however, extracting it from such complex data is not easy. Specific protein-protein and protein-nucleic acid interactions are the focus of many biochemical studies. Some scientists argue that macromolecular interactions are determined by long sequence domains involving many residues, while others have found that there is some degree of specificity already at the single-residue level, i.e. some residue pairs are preferentially co-located on interacting interfaces. The existence of preferred residue pairs within, as well as between, macromolecular structures is supported by numerous statistical analyses of protein-RNA [1], regulatory protein-DNA [2], restriction enzyme-DNA cut site [3], and protein-protein [4-8] structures and interfaces. Although many studies have performed statistical analyses of residue co-location, we could not find a publicly available tool for this purpose. We found only a reference to a commercially available tool, the QUANTA modeling software [9,10].

