Abstract

Unaligned amino acid sequences can be characterized by their composition of amino acid n-tuples (i.e., doublets, triplets, quadruplets, etc.). In this study we used a set of G-protein-coupled receptor (GPCR) sequences to investigate the performance of two statistics derived from n-tuple counts, termed commonality and specificity. The commonality of a tuple is defined as its relative occurrence in the sequences that belong to a given GPCR subtype. The specificity of a tuple is derived from its relative occurrence in the sequences of a given GPCR subtype and from its relative non-occurrence in the sequences that do not belong to this subtype. A graphical presentation, termed 'polygram', is described for the visualization of common and specific tuples. The method can be applied to the classification of unknown GPCR sequences. It can also be applied to the identification of fragments of GPCRs, such as may occur in chimeric receptors. The method is generally applicable to other protein families and other types of coding.
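The two statistics can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the choices here are assumptions: commonality is taken as the fraction of a subtype's sequences that contain the tuple at least once, and specificity as that fraction multiplied by the fraction of non-subtype sequences that lack the tuple. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
from typing import List, Set


def tuples_of(seq: str, n: int) -> Set[str]:
    """Return the set of distinct overlapping n-tuples in a sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def commonality(tup: str, seqs: List[str]) -> float:
    """Fraction of sequences containing the tuple at least once.

    Assumption: 'relative occurrence' is counted per sequence, not per position.
    """
    if not seqs:
        return 0.0
    n = len(tup)
    return sum(tup in tuples_of(s, n) for s in seqs) / len(seqs)


def specificity(tup: str, in_group: List[str], out_group: List[str]) -> float:
    """Assumed form: occurrence inside the subtype times non-occurrence outside it."""
    return commonality(tup, in_group) * (1.0 - commonality(tup, out_group))


# Made-up toy sequences, for illustration only (not real GPCR data).
subtype_a = ["MKWLA", "AKWLT", "GKWLC"]
others = ["MKTLA", "AWLTT"]

for doublet in ("KW", "WL", "LA"):
    print(doublet,
          round(commonality(doublet, subtype_a), 3),
          round(specificity(doublet, subtype_a, others), 3))
```

In this toy set the doublet "KW" appears in every subtype sequence and in none of the others, so it is both common and specific. "WL" is equally common but also appears in one of the two outside sequences, which halves its specificity under the assumed formula.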
