Abstract

Background: The most abundant family of insect cuticular proteins, the CPR family, is recognized by the R&R Consensus, a domain of about 64 amino acids that binds to chitin and is present throughout arthropods. Several species have now been shown to have more than 100 CPR genes, inviting speculation as to the functional importance of this large number and diversity.

Results: We have identified 156 genes in Anopheles gambiae that code for putative cuticular proteins in this CPR family, over 1% of the total number of predicted genes in this species. Annotation was verified using several criteria including identification of TATA boxes, INRs, and DPEs plus support from proteomic and gene expression analyses. Two previously recognized CPR classes, RR-1 and RR-2, form separate, well-supported clades with the exception of a small set of genes with long branches whose relationships are poorly resolved. Several of these outliers have clear orthologs in other species. Although both clades are under purifying selection, the RR-1 variant of the R&R Consensus is evolving at twice the rate of the RR-2 variant and is structurally more labile. In contrast, the regions flanking the R&R Consensus have diversified in amino-acid composition to a much greater extent in RR-2 genes compared with RR-1 genes. Many genes are found in compact tandem arrays that may include similar or dissimilar genes but always include just one of the two classes. Tandem arrays of RR-2 genes frequently contain subsets of genes coding for highly similar proteins (sequence clusters). Properties of the proteins indicated that each cluster may serve a distinct function in the cuticle.

Conclusion: The complete annotation of this large gene family provides insight on the mechanisms of gene family evolution and clues about the need for so many CPR genes. These data also should assist annotation of other Anopheles genes.


Introduction

The most abundant family of insect cuticular proteins, the CPR family, is recognized by the R&R Consensus, a domain of about 64 amino acids that binds to chitin and is present throughout arthropods. Several species have been shown to have more than 100 CPR genes, inviting speculation as to the functional importance of this large number and diversity. While chitin is a simple polymer of N-acetylglucosamine, the cuticular proteins are numerous (see [2,3] for review). The vast majority of cuticular protein sequences presently available [...]. Throughout this paper, we will use the term R&R Consensus to refer to the extended Consensus and CPR to refer to the family of genes/proteins with this Consensus. The Consensus, with about 64 amino acids, almost always begins near a triad of aromatic residues (Y/F-x-Y/F/W-x-Y/F) and terminates shortly after a uniformly conserved G-F/Y (Figure 1).
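The boundary pattern described above can be written as a pair of regular expressions. The following is a minimal sketch and not the annotation method used in the paper: the function name `find_candidate_regions` and the length bounds `min_len` and `max_len` are hypothetical, chosen only to bracket the "about 64 amino acids" stated in the text. A pattern scan of this kind is far less sensitive than a profile-based search and would serve only as a rough first filter.

```python
import re

# Aromatic triad Y/F-x-Y/F/W-x-Y/F near the start of the Consensus.
# The lookahead lets overlapping triads each be reported as a start.
TRIAD = re.compile(r"(?=[YF].[YFW].[YF])")

# Conserved G-F/Y near the end of the Consensus.
TERMINAL = re.compile(r"G[FY]")


def find_candidate_regions(seq, min_len=50, max_len=75):
    """Return (start, end) spans that begin at an aromatic triad and
    end at the first G-F/Y found between min_len and max_len residues
    downstream. The bounds are illustrative assumptions, not values
    taken from the paper."""
    seq = seq.upper()
    spans = []
    for triad in TRIAD.finditer(seq):
        start = triad.start()
        # Only look for the terminal motif in the allowed length window.
        window = seq[start + min_len:start + max_len]
        terminal = TERMINAL.search(window)
        if terminal:
            spans.append((start, start + min_len + terminal.end()))
    return spans


if __name__ == "__main__":
    # Synthetic sequence for illustration only, not a real CPR protein.
    demo = "AAAA" + "YAYAF" + "A" * 55 + "GF" + "AAAA"
    for start, end in find_candidate_regions(demo):
        print(start, end, demo[start:end])
```

Each returned span is a half-open `(start, end)` index pair into the input sequence, so `seq[start:end]` begins with the triad and ends with the G-F/Y pair. Because the text says the Consensus terminates "shortly after" the conserved G-F/Y, the true domain end may lie a few residues beyond the reported span.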
