Abstract

The calponin homology (CH) domain is a protein module of about 100 residues that was first identified at the N-terminus of calponin, an actin-binding protein that plays a major regulatory role in muscle contraction. Three major groups of CH-domain-containing proteins have been recognized on the basis of sequence analysis (Gimona et al., 2002). Proteins containing a single N-terminal CH domain (1×CH proteins) include calponin itself as well as signalling proteins such as Vav, IQGAP and Cdc24. Proteins with an F-actin-binding domain (ABD) composed of two CH domains in tandem (2×CH proteins) include spectrins, dystrophin, filamins and plakins. Finally, proteins of the fimbrin/plastin family contain two ABDs in tandem and constitute the 4×CH protein group.

[Fig. 1]

On the basis of the degree of sequence similarity, the CH domains can be divided into several groups. Three major classes are represented by the ABD-forming CH domains (CH1 and CH2) and by the CH3 domains characteristic of most 1×CH proteins. The CH domains of fimbrins form four further groups, which we designate as CHf1, CHf2, CHf3 and CHf4, starting with the N-terminal domain. Using the nomenclature of Gimona et al., they correspond to CH1.1, CH2.1, CH1.2 and CH2.2, respectively (Gimona et al., 2002). Typically, the CH domains that are present as a single copy (CH3) are also functionally distinct from those found in tandem pairs and in 4×CH proteins. Likewise, the N-terminal CH domains in tandem pairs (CH1) differ from the C-terminal domains (CH2) (Banuelos et al., 1998; Gimona et al., 2002). An isolated CH1 domain is able to bind to actin, but a tandem pair of CH domains is required for a fully functional ABD. The globular fold of CH domains is built by four core helices, three of them forming a loose triple-helix bundle, and by one to three short helices present in the loops between the core helices.
Comparison of the structures of the CH domains of fimbrin, spectrin, dystrophin, utrophin and calponin shows overall conservation of the tertiary fold, but reveals substantial differences between them that correlate with the domain classification outlined above (Bramham et al., 2002).

A single CH3 domain was identified in a large number of cytoskeletal and signalling proteins in all phyla, although plants, which have several isoforms of a kinesin-like protein (KLP), are under-represented. The CH domains of 1×CH proteins are represented not only by the CH3 class but also by the CH2 class, in one case by CHf2, and by the novel classes CHc, CHe and CHa found in carnitine palmitoyltransferase (CPT), RP/EB proteins and Abnormal spindle proteins (Asp), respectively (underlined in the poster). CH2 domains are present in a large group of proteins that include smoothelin and smoothelin-like proteins (HsAAH21123, DmAAD55419, CeAAL02497 and Dd02378), proteins containing a LIM and an FB3 domain (MICAL and the related proteins HsKIAA1364, HsKIAA0750 and DmAAK93415), as well as tangerin and the tangerin-related protein HsKIAA0903. None of the CH2 domains was positioned N-terminally within the protein sequence, in contrast to most CH-domain-containing proteins. Two families of microtubule-associated proteins are characterized by a single CH domain. Proteins of the RP/EB family cluster together in the distinct CHe class, whereas proteins of the Asp family possess a CHa domain, which appears to be related to the typical CH1 domain. Interestingly, a single CH domain of the novel class that we designate CHc was identified in a broad family of acyltransferases that include CPT and carnitine acetyltransferase (CAT). It appears that an archetypal CH domain had already diverged early in the evolutionary history of these domains, since representatives of CH3, CHe and CHc proteins can be found in almost all phyla.
Whereas particular domain combinations such as those in EB1, CPT and calponin are widely conserved, others might have been either lost or newly created. In some cases, such as mammalian ARHGEF6, only the CH domain has been acquired, as illustrated by Drosophila and C. elegans orthologues that lack a CH domain.

The ABD probably arose by duplication of an archetypal CH domain and subsequent acquisition of high-affinity actin-binding properties. The ABD in turn underwent duplication at an ancient stage, as illustrated by fimbrin, which is present in all phyla. It follows that proteins with a single ABD have been lost completely in plants and almost completely in fungi: only in S. pombe has an α-actinin-like protein been described.

The diversity of ABD-containing proteins arose through gene duplication events followed by shuffling and intragenic multiplication (Dubreuil, 1991). α-Actinin, which is present in lower eukaryotes, is the prototype that gave rise to spectrin and dystrophin and probably also NUANCE/enaptin and plakins (Leung et al., 2001) in higher eukaryotes by multiplication of spectrin repeats. Dictyostelium filamin is the prototype that gave rise to filamins of animal species by multiplication of filamin repeats. Cortexillins and interaptin appear to be exclusive acquisitions of Dictyostelium, as are calmin and NUANCE/enaptin in mammals.

Parvin/actopaxin constitutes an exception to the 'CH1+CH2' rule for a single ABD. Both CH domains of parvin/actopaxin are more closely related to the CH1 domain than to any other class, and yet they diverge from the CH1 domain of the ABD. Sequence analysis suggests that these proteins arose by a duplication of the CH domain that was independent of that which gave rise to the ABD. This duplication probably took place after the branching of metazoa, because parvins are documented only in animal species.
Parvins are also exceptional in that the CH domains do not appear in combination with other known domains.

The 4×CH class is represented by the fimbrin/plastin family, whose members are characterized by a tandem of two ABDs. Fimbrin is the only case where an ABD is preceded by clearly defined domains, such as EF-hands. Dictyostelium is the only species where a fimbrin-related protein (Dd01313) has been identified; it lacks EF-hands but possesses a coiled-coil segment, a PH domain and a stretch similar to the N-terminal region of talins.

A new combination of CH domains was found in the C. elegans 4×CH protein AAK71387, where the N-terminal CH1+CH2 tandem was followed by two further domains of the CH2 type. These two additional domains probably result from a duplication of the first CH2 domain. Finally, the most intriguing finding was that of the 3×CH proteins. The CH1 and CH2 domains are followed by a highly divergent CH domain in the filamin-related proteins DmAAF46896 and CeAAB37032. No mammalian homologue of the 3×CH proteins has been found so far.

The general role of CH domains is largely unclear. Originally thought to be an actin-binding unit, the CH3 domain fails to interact with F-actin (Fu et al., 2000; Gimona and Mital, 1998), although its structure can be fitted into a calponin-decorated F-actin image reconstruction in a utrophin-like mode (Bramham et al., 2002). The actin-binding property was reported to be an exclusive feature of the CH1 domain and, to a lesser extent, of the CH2 domain of the 'CH1+CH2' proteins (Winder et al., 1995). It is not yet known whether the CHa, CHc, CHe, CH2 and CHf2 domains that occur in single-copy CH proteins retain their actin-binding properties. Various CH domains have also been found to harbour interaction sites for phosphatidylinositol (4,5)-bisphosphate, integrin β4, vimentin, calmodulin, paxillin, integrin-linked kinase (ILK) and the extracellular-signal-regulated kinase (ERK).
However, only ERK has been identified so far as a common ligand for the CH3 domain of calponin and the ABD of α-actinin (Leinweber et al., 1999), which points to a potential role for CH domains as targets in signalling pathways. A common function for the CH domains of different types remains to be uncovered.

For the phylogenetic analysis of CH domains we have considered representative organisms whose genomes have been fully sequenced (Arabidopsis thaliana, Saccharomyces cerevisiae and Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens) or where sequencing is well advanced (Dictyostelium discoideum). We initially screened the protein databases of these organisms for proteins containing CH domains through the SMART server (http://smart.embl-heidelberg.de/). A preliminary classification on the basis of domain structure was generated. For proteins annotated as hypothetical or predicted that could not be classified into any of the known families, the corresponding EST database was screened to verify that their existence was supported by cDNA sequences. The species distribution of each subfamily was investigated by extensive screening of the non-redundant and EST databases using the BLAST and TBLASTN tools. Our catalogue of proteins carrying CH domains is extensive but probably not complete. Less conserved CH domains might not have been identified, and true proteins annotated as hypothetical might have been rejected owing to the absence of supporting cDNA sequences. Primary sequences of the CH domains were aligned using the ClustalW program, and the alignment was manually edited using BioEdit. Usually only one isoform of each protein family in every species was taken for the alignment. Some 2×CH proteins (dystrophin and some members of the plakin family are prominent examples) undergo alternative splicing, yielding isoforms lacking one or both CH domains.
To avoid complexity, splice variants have not been considered in this study, and the longest transcribed variant is depicted in the poster. In very few cases a sequence was rejected when it was too divergent for a reliable alignment; an example is C. elegans dystrophin. The phylogenetic tree was constructed using the neighbour-joining method. For unnamed proteins, sequence names are composed of the species initials and the NCBI accession number. We have retained the domain nomenclature of the SMART server, where information about the domains can be retrieved. Exceptions are DH (Dbl-homology domain, equivalent to RhoGEF), CLR (calponin-like repeats), GSR (Gly-Ser-Arg repeats), TM (transmembrane region) and FB3 (FAD-binding domain type 3). Non-standard abbreviations for some proteins are LRNP (leucine-rich neuronal protein) and LMO (LIM only 7).
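
The neighbour-joining step mentioned above can be illustrated with a minimal sketch. The text does not state which software or distance measure was used for the tree, so the following is only a generic, textbook-style implementation in Python, not a reconstruction of the authors' procedure. The function name `neighbour_joining`, the taxon labels and the small distance matrix are hypothetical and do not represent CH-domain data; in practice the distances would be derived from the edited ClustalW alignment.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Build an unrooted tree from a symmetric distance matrix.

    names: list of leaf labels (hypothetical labels in this sketch).
    dist:  square list-of-lists of pairwise distances, same order as names.
    Returns a list of (node, node, branch_length) edges.
    """
    # Store pairwise distances keyed by unordered node pairs.
    d = {}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                d[frozenset((a, b))] = float(dist[i][j])

    def get(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of its distances to all others.
        r = {a: sum(get(a, b) for b in nodes) for a in nodes}
        # Join the pair with the smallest Q value (first pair wins on ties).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]],
        )
        dab = get(a, b)
        # Branch lengths from the joined pair to their new internal node.
        len_a = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        len_b = dab - len_a
        new = f"node{next_id}"
        next_id += 1
        edges.append((a, new, len_a))
        edges.append((b, new, len_b))
        # Distances from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (get(a, c) + get(b, c) - dab)
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, get(a, b)))
    return edges


# Toy example: five hypothetical sequences with additive distances.
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for u, v, length in neighbour_joining(toy_names, toy_dist):
    print(f"{u} - {v}: {length}")
```

Each iteration joins the pair of nodes that minimises the Q criterion, replaces them with a new internal node and recomputes distances to that node, so the result is an unrooted tree with branch lengths. Ties in Q are resolved here by taking the first pair encountered, which is an arbitrary choice of this sketch.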
