Abstract

The cloning and sequencing of genes encoding proteins of known structure allows the positions of intervening sequences (introns) to be compared with the structural features of the proteins. In some instances, the intron positions seem to demarcate structural and/or functional units, but in others there is no obvious relationship between protein and gene structure. Comparison with tertiary structures indicates that the intron positions in the gene map to the exterior rather than the interior of the protein. This localization is confirmed by a solvent-accessible surface analysis, which shows that the intron junctions are almost always located at the protein-solvent interface. Although this surface rule is based on observations of a few relatively small globular proteins of known tertiary structure, it is further supported by the observation that the consensus sequences at the 5′ (but not the 3′) intron/exon splice junctions predominantly code for hydrophilic, surface-type amino acids. These results suggest that intron placement may have played a role in protein evolution. For example, intron junctions may be sites of hypervariability and hence of structural change. If so, such junctions would be better accommodated on the surface than in the interior, where changes would likely distort or destroy the architecture of the molecule. Consistent with these ideas, certain evolutionary variations in the serine protease family appear to correlate with intron location. Experimental studies on the expression of the human insulin gene in heterologous mammalian cells using SV40 viral vectors show that normal splicing can occur, but that an alternative splice site, located completely within the gene, is used in certain SV40 constructions. The 3′-extragenic region appears to be the controlling factor for recruitment of the alternative site. The transcript produced from the alternative site codes for a protein that is identical to insulin at the N terminus but completely different in the C-terminal region because of a frameshift caused by the new splice. This result may serve as a model illustrating how variations in gene splicing can generate genetic diversity, as seen in the serine proteases.

