Abstract

The sequence databases now contain several particularly interesting DNA polymerase sequences for which functions have not yet been assigned. These sequences are present and transcribed in a wide range of eukaryotes. They are strongly similar in their polymerase domains to the familiar DNA polymerase I (PolA family) of the bacterial kingdom, and are only weakly similar to Pol-γ, the enzyme responsible for mitochondrial DNA replication. Two distinct human sequences are so far available: first, a cDNA of a class that is also represented by Caenorhabditis elegans (nematode) genomic and cDNA sequences; second, an intriguing set of rather cryptic short exon sequences interspersed in the 4p16.3 (Huntington's disease) region [1] of the genome. Partial cDNAs, encoding parts of similar DNA polymerase domains, are also available from maize, Zea mays, and the malarial parasite Plasmodium falciparum. Similar sequences are notably absent, however, from the complete genome sequence of the yeast Saccharomyces cerevisiae.

Different classes of DNA-dependent DNA polymerases (EC 2.7.7.7) have well-established roles in DNA replication, repair, mutagenesis and recombination, and share structural similarity in their polymerase domains. The new sequences analysed here clearly belong to family A, as classified by polymerase domain sequence homology [2,3]. Family A polymerases have functions in: DNA repair, and RNA primer removal during lagging strand replication in bacteria (the PolA enzymes); DNA replication in eukaryotic mitochondria (the nuclear-encoded Pol-γ group); and viral DNA replication in several bacteriophages [4,5]. All of these have 3′→5′ and/or 5′→3′ exonuclease domains on the amino-terminal side of the polymerase domain (Figure 1).

Figure 1. The DNA polymerase domain in family A occurs in combination with various other domains. DNA polymerase domains are colored blue, 3′→5′ exonuclease domains orange, and 5′→3′ exonuclease domains green. In Thermus, an inactive ‘remnant’ of the 3′→5′ exonuclease domain is present [12] (dotted box). Swissprot accessions: E. coli PolA: P00582; Human Pol-γ: P54098; Thermus aquaticus PolA: P19821; Bacteriophage T5: P19822. The amino-terminal portion of W03A3.2 lacks clear similarity to other sequences and contains two low-complexity regions (pink circles), which potentially divide it into three globular subdomains.

The most complete example of the new polymerase sequences is a conceptual translation from the C. elegans genome (coding region W03A3.2, Genbank U50184). This contains, in addition to the carboxy-terminal polymerase domain, a long amino-terminal region that does not contain any recognizable sequence similarity to known exonucleases. Six independent cDNA clones are also available from C. elegans, covering both the amino-terminal and polymerase regions. No expressed sequence tags (ESTs) from other organisms match the amino-terminal region. The two human genes of this family are distantly related to each other. One is a partial cDNA (dbEST accession W00829), and is more similar in sequence to the C. elegans homolog than to the other human gene. The second human gene has been partially assembled conceptually by sequence similarity from 13 short exons interspersed over approximately 130 kilobases of a 2 megabase contig from the Huntington's disease region of 4p16.3.

Figure 2 shows the multiple alignment of these sequences and other DNA polymerase family A representatives, including the new Z. mays and P. falciparum partial cDNAs. All these sequences contain the characteristic conserved residue patterns found in active DNA polymerase domains [6], and show a much greater similarity to PolA than does Pol-γ. Phylogenetic analysis indicates that both C. elegans W03A3.2 and the human 4p16.3 sequences are placed among the deeper branches of the PolA and bacteriophage sequences (not shown). Pairwise alignments using these eukaryotic sequences show only very remote homology to the mitochondrial Pol-γ group. The question of the evolutionary origins of these apparently ancient DNA polymerase lineages is completely open.

Figure 2. Alignment of the DNA polymerase domains found in C. elegans W03A3.2, the human 4p16.3 gene, E. coli DNA polymerase I (Swissprot: DP01_ECOLI/P00582), S. cerevisiae Pol-γ (Swissprot: DPOG_YEAST/P15801) and ESTs from human, maize and Plasmodium. The EST names indicate Genbank accession numbers. The two dashed segments in the 4p16.3 gene indicate exons which are presumed necessary, but for which no likely candidates were found based on sequence similarity. X denotes uncertain exon boundaries. The dash in T23354 indicates a frameshift relative to the EST sequence. The multiple alignment was constructed manually based on BLAST2 [13] and TBLASTN [14] alignments. The probability of observing the similarity to PolA by chance, computed using the BLASTP program [14], was <10⁻⁵⁰ for W03A3.2 and the 4p16.3 gene, and <10⁻⁶ for each of the EST translations, when searched against the NCBI NR database. The exon structure of W03A3.2 in Genbank was altered according to EST evidence.

Recent bacterial contamination and recent transfer from bacterial or bacteriophage sources are evidently ruled out by the presence of these genes in diverse eukaryotes and by their characteristic exon–intron genomic structures. Also, analysis by the Zinfo program [7] showed that the compositional statistics of the W03A3.2 coding sequence are typical for C. elegans but atypical for E. coli. The phylogeny and distribution, including the occurrence of at least two distantly related human paralogs, could be consistent with multiple ancient parallel transfer events as well as with a single ancient transfer, perhaps through the mitochondrial line, followed by more recent gene duplications. Some lineages, including Pol-γ, may have undergone phases of rapid evolution, together with loss of exonuclease domains and gain of other domains. Presumably, given the wide distribution in protist, plant and metazoan organisms, loss of these polymerase genes occurred in the ancestry of S. cerevisiae.

What are the possible functions of these DNA polymerases? Experimental verification of polymerase activity is the next step, but in the interim it is tempting to consider DNA repair. It may be pertinent to investigate the repair of damage to mitochondrial DNA, which is poorly understood but which may have roles in the progression of cancer, diabetes and other chronic diseases [8–10]. A repair function is also consistent with the absence of these sequences in S. cerevisiae, as there are well-established differences in repair between this yeast and mammalian cells [11].

References

1. Baxendale S, MacDonald ME, Mott R, Francis F, Lin C, Kirby SF, et al. A cosmid contig and high resolution restriction map of the 2 megabase region containing the Huntington's disease gene. Nat Genet. 1993; 4: 181–186.
2. Ito J, Braithwaite DK. Compilation and alignment of DNA polymerase sequences. Nucleic Acids Res. 1991; 19: 4045–4057.
3. Braithwaite DK, Ito J. Compilation, alignment, and phylogenetic relationships of DNA polymerases. Nucleic Acids Res. 1993; 21: 787–802.
4. Kornberg A, Baker TA.
5. Ye F, Carrodeguas JA, Bogenhagen DF. The γ subfamily of DNA polymerases: cloning of a developmentally regulated cDNA encoding Xenopus laevis mitochondrial DNA polymerase γ. Nucleic Acids Res. 1996; 24: 1481–1488.
6. Blanco L, Bernad A, Blasco MA, Salas M. A general structure for DNA-dependent DNA polymerases. Gene. 1991; 100: 27–38.
7. Scherer S, McPeek MS, Speed TP. Atypical regions in large genomic DNA sequences. Proc Natl Acad Sci USA. 1994; 91: 7134–7138.
8. Ames BN. Endogenous DNA damage as related to cancer and aging. Mutat Res. 1989; 214: 41–46.
9. Shen CC, Wertelecki W, Driggers WJ, LeDoux SP, Wilson GL. Repair of mitochondrial DNA damage induced by bleomycin in human cells. Mutat Res. 1995; 337: 19–23.
10. Driggers WJ, Grishko VI, LeDoux SP, Wilson GL. Defective repair of oxidative damage in the mitochondrial DNA of a xeroderma pigmentosum group A cell line. Cancer Res. 1996; 56: 1262–1266.
11. Lindahl T. Recognition and processing of damaged DNA. J Cell Sci. 1995; 19 (Suppl): 73–77.
12. Kim Y, Eom SH, Wang J, Lee DS, Suh SW, Steitz TA. Crystal structure of Thermus aquaticus DNA polymerase. Nature. 1995; 376: 612–616.
13. Altschul SF, Gish W. Local alignment statistics. Methods Enzymol. 1996; 266: 460–480.
14. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990; 215: 403–410.
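
Note on the chance probabilities quoted in the Figure 2 legend: the sketch below shows how a BLAST-style chance probability is commonly related to an alignment score under Karlin-Altschul statistics. It is an illustration only and not necessarily the exact calculation behind the values in the legend. The function names and the parameters `lam` and `k` are hypothetical inputs chosen for this example, not values taken from the article.

```python
import math


def karlin_altschul_evalue(score, m, n, lam, k):
    """Expected number of chance local alignments scoring >= score.

    m and n are the query and database lengths; lam and k are the
    scoring-system parameters (hypothetical inputs in this sketch,
    not values reported in the article).
    """
    return k * m * n * math.exp(-lam * score)


def evalue_to_pvalue(evalue):
    """Probability of at least one chance hit, P = 1 - exp(-E).

    expm1 keeps precision when E is tiny, where P is close to E.
    """
    return -math.expm1(-evalue)
```

For very small expectations, such as those implied by thresholds like 10⁻⁶ or 10⁻⁵⁰, P and E are numerically almost identical, so either can serve as the reported significance.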
