Abstract

Halogenases create diverse natural products by utilizing halide ions and are of great interest in the synthesis of potential pharmaceuticals and agrochemicals. An increasing number of halogenases discovered in microorganisms are annotated as flavin-dependent halogenases (FDHs), but their chemical reactivities differ markedly, and the genomic content associated with this functional distinction has not yet been revealed. Even though the reactivity and regioselectivity of FDHs are essential to their halogenation activity, these FDHs are annotated inaccurately in protein sequence repositories, without characterization of their functional activities. We carried out a comprehensive sequence analysis and biochemical characterization of FDHs. Using a probabilistic model that we built in this study, we discovered FDHs in 2,787 bacterial genomes and 17 sediment metagenomes. We analyzed the essential genomic determinants that are responsible for substrate binding and subsequent reactions: four flavin adenine dinucleotide-binding, one halide-binding, and four tryptophan-binding sites. Compared with previous studies, our study utilizes large-scale genomic information to propose a comprehensive set of sequence motifs that are related to the active sites and regioselectivity. We reveal that the genomic patterns and phylogenetic locations of the FDHs determine their enzymatic reactivities, a finding that was experimentally validated in terms of substrate scope and regioselectivity. A large portion of publicly available FDHs needs to be reevaluated to designate their correct functions. Our genomic models establish comprehensive links among genotypic information, reactivity, and regioselectivity of FDHs, thereby laying an important foundation for future discovery and classification of novel FDHs.

IMPORTANCE Halogenases play an important role as tailoring enzymes in biosynthetic pathways. Flavin-dependent tryptophan halogenases (Trp-FDHs) are among the enzymes that have broad substrate scope and high selectivity. From bacterial genomes and metagenomes, we found highly diverse halogenase sequences by using a well-trained profile hidden Markov model built from experimentally validated halogenases. The characterization of genotype, steady-state activity, substrate scope, and regioselectivity has established comprehensive links between the information encoded in the genomic sequence and the reactivity of the FDHs reported here. By constructing models for accurate and detailed sequence markers, our work should guide future discovery and classification of novel FDHs.
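
As a rough illustration of what screening sequences for conserved markers can look like, the sketch below scans a protein sequence for two motifs that are widely reported for FDHs in the literature, GxGxxG (nucleotide binding) and WxWxIP. These two patterns are an assumption brought in for illustration and are not taken from this abstract, and the function names and the toy sequence are hypothetical. This is a plain regular-expression screen, far simpler than the profile hidden Markov model described above, and it is not the authors' method.

```python
import re

# Assumed motif patterns ("x" = any residue), expressed as regular expressions.
# They are commonly cited for flavin-dependent halogenases; they are not from this abstract.
MOTIFS = {
    "GxGxxG": r"G.G..G",
    "WxWxIP": r"W.W.IP",
}


def find_motifs(seq):
    """Return {motif_name: [0-based start positions]} for every motif in MOTIFS."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


def has_fdh_motifs(seq):
    """True only if every motif in MOTIFS occurs at least once in seq."""
    return all(find_motifs(seq).values())


# Hypothetical toy sequence, made up for this example (not a real protein).
toy = "MKIVIVGGGTAGWMAAKLWSWPIPDE"
print(find_motifs(toy))      # {'GxGxxG': [6], 'WxWxIP': [18]}
print(has_fdh_motifs(toy))   # True
```

A screen like this can only flag candidates. It says nothing about substrate scope or regioselectivity, which the study links to a larger set of binding-site motifs and to phylogenetic placement.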

