Molecular evolution analysis typically involves identifying selection pressure and reconstructing evolutionary trends. This process usually requires access to specific data on a target gene or gene family within a particular group of organisms. Although recent advances in high-throughput sequencing have led to the rapid accumulation of extensive genomic and transcriptomic data and to the creation of new databases in public repositories, extracting valuable insights from such vast data sets remains a significant challenge for researchers. Here, we elucidate the evolutionary history of THI1, the gene encoding thiamine thiazole synthase. The thiazole ring is a precursor of vitamin B1, a crucial cofactor in primary metabolic pathways. A thorough search of complete genomes available in public repositories reveals 702 THI1 homologs in Archaea and Eukarya. Throughout its diversification, the plant lineage has preserved the THI1 gene by incorporating the N-terminus and targeting the chloroplasts. Likewise, evolutionary pressures and lifestyle appear to be associated with the retention of TPP riboswitch sites and the consequent dual posttranscriptional regulation of the de novo biosynthesis pathway in basal groups. Multicopy retention of THI1 is not a typical pattern in plants, even after successive genome duplications. Examination of cis-regulatory sites uncovers two motifs shared across all plant lineages. Data mining of 484 transcriptome data sets supports THI1 homolog expression in response to the light/dark cycle and in a tissue-specific pattern. Finally, this work offers a new look at public repositories as an opportunity to explore evolutionary trends in THI1.