Articles published on Gene Structure Prediction
- Research Article
- 10.1007/s10822-025-00656-7
- Sep 3, 2025
- Journal of computer-aided molecular design
- C K V Ramesan + 3 more
The emergence of beta-lactamase producing multidrug-resistant (MDR) gram-negative bacteria presents a significant challenge to effective treatment of infections. This study focuses on the isolation, amplification, and molecular characterization of β-lactamase genes from clinical strains of Escherichia coli and Klebsiella pneumoniae. Seven new partial gene sequences, including novel variants of blaOXA and blaNDM, were identified after screening 108 clinical samples and submitted to NCBI GenBank. In silico analysis revealed considerable diversity and distribution of these resistance genes among different strains of bacteria. Gene structure predictions using GENSCAN showed that blaOXA genes typically contain single exons with moderate GC content, whereas blaNDM genes feature longer exons with higher GC content. Multiple sequence alignment showed that NDM and OXA β-lactamases were highly similar, with only slight differences in a few amino acids. The study also analyzed the physico-chemical properties, functional domains, and phosphorylation patterns of the β-lactamase proteins. Secondary structure prediction indicated a dominance of beta sheets, contributing to protein stability, while tertiary modeling provided insights into their 3D structure. Overall, these findings provide critical insights into the genetic diversity and potential mechanisms of β-lactamase-mediated resistance, offering valuable information for the development of novel therapeutic strategies and surveillance programs.
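The GC-content comparison mentioned in this abstract can be illustrated with a minimal sketch. The function name `gc_content` and the example sequence are hypothetical and are not taken from the study; GENSCAN itself is not used here.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not dilute the ratio.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


# Hypothetical 12-base sequence, for illustration only.
example = "ATGCGCGGCTTA"
print(f"GC content: {gc_content(example):.3f}")
```

Applied to each predicted exon, a value like this would allow the kind of comparison between gene families that the abstract describes.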
- Research Article
- 10.1093/bib/bbaf377
- Jul 2, 2025
- Briefings in bioinformatics
- Xiaomei Li + 6 more
Genome annotation is essential for understanding the functional elements within genomes. While automated methods are indispensable for processing large-scale genomic data, they often face challenges in accurately predicting gene structures and functions. Consequently, manual curation by domain experts remains crucial for validating and refining these predictions. These combined outcomes from automated tools and manual curation highlight the importance of integrating human expertise with artificial intelligence (AI) capabilities to improve both the accuracy and efficiency of genome annotation. However, the manual curation process is inherently labor-intensive and time-consuming, making it difficult to scale for large datasets. To address these challenges, we propose a conceptual framework, Human-AI Collaborative Genome Annotation (HAICoGA), that leverages the synergistic partnership between humans and AI to enhance human capabilities and accelerate the genome annotation process. Additionally, we explore the potential of integrating large language models into this framework to support and augment specific tasks. Finally, we discuss emerging challenges and outline open research questions to guide further exploration in this area.
- Research Article
- 10.9734/jabb/2025/v28i62519
- Jun 26, 2025
- Journal of Advances in Biology & Biotechnology
- Thalari Vasanthrao + 6 more
Aim: The present study focused on a comprehensive in silico investigation to identify and characterize the fatty acid desaturase gene family in Pennisetum glaucum L. (pearl millet). Fatty Acid Desaturases (FADs) are essential enzymes in plants that introduce double bonds into fatty acid chains and play a key role in producing unsaturated fatty acids. These key enzymes, identified and characterized through the Kennedy pathway, play a major role in fatty acid metabolism. They significantly contribute to plant membrane fluidity, stress tolerance and oil quality. Methodology: The NovoGene Millet database was used to identify the FAD gene family; the gene, promoter, complementary DNA (cDNA) and corresponding amino acid sequences were retrieved from the database. Chromosomal distribution was predicted through mapping tools in the millet database, and synteny maps were developed with the MCScanX option in TBtools. Gene Structure Display Server 2.0 software was used to predict gene structures, and the PlantCARE database was employed to identify the cis-regulatory elements in the putative promoter sequences of the respective FAD genes. Results: A total of 22 PgFAD genes were identified and mapped across seven chromosomes of pearl millet, indicating their uneven distribution and suggesting potential duplication events. The gene structure of PgFADs was represented by various introns and exons. Cis-regulatory element analysis of the 1 kb upstream promoter regions of all PgFADs highlighted the presence of various stress- and phytohormone-responsive elements such as STRE, WRE3, ABRE, MYB, ERE and WUN motifs, indicating the potential involvement of PgFAD genes in stress and developmental regulation in plants. Synteny analysis with different cereal species such as rice (Oryza sativa), foxtail millet (Setaria italica), finger millet (Eleusine coracana), maize (Zea mays) and sorghum (Sorghum bicolor) revealed conserved evolutionary relationships and potential gene duplication events.
Conclusion: This study provides valuable insights into the molecular characteristics and evolutionary relationships of FAD genes in pearl millet and lays the foundation for future functional studies and crop improvement strategies that target fatty acid metabolism.
- Research Article
- 10.3389/fpls.2025.1603268
- Jun 13, 2025
- Frontiers in Plant Science
- Yihan Wang + 3 more
Panax ginseng C. A. Meyer (ginseng) is one of the most widely used traditional Chinese medicinal herbs, with its roots as the primary medicinal part garnering significant attention due to their therapeutic potential. The GRAS [GAI (Gibberellic Acid Insensitive), RGA (Repressor of GAI-3 mutant), and SCR (Scarecrow)] genes are a class of widely distributed plant-specific transcription factors that play crucial roles in various physiological processes including root formation, fruit development, hormone signaling, and stem cell maintenance. This study systematically identified 139 GRAS genes (PgGRAS) in the ginseng genome for the first time, analyzing their complexity and diversity through protein domain structure, phylogenetic relationships, gene structure, and cis-acting element prediction. Evolutionary analysis revealed that all PgGRAS members were divided into 14 evolutionary branches, including a novel species-specific subfamily PG28, with segmental duplication being the primary driver of family expansion. RNA-seq analysis uncovered tissue-specific expression patterns of the PgGRAS gene family. qRT-PCR validation demonstrated that PgGRAS48, a member of the SCL3 subfamily, was significantly highly expressed in the main root and upregulated upon GA treatment, suggesting its potential regulatory role in main root development. Therefore, this gene was selected for further investigation. Overexpression of PgGRAS48 significantly increased the main root length in Arabidopsis thaliana (A. thaliana), accompanied by elevated endogenous GA levels. Subcellular localization, molecular docking, Bimolecular Fluorescence Complementation (BiFC) and yeast two-hybrid (Y2H) experiments confirmed the interaction between PgGRAS48 (SCL3) and PgGRAS2 (DELLA) in the nucleus, revealing the molecular mechanism by which SCL3-DELLA regulates main root elongation through gibberellin (GA) biosynthesis or signaling pathways.
This study elucidates the molecular network of the GRAS family in root development in ginseng, providing key targets for the targeted improvement of root architecture in medicinal plants.
- Research Article
- 10.1111/mpp.70098
- Jun 1, 2025
- Molecular Plant Pathology
- Xizhen Yue + 17 more
Colletotrichum gloeosporioides is a major agricultural pathogen of crops that has also been identified as an endophyte of the medicinal plant Huperzia serrata. Both H. serrata and C. gloeosporioides produce huperzine A, a potential treatment for Alzheimer's disease. In this study, a nonpathogenic C. gloeosporioides strain (NWUHS001) was isolated and its genome sequenced. Gene structure prediction identified 15,413 protein‐coding genes and 879 noncoding RNAs. Through PHI‐base database prediction, we found that NWUHS001 lacks two key pathogenicity genes, CgDN3 and cap20, which may be the cause of its nonpathogenicity. Comparative genomic analysis showed that the number of genes encoding pectin lyase B (pelB), pectin lyase (pnl) and polygalacturonase (pg) in NWUHS001 was significantly lower than that in pathogenic strains during the expansion of mycelium into host tissues. This caused slow growth and an inability to penetrate host cells. In contrast, in NWUHS001, genes involved in carbon acquisition such as ribose and amino sugar metabolic pathways were enriched, indicating active metabolite exchange with the host. In addition, by comparing the genome of NWUHS001 with that of the host H. serrata, we found that polyketosynthetase (pksIII), a key gene in the host huperzine A biosynthetic pathway, may have been acquired from the fungus by horizontal gene transfer (HGT). This study explained the possible genetic evolution mechanism of C. gloeosporioides from pathogenicity to nonpathogenicity, which is of value for studying the interaction between microorganisms and plants. It also provided clues to the genetic evolution of the biosynthetic pathway of huperzine A.
- Research Article
- 10.3390/ijms26104932
- May 21, 2025
- International journal of molecular sciences
- Meiling Ming + 7 more
SQUAMOSA promoter-binding protein-like (SPL) transcription factors specific to plants are vital for regulating growth, development, secondary metabolite biosynthesis, and responses to both biotic and abiotic stresses. Despite their importance, no systematic investigations or identifications of the SPL gene family in Ginkgo biloba have been conducted. In this study, we identified 13 SPL genes within the Ginkgo biloba reference genome, spanning seven chromosomes, and categorized these genes into six groups based on their phylogenetic relationships with Arabidopsis thaliana SPL gene families. Our analysis of gene structure, conserved domains, motifs, and miR156 target predictions indicates that GbSPLs are highly conserved across evolutionary timelines. Furthermore, synteny analysis highlighted that dispersed duplication events have expanded the SPL gene family in Ginkgo biloba. Examination of the cis-regulatory elements revealed that many GbSPL genes possess motifs associated with light, hormones, and stress, implying their involvement in flavonoid biosynthesis and adaptation to environmental conditions. RNA-Seq and qRT-PCR expression profiles of GbSPL genes across various tissues and low- and high-flavonoid leaves and during both short-term and long-term water stress illustrated their roles in flavonoid biosynthesis and responses to water stress. Subcellular localization experiments showed that GbSPL2 and GbSPL11 proteins are situated within the nucleus. Our research offers the first systematic characterization of the SPL gene family in Ginkgo biloba, establishing a valuable foundation for understanding their evolutionary background and functional roles in flavonoid biosynthesis and water stress response.
- Research Article
- 10.71423/aimed.20250102
- Jan 25, 2025
- AI Med
- Tong Wang + 5 more
In recent years, advancements in gene structure prediction have been significantly driven by the integration of deep learning technologies into bioinformatics. With the transition from traditional thermodynamics and comparative genomics methods to modern deep learning-based models such as CDSBERT, DNABERT, RNA-FM, and PlantRNA-FM, prediction accuracy and generalization have seen remarkable improvements. These models, leveraging genome sequence data along with secondary and tertiary structure information, have facilitated diverse applications in studying gene functions across animals, plants, and humans. They also hold substantial potential for applications in early disease diagnosis, personalized treatment, and genomic evolution research. This review combines traditional gene structure prediction methods with advancements in deep learning, showcasing applications in functional region annotation, protein-RNA interactions, and cross-species genome analysis. It highlights their contributions to animal, plant, and human disease research while exploring future opportunities in cancer mutation prediction, RNA vaccine design, and CRISPR gene editing optimization. The review also emphasizes future directions, such as model refinement, multimodal integration, and global collaboration. By offering a concise overview and forward-looking insights, this article aims to provide a foundational resource and practical guidance for advancing nucleic acid structure prediction research.
- Research Article
- 10.3389/fpls.2024.1521758
- Jan 16, 2025
- Frontiers in plant science
- Priyanka Kumari + 9 more
The methylation-demethylation dynamics of RNA play major roles in different biological functions, including stress responses, in plants. m6A methylation in RNA is orchestrated by the coordinated function of methyltransferases (writers) and demethylases (erasers). Genome-wide analysis of genes involved in methylation and demethylation was performed in pigeon pea. A BLAST search using Arabidopsis gene sequences resulted in the identification of two methylation genes (CcMTA70, CcMTB70), two genes encoding adaptor proteins for methylation (CcFIPA and CcFIPB) and 10 demethylase (ALKBH) genes (CcALKBH1A, CcALKBH1B, CcALKBH1C, CcALKBH2, CcALKBH8, CcALKBH8A, CcALKBH8B, CcALKBH9, CcALKBH10A and CcALKBH10B) in the pigeon pea genome. The identified genes were analyzed for phylogenetic relationship, chromosomal position, gene structure, conserved motifs, domains and predicted subcellular location. These structural analyses resulted in categorization of MTs and FIPs into one group each, i.e., CcMTA/B and CcFIPA/B, respectively; and ALKBHs into four groups, viz. CcALKBH1/2, CcALKBH8, CcALKBH9 and CcALKBH10. Relative expression analysis of the identified genes in various tissues at different developmental stages revealed the highest level of expression in leaf and the lowest in root. CcMTs and CcFIPs had similar patterns of expression, and CcALKBH10B demonstrated the highest and CcALKBH2 the lowest level of expression in all the tissues analyzed. CcALKBH8 showed the highest induction in expression upon exposure to heat stress, and CcALKBH10B demonstrated the highest level of induction in expression during drought, salt and biotic (Helicoverpa armigera infestation) stresses. The present study would pave the way for detailed molecular characterization of m6A methylation in pigeon pea and its involvement in stress regulation.
- Research Article
- 10.1186/s12863-024-01285-z
- Dec 18, 2024
- BMC Genomic Data
- Pollob Shing + 6 more
Background: Gossypium raimondii serves as a widely used genomic model cotton species. Its genetic influence to enhance fiber quality and ability to adapt to challenging environments both contribute to increasing cotton production. The formins are a large protein family that predominantly consists of FH1 and FH2 domains. The presence of the formin domains highly regulates the actin and microtubule filaments in cytoskeleton dynamics when confronting various abiotic stresses such as drought, salinity, and cold temperatures. Results: In this study, 26 formin genes were analyzed and characterized in G. raimondii, and most were found in the nucleus and chloroplast. According to the evolutionary phylogenetic relationship, GrFH were dispersed and classified into seven different groups and shared an ancestry relationship with MtFH. The GrFH gene structure prediction revealed diverse intron-exon arrangements between groups. The FH2 conserved domain was found in all the GrFH, which were distributed on 12 different chromosomes. Moreover, 11 pairs of GrFH underwent segmental duplication. Among them, GrFH4-GrFH7 evolved 35 million years ago (MYA) according to the evolutionary divergence time. In addition, 57 cis-acting regulatory element (CARE) motifs were found to play a potential role in plant growth, development, and response to various abiotic stresses, including cold stress. The GrFH genes were mostly associated with biological processes resulting in the regulation of actin polymerization. ERF, GATA, MYB, and LBD, the major transcription factor (TF) families in GrFH, regulated expression in abiotic stress, specifically salt, as well as defense against certain pathogens. The microRNA of GrFH unveiled the regulatory mechanism that regulates their gene expression in abiotic stresses such as salt and cold. One of the most economic aspects of cotton (G. raimondii) is the production of lint due to its use in manufacturing fabrics and other industrial applications. The expression profiles of GrFH in different tissues, particularly during the conversion from ovule to fiber (lint), and the increased levels (up-regulation) of GrFH4, GrFH6, GrFH12, GrFH14, and GrFH26 under cold conditions, along with GrFH19 and GrFH26 in response to salt stress, indicated their potential involvement in combating these environmental challenges. Moreover, these stress-tolerant GrFH linked to cytoskeleton dynamics are essential in producing high-quality lint.
Conclusions: The findings from this study can contribute to elucidating the evolutionary and functional characterizations of formin genes and deciphering their potential role in abiotic stress such as cold and salt as well as in the future implications in wet lab.
- Research Article
- 10.36347/sjet.2024.v12i07.005
- Jul 25, 2024
- Scholars Journal of Engineering and Technology
- Wang Wenyi + 2 more
Objective: To analyze the structure and properties of Klebsiella pneumoniae and its encoded proteins. Methods: The gene of Klebsiella pneumoniae and the sequence and structure of its encoded proteins were analyzed and predicted using various information analysis tools from NCBI, ExPASy and other websites. The analyses covered the gene sequence (homology analysis, multiple sequence comparison, conserved region analysis, gene structure prediction, gene annotation, enzyme cleavage site analysis, primer design, six-frame translation, etc.), protein sequence analysis and structure prediction (primary structure analysis, subcellular localization, signal peptide, transmembrane information, secondary structure prediction, three-dimensional structural homology modeling, etc.), and molecular phylogenetic analysis (constructing a phylogenetic tree). Conclusion: We successfully analyzed and predicted the sequence and structure of Klebsiella pneumoniae and its encoded proteins, with a view to providing a reference for the in-depth study of the biological properties of SapC, the ABC transporter permease of Klebsiella pneumoniae, the establishment of a rapid detection method for the bacterium, and the selection of targets for subunit and nucleic acid vaccines, and to laying a foundation for the further understanding and utilization of this gene.
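Of the analyses listed in this abstract, six-frame translation is simple enough to sketch directly. The following is a minimal illustration and not the web tools the authors used; the function names and the example sequence are hypothetical. It assumes the standard genetic code, whose amino acid assignments are the same in the bacterial table (only the permitted start codons differ).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate three forward (+1..+3) and three reverse (-1..-3) frames."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical 9-base sequence, for illustration only.
for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```

Stop codons appear as `*`, so open reading frames can be read off as the stretches between them in each frame.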
- Research Article
- 10.7717/peerj.17737
- Jul 17, 2024
- PeerJ
- Andrés G López-Virgen + 8 more
Mango is a popular tropical fruit that requires quarantine hot water treatment (QHWT) for postharvest sanitation, which can cause abiotic stress. Plants have various defense mechanisms to cope with stress; miRNAs mainly regulate the expression of these defense responses. Proteins involved in the biogenesis of miRNAs include DICER-like (DCL), ARGONAUTE (AGO), HYPONASTIC LEAVES 1 (HYL1), SERRATE (SE), HUA ENHANCER1 (HEN1), HASTY (HST), and HEAT-SHOCK PROTEIN 90 (HSP90), among others. According to our analysis, the mango genome contains five DCL, thirteen AGO, six HYL, two SE, one HEN1, one HST, and five putative HSP90 genes. Gene structure prediction and domain identification indicate that sequences contain key domains for their respective gene families, including the RNase III domain in DCL and PAZ and PIWI domains for AGOs. In addition, phylogenetic analysis indicates the formation of clades that include the mango sequences and their respective orthologs in other flowering plant species, supporting the idea these are functional orthologs. The analysis of cis-regulatory elements of these genes allowed the identification of MYB, ABRE, GARE, MYC, and MeJA-responsive elements involved in stress responses. Gene expression analysis showed that most genes are induced between 3 to 6 h after QHWT, supporting the early role of miRNAs in stress response. Interestingly, our results suggest that mango rapidly induces the production of miRNAs after heat stress. This research will enable us to investigate further the regulation of gene expression and its effects on commercially cultivated fruits, such as mango, while maintaining sanitary standards.
- Research Article
- 10.1038/s41597-024-03549-w
- Jul 13, 2024
- Scientific Data
- Xingyu Yang + 11 more
Fructus hippophae (Hippophae rhamnoides spp. mongolica × Hippophae rhamnoides sinensis), a hybrid variety of sea buckthorn in which Hippophae rhamnoides spp. mongolica serves as the female parent and Hippophae rhamnoides sinensis serves as the male parent, is a traditional plant with great potential economic and medical value. Herein, we obtained a chromosome-level genome of Fructus hippophae of about 918.59 Mb, with a scaffold N50 of 83.65 Mb. Then, we anchored 440 contigs, comprising 97.17% of the total genome sequence, onto 12 pseudochromosomes. Next, de novo, homology and transcriptome assembly strategies were adopted for gene structure prediction. This predicted 36475 protein-coding genes, of which 36226 could be functionally annotated. Various strategies were also used for quality assessment; both the complete BUSCO value (98.80%) and the mapping rate indicated high assembly quality. Repetitive elements, which occupied 63.68% of the genome, and 1483600 bp of non-coding RNA were annotated. Here, we provide genomic information on female plants of a popular variety, which can provide data for pan-genomic construction of sea buckthorn and for the resolution of the mechanism of sex differentiation.
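The scaffold N50 reported in this abstract is a standard assembly statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. A minimal sketch of that definition follows; the function name `n50` and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
print(n50([80, 70, 50, 40, 30, 20]))
```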
- Research Article
- 10.1111/aab.12926
- Jun 30, 2024
- Annals of Applied Biology
- Jhumishree Meher + 2 more
The WRKY transcription factor (TF) family is well known to govern essential physiological functions as well as to regulate plant responses to biotic and abiotic stress. In this study, we identified 108 OsWRKY genes in the genome of Oryza sativa subsp. japonica, using the updated genomic data from the Rice Annotation Project Database and Oryzabase, which were further used to conduct the phylogenetic study, motif analysis, gene structure analysis, chromosomal mapping, and prediction of sub‐cellular localization. The multiple sequence alignment of OsWRKY proteins revealed the presence of nine different types of alterations in the conserved heptapeptide sequence WRKYGQK, associated with 19 OsWRKY genes. Physicochemical analysis revealed the hydrophobic amino acid‐rich, thermally stable, and polar nature of OsWRKY proteins. These genes were noted as highly conserved between the two cultivated sub‐species of Oryza sativa, that is, the indica and japonica types. Additionally, from motif analysis, we found a new motif, which was categorized as a hAT family C‐terminal dimerization region, associated with four members of group IIc. We identified 21 stress‐responsive OsWRKY genes, and their significance to the different biotic and abiotic stress‐mediated cascades was further evaluated by analysing 1500 kb upstream sequences. This disclosed the presence of important phytohormone‐responsive cis‐elements in the OsWRKY genes, suggesting their direct involvement in defence against a wide range of external stressors. These 21 OsWRKY genes are tentatively listed as possible candidates for further study.
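The search for alterations in the conserved heptapeptide WRKYGQK can be sketched as a sliding-window comparison. This is only an illustration and not the authors' alignment-based procedure; the function name, the one-mismatch threshold, and the example protein fragment are all assumptions.

```python
MOTIF = "WRKYGQK"


def find_motif_variants(protein, motif=MOTIF, max_mismatches=1):
    """Return (position, window, mismatches) for windows close to the motif."""
    protein = protein.upper()
    k = len(motif)
    hits = []
    for i in range(len(protein) - k + 1):
        window = protein[i:i + k]
        # Hamming distance between the window and the canonical motif.
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical protein fragment with one canonical and one altered heptapeptide.
for hit in find_motif_variants("MSDWRKYGQKAAAWRKYGKKLL"):
    print(hit)
```

Windows reported with a nonzero mismatch count are candidate heptapeptide variants; raising `max_mismatches` widens the search at the cost of more spurious hits.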
- Research Article
- 10.1111/jeu.13038
- Jun 27, 2024
- The Journal of eukaryotic microbiology
- Eric Peyretaillade + 4 more
Since the advent of sequencing techniques and due to their continuous evolution, it has become easier and less expensive to obtain the complete genome sequence of any organism. Nevertheless, to elucidate all biological processes governing organism development, quality annotation is essential. In genome annotation, predicting gene structure is one of the most important and captivating challenges for computational biology. This aspect of annotation requires continual optimization, particularly for genomes as unusual as those of microsporidia. Indeed, this group of fungal-related parasites exhibits specific features (highly reduced gene sizes, sequences with high rate of evolution) linked to their evolution as intracellular parasites, requiring the implementation of specific annotation approaches to consider all these features. This review aimed to outline these characteristics and to assess the increasingly efficient approaches and tools that have enhanced the accuracy of gene prediction for microsporidia, both in terms of sensitivity and specificity. Subsequently, a final part will be dedicated to postgenomic approaches aimed at reinforcing the annotation data generated by prediction software. These approaches include the characterization of other understudied genes, such as those encoding regulatory noncoding RNAs or very small proteins, which also play crucial roles in the life cycle of these microorganisms.
- Research Article
- 10.1371/journal.pbio.3002546
- Mar 11, 2024
- PLOS Biology
- Cristina Sarasa-Buisan + 6 more
Bacteria have developed fine-tuned responses to cope with potential zinc limitation. The Zur protein is a key player in coordinating this response in most species. Comparative proteomics conducted on the cyanobacterium Anabaena highlighted the more abundant proteins in a zur mutant compared to the wild type. Experimental evidence showed that the exoprotein ZepA mediates zinc uptake. Genomic context of the zepA gene and protein structure prediction provided additional insights on the regulation and putative function of ZepA homologs. Phylogenetic analysis suggests that ZepA represents a primordial system for zinc acquisition that has been conserved for billions of years in a handful of species from distant bacterial lineages. Furthermore, these results show that Zur may have been one of the first regulators of the FUR family to evolve, consistent with the scarcity of zinc in the ecosystems of the Archean eon.
- Research Article
- 10.3389/fpls.2023.1285488
- Nov 3, 2023
- Frontiers in Plant Science
- Xiaohong Li + 7 more
Alfalfa is an excellent leguminous forage crop that is widely cultivated worldwide, but its yield and quality are often affected by drought and soil salinization. Hyperosmolality-gated calcium-permeable channel (OSCA) proteins are hyperosmotic calcium ion (Ca2+) receptors that play an essential role in regulating plant growth, development, and abiotic stress responses. However, no systematic analysis of the OSCA gene family has been conducted in alfalfa. In this study, a total of 14 OSCA genes were identified from the alfalfa genome and classified into three groups based on their sequence composition and phylogenetic relationships. Gene structure, conserved motifs and functional domain prediction showed that all MsOSCA genes had the same functional domain DUF221. Cis-acting element analysis showed that MsOSCA genes had many cis-regulatory elements in response to abiotic or biotic stresses and hormones. Tissue expression pattern analysis demonstrated that the MsOSCA genes had tissue-specific expression; for example, MsOSCA12 was only expressed in roots and leaves but not in stem and petiole tissues. Furthermore, RT-qPCR results indicated that the expression of MsOSCA genes was induced by abiotic stress (drought and salt) and hormones (JA, SA, and ABA). In particular, the expression levels of MsOSCA3, MsOSCA5, MsOSCA12 and MsOSCA13 were significantly increased under drought and salt stress, and MsOSCA7, MsOSCA10, MsOSCA12 and MsOSCA13 genes exhibited significant upregulation under plant hormone treatments, indicating that these genes play a positive role in drought, salt and hormone responses. Subcellular localization results showed that the MsOSCA3 protein was localized on the plasma membrane. This study provides a basis for understanding the biological information and further functional analysis of the MsOSCA gene family and provides candidate genes for stress resistance breeding in alfalfa.
- Research Article
- 10.1093/dnares/dsad017
- Jul 21, 2023
- DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes
- Takeaki Taniguchi + 9 more
The prediction of gene structure within the genome sequence is the starting point of genome analysis, and its accuracy has a significant impact on the quality of subsequent analyses. Gene structure prediction is roughly divided into RNA-Seq-based methods, ab initio-based methods, homology-based methods, and the integration of individual prediction methods. Integrated methods are mainstream in recent genome projects because they improve prediction accuracy by combining or taking the best individual prediction findings; however, adequate prediction accuracy for eukaryotic species has not yet been achieved. Therefore, we developed an integrated tool, GINGER, that solves various issues related to gene structure prediction in higher eukaryotes. By handling artefacts in alignments of RNA and protein sequences, reconstructing gene structures via dynamic programming with appropriately weighted and scored exon/intron/intergenic regions, and applying different prediction processes and filtering criteria to multi-exon and single-exon genes, we achieved a significant improvement in accuracy compared to the existing integration methods. The feature of GINGER is its high prediction accuracy at the gene and exon levels, which is pronounced for species with more complex gene architectures. GINGER is implemented using Nextflow, which allows for the efficient and effective use of computing resources.
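The idea of reconstructing a gene structure by dynamic programming over scored exon, intron and intergenic regions can be illustrated with a toy example. The sketch below is not GINGER's algorithm: it is a generic Viterbi-style labeling in which the per-position scores, the allowed transitions and the `switch_penalty` parameter are all assumptions made for illustration.

```python
LABELS = ("intergenic", "exon", "intron")
# Assumed transition rules: introns may only be entered from and left to exons.
ALLOWED = {
    "intergenic": ("intergenic", "exon"),
    "exon": ("exon", "intron", "intergenic"),
    "intron": ("intron", "exon"),
}


def best_labeling(scores, switch_penalty=1.0):
    """Return the highest-scoring label path.

    scores: list of dicts mapping each label to its score at that position.
    switch_penalty: cost subtracted whenever the label changes.
    """
    if not scores:
        return []
    best = {lab: scores[0][lab] for lab in LABELS}
    back = []
    for pos_scores in scores[1:]:
        new_best, pointers = {}, {}
        for lab in LABELS:
            # Best predecessor among labels that may transition into `lab`.
            value, prev = max(
                (best[p] - (switch_penalty if p != lab else 0.0), p)
                for p in LABELS
                if lab in ALLOWED[p]
            )
            new_best[lab] = value + pos_scores[lab]
            pointers[lab] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final label.
    lab = max(best, key=best.get)
    path = [lab]
    for pointers in reversed(back):
        lab = pointers[lab]
        path.append(lab)
    return path[::-1]


def one_hot(label, strength=2.0):
    """Hypothetical evidence: one label scores `strength`, the others 0."""
    return {lab: (strength if lab == label else 0.0) for lab in LABELS}


truth = ["intergenic"] * 2 + ["exon"] * 3 + ["intron"] * 2 + ["exon"] * 2 + ["intergenic"]
print(best_labeling([one_hot(lab) for lab in truth]))
```

In a real integrator the per-position scores would come from weighted RNA-Seq, protein-alignment and ab initio evidence, and the penalty and transition rules would be tuned per species.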
- Research Article
- 10.1111/ppl.13877
- Mar 1, 2023
- Physiologia Plantarum
- Sara Sangi + 5 more
Callose is a polymer deposited on the cell wall and is necessary for plant growth and development. Callose is synthesized by genes from the glucan synthase-like family (GSL) and dynamically responds to various types of stress. Callose can inhibit pathogenic infection, in the case of biotic stresses, and maintain cell turgor and stiffen the plant cell wall in abiotic stresses. Here, we report the identification of 23 GSL genes (GmGSL) in the soybean genome. We performed phylogenetic analyses, gene structure prediction, duplication patterns, and expression profiles on several RNA-Seq libraries. Our analyses show that WGD/Segmental duplication contributed to expanding this gene family in soybean. Next, we analyzed the callose responses in soybean under abiotic and biotic stresses. The data show that callose is induced by both osmotic stress and flagellin 22 (flg22) and is related to the activity of β-1,3-glucanases. By using RT-qPCR, we evaluated the expression of GSL genes during the treatment of soybean roots with mannitol and flg22. The GmGSL23 gene was upregulated in seedlings treated with osmotic stress or flg22, showing the essential role of this gene in the soybean defense response to pathogenic organisms and osmotic stress. Our results provide an important understanding of the role of callose deposition and regulation of GSL genes in response to osmotic stress and flg22 infection in soybean seedlings.
- Research Article
- 10.1016/j.fsi.2023.108642
- Feb 27, 2023
- Fish & Shellfish Immunology
- Yang Mao + 5 more
Comparative genomics studies on the stk gene family in vertebrates: From the bighead carp (Hypophthalmichthys nobilis) genome
- Research Article
- 10.7717/peerj.14844
- Feb 13, 2023
- PeerJ
- Laipeng Zhao + 5 more
Wild tomato germplasm is a valuable resource for improving tolerance to biotic and abiotic stresses in tomato breeding. HVA22 is widely present in eukaryotes and is involved in growth and development as well as in responses to stresses such as cold, salt, drought, and biotic stress. In the present study, we identified 45 HVA22 genes in three wild species of tomatoes. The phylogenetic relationships, gene localization to chromosomes, gene structure, gene collinearity, protein interactions, and cis-acting element prediction of all 45 HVA22 genes (14 in Solanum pennellii, 15 in S. pimpinellifolium, and 16 in S. lycopersicoides) were analyzed. The phylogenetic analysis showed that all the HVA22 proteins from the family Solanaceae were divided into three branches. The identified 45 HVA22 genes were grouped into four subfamilies, which displayed similar numbers of exons and expanded in a fragmentary replication manner. The distribution of HVA22 genes on the chromosomes of the three wild tomato species was also highly similar. RNA-seq and qRT-PCR revealed that HVA22 genes were expressed in different tissues and induced by drought, salt, and phytohormone treatments. These results might be useful for explaining the evolution, expression patterns, and functional divergence of HVA22 genes in Lycopersicon.