Articles published on Genome Sequencing Technologies
900 Search results
- Research Article
- 10.1016/j.vetmic.2026.110989
- May 1, 2026
- Veterinary microbiology
- Kaiyuan Ye + 6 more
Molecular characterization and pathogenicity of a porcine Teschovirus 5 isolate in Shandong Province in China.
- Research Article
- 10.1038/s41597-026-07143-0
- Mar 31, 2026
- Scientific data
- Ke Jiang + 10 more
This study leveraged advanced genomic sequencing technologies, including Oxford Nanopore technology long reads, Pacific Biosciences HiFi sequencing reads, BGISEQ T7 short reads, and high-throughput chromatin conformation capture technology, to demonstrate the telomere-to-telomere genome assembly and annotation of the Zi goose, a distinctive breed in the cold regions of China. The assembled genome spans 1,305,488,053 bp with exceptional continuity (contig N50: 54,135,724 bp; scaffold N50: 81,055,700 bp). A total of 1,219,987,289 bp were anchored to 40 chromosomes (38 autosomes and a Z/W sex chromosome pair), with 13 chromosomes assembled gap-free, a BUSCO completeness score of 96.8%, and a protein completeness score of 91.7%. Genome annotation predicted a total of 18,359 protein-coding genes, among which 17,592 have been functionally characterized. In addition, 260,266,197 bp of sequences were identified as repetitive sequences, and 655,314 bp were identified as noncoding RNA sequences. This high-quality genome assembly provides valuable genetic resources and a theoretical basis for dissecting the important economic traits of the Zi goose and promoting the development of unique agriculture in cold regions.
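The continuity figures quoted above (contig N50, scaffold N50) follow the usual N50 definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that calculation is given here for orientation; the toy contig lengths are hypothetical and are not taken from the Zi goose assembly. The anchored fraction at the end uses only the two totals stated in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (not real data): total 150, half is 75.
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 40

# Share of the assembly anchored to chromosomes, from the abstract's totals.
anchored_fraction = 1_219_987_289 / 1_305_488_053
print(f"{anchored_fraction:.2%}")  # 93.45%
```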
- Research Article
- 10.3390/foods15061089
- Mar 20, 2026
- Foods (Basel, Switzerland)
- Qian Chen + 6 more
This study analyzed the whole-genome sequence of Lactobacillus (Lb.) sakei HRB10, isolated from traditional dry sausage, to investigate its genetic traits and metabolic processes. The genome of Lb. sakei HRB10 has a total length of 1,987,622 base pairs (bp), contains 1,906 genes, and has a guanine-cytosine (GC) content of 41.11%. Database annotations indicate that the primary pathways in the genome of Lb. sakei HRB10 are amino acid, fatty acid, and carbohydrate metabolism. These pathways are crucial in forming the distinct flavor of dry sausage. Many annotated genes encode enzymes associated with amino acid and carbohydrate metabolism, but only a limited number encode enzymes associated with fatty acid metabolism. Comparative genomic analysis showed that the lengths of the Lb. sakei genomes compared were in the range of 1.93-2.07 Mb, and the GC content was 41.05-41.22%. The phylogenetic tree and average nucleotide identity showed very high homology among Lb. sakei HRB10, MFPB19, and TMW-1.3. This study provides knowledge for understanding the mechanism of flavor formation by Lb. sakei HRB10 in dry sausages, thereby facilitating the identification of promising strains for application in meat fermentation.
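The GC figure reported above is a simple base-composition ratio. The following Python sketch shows one common way to compute it, assuming that ambiguous bases such as N are excluded from the denominator; that convention and the toy sequence are assumptions for illustration, not details given in the abstract.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical toy sequence: 4 of 6 bases are G or C.
print(round(gc_content("ATGCGC"), 2))  # 66.67

# Back-of-envelope check using only numbers from the abstract:
# 41.11% of 1,987,622 bp is roughly 817 thousand G/C bases.
approx_gc_bases = round(1_987_622 * 0.4111)
print(approx_gc_bases)  # 817111
```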
- Research Article
- 10.1038/s41597-026-06729-y
- Mar 19, 2026
- Scientific data
- Jisol Lee + 7 more
With the rapid advancements in genome sequencing technologies, microbial genome data have increased exponentially, making it essential to continuously update datasets for accurate microbial identification and classification. We present the development of Microbial Identification using rRNA Operon Region (MIrROR) release 02, an expanded dataset based on 1,690,470 genomes (1,674,514 bacterial and 15,956 archaeal) sourced from NCBI. The final curated dataset covers 476,579 sequences, 249,907 genomes, and 29,051 species, representing increases of 387.39%, 472.49%, and 206.28% over the previous release. Key updates include the addition of archaeal genomes and taxonomy reclassification based on GTDB R220. Extensive curation was performed, including filtering operon lengths (3,500-7,000 bp), removing duplicate sequences, eliminating sequences with ambiguous nucleotides, and clustering of sequences at 99% identity to remove redundancies. The updated dataset showed improved performance in microbial mock community analyses, supporting its accuracy and reliability. These improvements make MIrROR release 02 a valuable resource for microbial profiling and various microbiological research applications.
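The first three curation steps listed above (length filter, ambiguous-nucleotide removal, exact-duplicate removal) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MIrROR pipeline itself: the function name `curate` and the record format are hypothetical, and the 99% identity clustering step is deliberately left out because it requires a dedicated clustering tool.

```python
def curate(records, min_len=3500, max_len=7000):
    """Filter (name, sequence) records by length, ambiguity and duplication.

    The length bounds come from the abstract; everything else is illustrative.
    """
    seen = set()
    kept = []
    for name, seq in records:
        seq = seq.upper()
        if not (min_len <= len(seq) <= max_len):
            continue  # outside the stated operon length window
        if set(seq) - set("ACGT"):
            continue  # contains ambiguous nucleotides such as N
        if seq in seen:
            continue  # exact duplicate of a sequence already kept
        seen.add(seq)
        kept.append((name, seq))
    return kept


# Hypothetical toy records (not real operon sequences).
toy_records = [
    ("a", "ACGT" * 1000),  # 4,000 bp, kept
    ("b", "ACGT" * 1000),  # duplicate of "a"
    ("c", "ACGN" * 1000),  # ambiguous bases
    ("d", "ACGT" * 10),    # too short
    ("e", "A" * 8000),     # too long
]
print([name for name, _ in curate(toy_records)])  # ['a']
```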
- Research Article
- 10.1093/ismeco/ycag048
- Mar 6, 2026
- ISME Communications
- Maria Alvarez-Sanchez + 9 more
Marine viruses impact biogeochemical cycles through cell lysis, releasing organic matter and nutrients that fuel ocean productivity. Identifying and quantifying the specific viruses active in these processes remain a priority in the field. Here, we introduce a click-chemistry method to fluorescently label, sort, and sequence the genomes of newly produced viral particles (viral progeny) released from transcriptionally active host microbial cells, alongside the analysis of co-occurring inactive cells and pre-existing viruses in environmental samples. This approach, called viral bioorthogonal noncanonical amino acid tagging (BONCAT)-fluorescence-activated cell sorting (FACS), combines BONCAT with environmental sample incubation, followed by single-virus and single-cell sorting by flow cytometry (FACS). Genomic analysis of translationally active cells and new viral progeny in coastal seawater incubations confirmed BONCAT labeling and successful sorting of diverse marine bacteria, microeukaryotic cells, and virioplankton, with stark differences in the predicted turnover of specific groups of infecting viruses, including pelagiphages, methylophages, a Flavobacteriales-associated novel “Far-T4” clade, noncanonical DNA viruses of Naomiviridae using dU instead of dT, algae-infecting giant NCLDV viruses, and parasitic virophages. Sequenced BONCAT-active cells showed a strong enrichment in viral contigs relative to the inactive cell fraction, suggestive of a large proportion of translationally active virocells. This study illustrates the effectiveness of viral BONCAT-FACS for uncovering genome-resolved virus–host dynamics. By providing a direct approach for tracking active viral infections in natural environments, this method enhances our ability to investigate behavior and interactions of these nanoscale predators, expanding our understanding of their role in ecosystem dynamics.
- Research Article
- 10.16288/j.yczz.25-228
- Mar 1, 2026
- Yi chuan = Hereditas
- Wen-Xuan Zhou + 10 more
With the widespread adoption of genome sequencing technologies, predicting complex traits using genomic markers has become a key component in breeding programs. However, the high dimensionality and sparsity of genomic data, along with the complex nonlinear interactions among genetic markers, significantly increase the difficulty of accurate data analysis and the cost of hardware deployment. Therefore, this study proposes a chromosome-encoded multi-head self-attention model, named ChrFormer, for genomic prediction. The model employs a chromosome encoder to compress whole-genome SNP data into 20 chromosome-specific feature vectors and one global feature vector. It leverages the multi-head self-attention mechanism to dynamically capture long-range interactive effects across chromosomes, and a multilayer perceptron (MLP) precisely predicts phenotype from the refined genomic features. The study selected genotyping data from 50,000 SNPs of 4,875 Large White pigs, along with four key production traits, including backfat thickness at 100 kg and 115 kg, and age at 100 kg and 115 kg. A ten-fold cross-validation approach and the Pearson correlation coefficient were used to evaluate prediction accuracy. The predictive performance of ChrFormer was systematically compared with genomic best linear unbiased prediction (GBLUP), Bayesian method A (BayesA), and representative deep learning methods, including the visual geometry group (VGG) network and the feedforward neural network (FNN). Furthermore, the study analyzed the strengths and weaknesses of each deep learning model from multiple aspects, including the number of model parameters, training time, and the extent of overfitting. The results show that ChrFormer significantly outperforms the VGG and FNN deep learning models in predictive accuracy across all tested traits. 
For three of the traits (backfat thickness at 100 kg and 115 kg, and days to 115 kg), its prediction accuracy surpasses that of the traditional GBLUP and BayesA methods. Although ChrFormer requires a longer training time per iteration (54.88 s), its number of parameters is only about one-tenth of that of VGG and FNN, and it demonstrates more stable resistance to overfitting. These results demonstrate that the self-attention-based ChrFormer is a practical tool for genomic phenotype prediction in animal breeding, and its lightweight architecture and stable performance offer a readily deployable solution for breeding stations with limited computational resources.
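The evaluation protocol described above, ten-fold cross-validation scored by the Pearson correlation between predicted and observed phenotypes, can be sketched independently of any particular model. In the Python sketch below, `fit_line` is a hypothetical single-feature stand-in for a genomic predictor (it is not ChrFormer, GBLUP or BayesA), and the toy data are noiseless, so the score should come out very close to 1.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def cross_validated_accuracy(features, phenotypes, fit, k=10, seed=0):
    """Mean Pearson correlation over k held-out folds."""
    idx = list(range(len(phenotypes)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        predict = fit([features[i] for i in train_idx],
                      [phenotypes[i] for i in train_idx])
        preds = [predict(features[i]) for i in test_idx]
        scores.append(pearson(preds, [phenotypes[i] for i in test_idx]))
    return sum(scores) / len(scores)


def fit_line(xs, ys):
    """Hypothetical stand-in model: one-feature least-squares line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


# Hypothetical noiseless toy data: phenotype is an exact linear function.
toy_x = list(range(100))
toy_y = [2 * v + 1 for v in toy_x]
accuracy = cross_validated_accuracy(toy_x, toy_y, fit_line)
print(round(accuracy, 6))
```

Averaging the per-fold correlations is one common convention; pooling all held-out predictions before computing a single correlation is another, and the abstract does not say which was used.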
- Research Article
- 10.3390/plants15050748
- Feb 28, 2026
- Plants (Basel, Switzerland)
- Qin Zhao + 8 more
Carnation is one of the most popular ornamental flowers worldwide. Due to its high ornamental and economic value, breeding techniques have advanced rapidly, leading to the continuous emergence of new varieties. However, this has also resulted in issues such as synonymy and homonymy. Therefore, utilizing DNA fingerprinting for rapid and accurate variety identification can play a crucial role in germplasm identification and the resolution of intellectual property disputes. In this study, we performed reduced-representation genome sequencing on 50 carnation accessions to develop single nucleotide polymorphism (SNP) markers. After filtering, 82,584 high-quality SNPs were obtained. These SNPs were used to conduct principal component analysis, population structure analysis, and cluster analysis on the 50 carnation accessions. From these high-quality SNPs, 130 SNP loci were further selected and converted into Kompetitive Allele-Specific PCR (KASP) markers. Preliminary screening using 92 carnation accessions yielded 53 KASP markers, and a subsequent screening with 217 carnation accessions identified 45 core KASP markers. Using these core markers, a fingerprint database was successfully constructed for 309 carnation accessions, achieving a distinguishing power of 99.987%. This study employed SNP fingerprinting and genetic analysis for the screening and identification of carnations, broadening the genetic basis at the molecular level and supporting subsequent variety protection efforts, thereby providing a scientific basis for carnation selection and identification.
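The abstract does not define "distinguishing power". One plausible reading, assumed here, is the percentage of accession pairs whose marker fingerprints differ at one or more loci. The Python sketch below implements that reading; the function name and the toy genotypes are hypothetical. Under this assumption, 309 accessions give 47,586 pairs, and a value of 99.987% would correspond to about six unresolved pairs.

```python
from itertools import combinations
from math import comb


def distinguishing_power(fingerprints):
    """Percentage of accession pairs with non-identical fingerprints."""
    pairs = list(combinations(fingerprints.values(), 2))
    if not pairs:
        raise ValueError("need at least two accessions")
    distinct = sum(1 for a, b in pairs if a != b)
    return 100.0 * distinct / len(pairs)


# Hypothetical toy fingerprints over three markers (not real KASP calls).
toy = {
    "acc1": ("A:A", "C:T", "G:G"),
    "acc2": ("A:A", "C:T", "G:G"),  # identical to acc1
    "acc3": ("A:G", "C:C", "G:G"),
    "acc4": ("G:G", "T:T", "A:G"),
}
print(round(distinguishing_power(toy), 2))  # 83.33

# Consistency check against the abstract, under the assumed definition.
total_pairs = comb(309, 2)
print(total_pairs)  # 47586
print(round(100 * (total_pairs - 6) / total_pairs, 3))  # 99.987
```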
- Research Article
- 10.1111/coin.70206
- Feb 26, 2026
- Computational Intelligence
- Zhi‐Kang Bao + 3 more
Actinomycetes are a class of microbial resources with significant practical value, capable of producing secondary metabolites such as antibiotics, enzyme inhibitors, and amino acids. With the advancement of genome sequencing technologies, the amount of DNA sequence data for actinomycetes has increased exponentially. The classification of actinomycete DNA sequences aims to predict their taxonomic categories, thereby determining whether an actinomycete belongs to a new or known species, which is of great importance for assessing its potential applications in medicine, agriculture, and industry. In this study, a nucleotide‐based digital feature extraction method was first applied to obtain the structural and informational characteristics of actinomycete DNA sequences, providing a complete feature dataset for subsequent classification research. Then, a convolutional neural network (CNN) model suitable for the classification of actinomycete genomic DNA sequences was constructed. On this basis, two hybrid models were proposed: one combining the CNN with a long short‐term memory network (CNN‐LSTM) and the other combining the CNN with a bidirectional long short‐term memory network (CNN‐BiLSTM). These hybrid models were implemented through fully connected layers and a sigmoid classifier to perform DNA sequence classification prediction. Experimental results showed that the CNN model achieved a classification accuracy of 84.43% with a loss rate of 35.79%, the CNN‐LSTM model achieved an accuracy of 83.92% with a loss rate of 36.82%, and the CNN‐BiLSTM model achieved an accuracy of 86.25% with a loss rate of 30.81%. Further validation experiments demonstrated that the CNN model reached an accuracy of 84.56% and a loss rate of 35.68%, the CNN‐LSTM model achieved an accuracy of 84.13% and a loss rate of 36.16%, and the CNN‐BiLSTM model achieved a classification accuracy of 87.36% with a loss rate of 29.69%. These results indicate that the CNN‐BiLSTM model is more suitable for DNA classification prediction, effectively improving classification accuracy and enabling accurate classification of complete actinomycete genomic DNA sequences.
- Research Article
- 10.3389/fmicb.2026.1730485
- Feb 18, 2026
- Frontiers in Microbiology
- Xuxia Cui + 7 more
Background: Acinetobacter baumannii (A. baumannii) poses a serious threat to the global healthcare environment due to its widespread multidrug resistance. However, the long-term molecular epidemiological characteristics, drug resistance profiles and genomic characteristics of A. baumannii isolates in Guangzhou, China have not been fully elucidated. This study aims to systematically analyze these characteristics using A. baumannii strains from a local tertiary hospital. Methods: A total of 98 non-repeating clinical isolates of A. baumannii collected between 2013 and 2021 were analyzed. Whole genome sequencing (Illumina NovaSeq 6000 platform) was used for multi-locus sequence typing (MLST), resistance and virulence gene analysis (based on the CARD and VFDB databases), plasmid screening (with the PlasmidFinder tool), and pan-genomic analysis (via the Roary tool). Results: Among the 21 identified STs, ST2 was the dominant lineage, accounting for 66.3% (65/98) of all isolates, indicating the establishment of a predominant epidemic clone. Compared with non-ST2 strains, ST2 isolates exhibited a significantly higher rate of carbapenem resistance (95.38%) and carried a higher burden of resistance determinants, including blaOXA-23, blaADC-25, tet(B), and multiple aminoglycoside resistance genes. Notably, ST2 strains harbored a highly conserved and dominant repertoire of virulence factors, particularly those involved in iron acquisition and host adaptation, such as ompA, abaI, and the complete siderophore synthesis and uptake systems (basA–basJ, bauA–bauF, and entE). These features likely confer enhanced survival, persistence, and transmissibility in the hospital environment, supporting the classification of ST2 as a high-risk epidemic clone. Consistent with this, genomic clustering and temporal aggregation of ST2 isolates suggested sustained intrahospital transmission during the study period. Pangenome analysis revealed that A. baumannii possesses a large accessory genome (76.4%), reflecting substantial genomic plasticity that may facilitate rapid adaptation to antimicrobial and host-derived selective pressures. Discussion: As the first long-term genomic epidemiological study of A. baumannii in Guangzhou, our findings confirm that ST2 is the predominant multidrug-resistant and outbreak-prone lineage, driven by the convergence of resistance gene accumulation, virulence optimization, and genomic flexibility. These results underscore the urgent need to strengthen infection control measures and antimicrobial stewardship to curb the continued spread of this high-risk clone.
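The accessory-genome percentage reported above is derived from a gene presence/absence table. The Python sketch below shows how such a figure could be computed, assuming a gene family counts as core when it is present in at least 99% of isolates; that threshold, the function name and the toy table are assumptions for illustration, and the abstract does not state which cut-off the authors applied.

```python
def accessory_fraction(gene_presence, n_isolates, core_threshold=0.99):
    """Percentage of gene families that are not core.

    gene_presence maps a gene family name to the set of isolates carrying it.
    """
    if not gene_presence:
        raise ValueError("empty presence/absence table")
    core = sum(
        1 for isolates in gene_presence.values()
        if len(isolates) >= core_threshold * n_isolates
    )
    return 100.0 * (len(gene_presence) - core) / len(gene_presence)


# Hypothetical toy table for four isolates (not real Roary output).
toy_table = {
    "geneA": {"i1", "i2", "i3", "i4"},  # core
    "geneB": {"i1", "i2"},              # accessory
    "geneC": {"i3"},                    # accessory
    "geneD": {"i1", "i2", "i3", "i4"},  # core
}
print(accessory_fraction(toy_table, n_isolates=4))  # 50.0

# The ST2 share quoted in the abstract follows directly from its counts.
print(round(100 * 65 / 98, 1))  # 66.3
```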
- Research Article
- 10.1111/1440-1703.70051
- Feb 17, 2026
- Ecological Research
- Yu Sato + 3 more
Recent advances in whole genome sequencing (WGS) technology, particularly long‐read sequencing, have enabled the development of high‐precision reference genome assemblies for non‐model wild mammals and birds. The decreasing costs of WGS facilitate numerous global genomic initiatives, and genetic analysis based on genomic data is imperative for population genetics and conservation genetics. Genomic analysis provides precise insights into genetic diversity and inbreeding in endangered animals, but requires a high‐quality genome assembly. The absence of such assemblies can lead to a biased understanding of genetic diversity and misdirected conservation strategies. In this study, we developed high‐precision genome assemblies for two endangered Japanese animals, the Okinawa rail and the Japanese golden eagle, using a hybrid approach that combines short‐ and long‐read sequencing. This approach improved assembly contiguity, reduced missing data, and enhanced completeness. We also assessed how assembly quality influences genetic analysis by comparing results from population genetic analyses based on previous and newly established assemblies. The findings of this assessment indicated that genome‐wide heterozygosity and PSMC modeling were less sensitive to assembly quality. However, inbreeding analysis based on runs of homozygosity (ROH) was significantly affected by fragmentation of assembly. Consequently, high‐precision, contiguous assemblies are essential for accurate conservation genetic analyses, particularly for assessing inbreeding. In the absence of a high‐quality assembly, developing new ones is a viable alternative. Our hybrid approach combining Nanopore long‐read sequencing and short‐read sequencing enables the cost‐effective development of high‐quality genome assemblies, making it suitable for non‐model animals.
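The sensitivity of ROH-based inbreeding estimates to assembly fragmentation can be illustrated with a toy detector. The Python sketch below is a deliberately simplified assumption: it reports stretches of consecutive homozygous calls on one sequence that reach a minimum SNP count, whereas real ROH callers typically use sliding windows, physical length thresholds and tolerance for occasional heterozygous calls. Because a run cannot span two contigs, the same six homozygous sites that form one run on a contiguous scaffold yield no run when split across two short contigs.

```python
def runs_of_homozygosity(calls, min_snps=4):
    """Find runs of consecutive homozygous calls within a single sequence.

    calls: (sequence_name, position, is_homozygous) tuples, sorted by
    sequence and position. Returns (sequence_name, start, end, n_snps).
    """
    runs = []
    current = None  # [sequence_name, start, end, n_snps]
    for name, pos, hom in calls:
        if hom and current is not None and current[0] == name:
            current[2] = pos
            current[3] += 1
            continue
        # The run ends at a heterozygous call or at a sequence boundary.
        if current is not None and current[3] >= min_snps:
            runs.append(tuple(current))
        current = [name, pos, pos, 1] if hom else None
    if current is not None and current[3] >= min_snps:
        runs.append(tuple(current))
    return runs


# Hypothetical toy calls: six homozygous sites on one scaffold...
contiguous = [("chr1", p, True) for p in range(100, 700, 100)]
# ...and the same six sites split across two fragmented contigs.
fragmented = ([("ctgA", p, True) for p in (100, 200, 300)]
              + [("ctgB", p, True) for p in (100, 200, 300)])

print(runs_of_homozygosity(contiguous))  # [('chr1', 100, 600, 6)]
print(runs_of_homozygosity(fragmented))  # []
```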
- Research Article
- 10.48047/jocaaa.2026.35.02.29
- Feb 1, 2026
- Journal of Computational Analysis and Applications
- Lakshmi Vara Prasad Adusumilli
Pharmaceutical companies, faced with stiff competition from the rapid progression of genomic medicine and the adoption of next-generation sequencing (NGS) technology, are beginning to move from their customary mass manufacturing model to a patient-centric therapeutic model.
- Research Article
- 10.1097/jxx.0000000000001235
- Feb 1, 2026
- Journal of the American Association of Nurse Practitioners
- Laurie M Connors
Newborn screening (NBS) is one of the most significant public health achievements, traditionally using biochemical and physiologic tests to detect rare but treatable conditions. The emergence of genomic sequencing technologies, including whole-genome sequencing (WGS), now offers the ability to identify thousands of variants underlying pediatric disorders. These advances create new opportunities to transform NBS but also raise important clinical, ethical, and policy challenges. This article explores the implications of genomic NBS for advanced practice nursing, with a focus on lessons from the NIH-funded BabySeq Project and the policy precedent set by Florida's 2025 Sunshine Genetics Act, the first statewide, publicly funded WGS-NBS program in the United States. BabySeq demonstrated that genomic sequencing can identify actionable variants in approximately 9% of infants, yet it also exposed ethical tensions regarding disclosure of adult-onset findings and the contested concept of "family benefit." Florida's Sunshine Genetics Act expands NBS beyond the federal Recommended Uniform Screening Panel, aiming to reduce diagnostic odysseys and promote equity of access. However, unresolved issues persist, including informed consent, return of uncertain or adult-onset findings, and data stewardship. For nurse practitioners, the integration of genomic sequencing into NBS underscores new roles in parental education, consent counseling, care coordination, and long-term follow-up. Ensuring equity, cultural sensitivity, and alignment with professional guidelines will be essential to implementation. WGS in NBS has the potential to improve outcomes for children and families by enabling earlier diagnosis and intervention. Nurse practitioners, as frontline providers in primary care and pediatrics, are uniquely positioned to support families through genomic education, ethical decision making, and care navigation. 
Building workforce genomic competency and advocating for equitable policies are critical to realizing the promise of genomic NBS in clinical practice.
- Research Article
- 10.1016/j.onehlt.2026.101348
- Jan 31, 2026
- One Health
- Zhifang Zhang + 5 more
China's foodborne disease (FBD) surveillance system was implemented later than those in most developed countries. However, in the past 32 years, it has undergone improvements: from pilot projects to full coverage; from a passive mode to an active one; from localized development to the integration of international standards; and from a single function to a comprehensive system. During this process, China's health administrative departments have adjusted their regulatory departments and functions for FBDs in response to evolving epidemiological patterns of FBD. Simultaneously, they have used a phased, step-by-step approach to promote the use of whole genome sequencing (WGS) technology, according to the level of regional economic development, to facilitate FBD traceability. However, the system must be further improved in terms of traceability capabilities, multi-departmental collaboration, and data sharing mechanisms. At various historical stages, FBD outbreaks in China have shown distinct regional characteristics, and the spectrum of common pathogenic bacteria in China differs from those in the European Union (EU) and the United States. In China, diseases caused by microorganisms such as Vibrio parahaemolyticus, Salmonella, Staphylococcus aureus, Bacillus cereus, and Escherichia coli are dominant. Since 2011, the number of FBD outbreaks has increased each year, accompanied by a decline in the case fatality rate, reflecting improvements in foodborne detection technology in China. In the future, further integration of advanced technologies such as WGS will be necessary to enhance surveillance sensitivity, strengthen active and targeted surveillance of key populations, and establish a risk warning model tailored to China's dietary characteristics, thereby increasing the effectiveness of FBD prevention and control.
- Research Article
- 10.1158/1538-7445.fusionpositive26-pr003
- Jan 13, 2026
- Cancer Research
- Huibin Yang + 12 more
With the development of whole genome sequencing and CRISPR technology, it is now possible to identify sequence features that are unique to an individual’s cancer genome and to precisely target them. Such personalized precision therapy holds the promise of ushering in a new era of safe and effective cancer treatments with minimal side-effects. To specifically target the cancer genome, we have developed a CRISPR-based therapeutic approach, “KLIPP,” which is designed to target structural variant junctions (SVJs) specific to cancer genomes, with few or no off-target effects expected in normal cells. KLIPP uses a “split enzyme” approach consisting of a dead Cas9 endonuclease (dCas9) fused to the endonuclease Fok1, where two Fok1 endonucleases need to homodimerize to become active. To “nucleate” and activate these complexes, sgRNAs are designed to bind sequences flanking cancer-specific SVJs, bringing two Fok1-dCas9 complexes together to induce double-strand breaks (DSBs). While any SVJ in the cancer genome may be targeted with KLIPP, we have found that junctions in oncogenic fusion genes represent a particularly valuable target. We show effective targeting of the EWS::FLI1 fusion oncogene in Ewing sarcoma cells using lipid nanoparticle delivery of the Fok1-dCas9 mRNA and SVJ-targeting sgRNA, leading to the induction of DSBs, diminished expression of the fusion oncogene and loss of cell survival. Other fusion oncogenes that we are currently targeting with KLIPP include TMPRSS2::ERG in prostate cancer, DNAJB1::PRKACA in fibrolamellar liver cancer and BCR::ABL1 in leukemia. We believe that KLIPP is a safe and cancer-specific approach for precision targeting of fusion oncogenes in cancer cells. This paradigm-shifting personalized therapy could revolutionize how we treat fusion-driven cancers without inflicting long-term side effects. Citation Format: Huibin Yang, Radhika Suhas. Hulbatte, Natalie Gratsch, Ann Urzynicok, Ashley Sutter, Meyer Cusnir, Mario Ashaka, Ishwarya Venkata. Narayanan, Michelle Paulsen, Anna Schwendeman, Tom E. Wilson, Erika Newman, Mats Ljungman. KLIPP: Targeting fusion oncogenes with CRISPR [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Fusion-Positive Cancer: From Discovery to Therapy; 2026 Jan 13-15; Philadelphia PA. Philadelphia (PA): AACR; Cancer Res 2026;86(1_Suppl):Abstract nr A008.
- Research Article
- 10.1038/s41598-025-28283-0
- Jan 12, 2026
- Scientific Reports
- Mark E Wadsworth + 5 more
Comprehensive genomic analysis is essential for advancing our understanding of human genetics and disease. However, short-read sequencing technologies are inherently limited in their ability to resolve highly repetitive, structurally complex, and low-mappability genomic regions, previously coined as “dark” regions. Long-read sequencing technologies, such as PacBio and Oxford Nanopore Technologies (ONT), offer improved resolution of these regions, yet they are not perfect. With the advent of the new Telomere-to-Telomere (T2T) CHM13 reference genome, exploring its effect on dark regions is prudent. In this study, we systematically analyze dark regions across four human genome references—HG19, HG38 (with and without alternate contigs), and CHM13—using both short- and long-read sequencing data. We found that dark regions increase as the reference becomes more complete, especially dark-by-MAPQ regions, but that long-read sequencing significantly reduces the number of dark regions in the genome, particularly within gene bodies. However, we identify potential alignment challenges in long-read data, such as centromeric regions. These findings highlight the importance of both reference genome selection and sequencing technology choice in achieving a truly comprehensive genomic analysis.
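The distinction between positions that are dark because few reads align and positions that are dark because the aligned reads have low mapping quality (dark-by-MAPQ) can be sketched as a per-position classifier. The Python sketch below is illustrative only: the thresholds (depth of 5 or fewer reads, 90% or more of reads with MAPQ below 10) and the label names are assumptions, since the abstract does not give the cut-offs used.

```python
def classify_position(mapqs, max_dark_depth=5, low_mapq=10, low_fraction=0.9):
    """Label one genomic position from the MAPQ values of its aligned reads.

    All thresholds are assumed defaults for illustration.
    """
    depth = len(mapqs)
    if depth <= max_dark_depth:
        return "dark-by-depth"
    low = sum(1 for q in mapqs if q < low_mapq)
    if low / depth >= low_fraction:
        return "dark-by-MAPQ"
    return "callable"


# Hypothetical toy pileups (not real alignment data).
print(classify_position([60, 60, 60]))     # dark-by-depth
print(classify_position([0] * 19 + [60]))  # dark-by-MAPQ
print(classify_position([60] * 30))        # callable
```

A more complete reference can increase the dark-by-MAPQ count under this kind of rule because newly resolved repeats give reads more equally good places to align, which lowers their mapping quality.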
- Research Article
- 10.1038/s41564-025-02214-1
- Jan 7, 2026
- Nature microbiology
- Bernhard O Palsson + 2 more
Although genome sequencing technologies have advanced rapidly, microbial genomes still contain numerous genes with unknown functions, posing ongoing challenges for comprehensive genome annotation. Traditional annotation methods are constrained by a lack of scalable experimental techniques and the limitations of conventional homology-based computational approaches. Recent computational innovations, particularly deep learning, have substantially improved gene function prediction, facilitating more efficient annotation of transcription factors, enzymes and other protein classes. Integrating computational and experimental approaches has enabled the development of workflows that systematize gene function discovery, paving the way for faster, more accurate and comprehensive genome annotation. Continued refinement of these integrated methods holds great promise for deepening our understanding of microorganisms. Here we review recent advances in artificial intelligence for gene function discovery and discuss future directions for achieving interpretable and high-throughput artificial intelligence-guided annotation.
- Research Article
- 10.1016/j.critrevonc.2025.105043
- Jan 1, 2026
- Critical reviews in oncology/hematology
- Yuhao Zhao + 5 more
MSH2 in colorectal cancer: A comprehensive review of molecular mechanisms, clinical prognosis, and a precision oncology framework.
- Research Article
- 10.1007/978-1-0716-4972-5_16
- Jan 1, 2026
- Methods in molecular biology (Clifton, N.J.)
- Atilio O Rausch + 8 more
GeneDiscoveR, a novel R package, facilitates gene discovery in plant traits via comparative genomics. Despite the advancements in plant genome sequencing technologies, gene discovery in model and even more in non-model plants remains challenging. To address this gap, we introduce GeneDiscoveR, which enables the identification of orthogroups linked to specific plant traits or treatment responses. Leveraging extensive genomic data from diverse plant lineages, for instance, liverworts, we showcase its efficacy in identifying trait-specific genes. OrthoFinder defines orthologs, while GeneDiscoveR statistically detects trait-associated orthogroups. Here, we applied GeneDiscoveR to liverwort genomes to find enriched orthogroups in species with oil bodies within specialized cells or with many oil bodies in all cells. Additionally, we used it to identify OGs related to self-incompatibility from Brassicaceae genomes. This bioinformatics pipeline offers insights into plant trait genetics, aiding future gene discovery endeavors.
- Research Article
- 10.5376/gab.2026.17.0001
- Jan 1, 2026
- Genomics and Applied Biology
- Haodan Zeng + 4 more
Gene chip integrates multiple oligonucleotide sequences (probes) onto a solid-phase carrier or in a solution. Through the hybridization of probes with sample DNA and subsequent signal detection or sequence analysis, gene expression levels or genotypes can be detected. Single nucleotide polymorphisms (SNPs) are widely distributed across the genome and easily detectable, making them commonly used molecular markers for genotype detection and the development of gene chips. The development of SNP-based gene chips has gone through two stages: solid-phase and liquid-phase. Particularly since the application of high-throughput genome sequencing technology, a large number of SNPs have been identified in various crops, leading to the development of different SNP chips. These chips are widely used in variety identification, kinship analysis, genome-wide association analysis, genomic selection analysis, and other areas to assist breeding. This review introduces the detection principles related to gene chips, summarizes the SNP chips developed for different crops, and outlines the current application status of SNP chips, their existing defects and limitations, as well as future development trends. The aim is to provide a solid theoretical foundation for the optimization and innovation of gene chips in the future, promoting the continuous progress and refinement of related technologies.
- Research Article
- 10.1007/s12539-025-00792-6
- Dec 12, 2025
- Interdisciplinary sciences, computational life sciences
- Syed Abuthakir Mohamed Husain + 5 more
Burkholderia pseudomallei (BP) infections claim tens of thousands of lives worldwide every year. The bacterium's distinctive characteristics include antibiotic resistance, virulence and the ability to survive in stressful environments. Sequencing and annotation of the B. pseudomallei genome reveal that about 25% of the genes encode hypothetical proteins (HPs). As such, characterising the HPs could shed light on the mechanisms that contribute to the above characteristics. Over the last decade, genome sequencing and annotation technologies have advanced drastically. Furthermore, artificial intelligence programs such as AlphaFold2 (AF2) and RoseTTAFold2 (RF2), which can predict 3D protein structures with high accuracy, are also available. Taking advantage of the available tools, this study aimed to re-annotate HPs that are encoded within the BP genome. To achieve this, we retrieved 1869 HPs from the Burkholderia Genome Database, then cross-referenced them with UniProt. After filtering, 419 remained hypothetical. These were analysed using BLASTp for sequence homologs and antibiotic resistance proteins, followed by 3D structure prediction using AF2 and RF2, and structural homolog search using Foldseek. This study successfully annotated 209 HPs, with only 210 proteins (3.7% of BP coding sequences) still classified as 'hypothetical'. The functions of the predicted HPs were further analysed using structure comparison and active site analysis. The annotated protein list includes fifteen antibiotic resistance proteins, five haem oxygenase-like fold proteins involved in biofilm formation, host pathogenesis, and antibacterial activity, along with five essential proteins. These proteins represent promising drug targets for developing new antibiotics against melioidosis. Nonetheless, experimental validation will be necessary to characterize the predicted protein functions.