Abstract
Introduction: Essential genes are indispensable for the survival of a species. They form a family of genes linked to critical cellular activities and encode proteins that regulate central metabolism, gene translation, deoxyribonucleic acid replication, and fundamental cellular structure, and that facilitate intracellular and extracellular transport. Essential genes preserve crucial genomic information that may hold the key to a detailed understanding of life and evolution. Because of this relevance, the study of essential genes has long been regarded as a vital topic in computational biology. Like any gene, an essential gene is a sequence built from combinations of the nucleotides adenine, guanine, cytosine, and thymine.
Methods: This paper presents a novel method for extracting information on the stationary patterns of the nucleotides adenine, guanine, cytosine, and thymine in each gene. For this purpose, several co-occurrence matrices are derived that provide the statistical distribution of stationary nucleotide patterns in the genes, which helps establish the relationships between the nucleotides. To extract discriminant features, energy, entropy, homogeneity, contrast, and dissimilarity are computed from each co-occurrence matrix, and the features from all matrices are concatenated to form a feature vector representing each essential gene. Finally, supervised machine learning algorithms are applied to classify essential genes based on the extracted fixed-dimensional feature vectors.
Results: For comparison, existing state-of-the-art feature representation techniques, namely Shannon entropy (SE), Hurst exponent (HE), fractal dimension (FD), and their combinations, have been used.
Discussion: An extensive experiment on classifying the essential genes of five species demonstrates the robustness and effectiveness of the proposed methodology.
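The feature-extraction step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the co-occurrence matrices are built, so the nucleotide index order (A, G, C, T), the pair offsets (1, 2, 3), and the helper names `cooccurrence`, `texture_features`, and `gene_features` are all assumptions made for illustration. The five features follow the usual definitions for co-occurrence matrices.

```python
import math

# Assumption: index order A, G, C, T. The abstract does not fix an ordering,
# and homogeneity, contrast, and dissimilarity depend on this choice.
NUCLEOTIDES = "AGCT"
INDEX = {n: i for i, n in enumerate(NUCLEOTIDES)}


def cooccurrence(seq, offset=1):
    """Normalized 4x4 matrix of (seq[i], seq[i + offset]) pair frequencies."""
    counts = [[0.0] * 4 for _ in range(4)]
    total = 0
    for a, b in zip(seq, seq[offset:]):
        if a in INDEX and b in INDEX:  # skip ambiguous symbols such as N
            counts[INDEX[a]][INDEX[b]] += 1
            total += 1
    if total == 0:
        return counts  # all zeros when no valid pair exists
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Energy, entropy, homogeneity, contrast, dissimilarity of matrix p."""
    energy = entropy = homogeneity = contrast = dissimilarity = 0.0
    for i in range(4):
        for j in range(4):
            v = p[i][j]
            energy += v * v
            if v > 0:
                entropy -= v * math.log2(v)
            homogeneity += v / (1 + (i - j) ** 2)
            contrast += v * (i - j) ** 2
            dissimilarity += v * abs(i - j)
    return [energy, entropy, homogeneity, contrast, dissimilarity]


def gene_features(seq, offsets=(1, 2, 3)):
    """Concatenate the five features over one matrix per (assumed) offset."""
    seq = seq.upper()
    features = []
    for d in offsets:
        features.extend(texture_features(cooccurrence(seq, d)))
    return features
```

With three offsets, every gene maps to a 15-dimensional vector regardless of its length, which is the fixed-dimensional property that lets a standard supervised classifier be trained on genes of different sizes.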
Topics from this Paper
Essential Genes
Essential Gene
Co-occurrence Matrices
Feature Representation Techniques
Supervised Machine Learning Algorithms
Frontiers in Genetics