Abstract

Genomics has become a ground-breaking field across the life sciences. Advances in genomics and the development of high-throughput techniques have recently enabled whole-genome characterization of a wide range of organisms. In the post-genomic era, new technologies have produced a flood of genomic sequences and supporting data needed to understand the genome-wide functional regulation of gene expression and to reconstruct metabolic pathways. However, this plethora of genomic data presents a significant challenge for storage, analysis, and data management. Analyzing data at this scale requires the development and application of novel bioinformatics tools that support unified functional annotation, structural search, and the comprehensive analysis and identification of new genes in a wide range of species with fully sequenced genomes. In addition, generating systematically and syntactically unambiguous nomenclature systems for genomic data across species is a crucial task. Such systems are necessary for the adequate handling of genetic information in the context of comparative functional genomics. In this paper, we provide an overview of major advances in bioinformatics and computational biology in genome sequencing and next-generation sequence data analysis. We focus on their potential applications for the efficient collection, storage, and analysis of genetic data and information from a wide range of gene banks. We also discuss the importance of establishing a unified nomenclature system through a functional and structural genomics approach.

Highlights

  • Information processing by bioinformatics tools and computational biology methods has become essential for solving complex biological problems in genomics, proteomics, and metabolomics

  • We provide an overview of major advances in bioinformatics and computational biology in genome sequencing and next-generation sequence data analysis

  • We focus on their potential applications for efficient collection, storage, and analysis of genetic data/information from a wide range of gene banks

Summary

INTRODUCTION

Information processing by bioinformatics tools and computational biology methods has become essential for solving complex biological problems in genomics, proteomics, and metabolomics. As the acquisition of genomic data becomes increasingly cost-efficient, genomic data sets are accumulating at an exponential rate and new types of genetic data are emerging. These bring the inherent challenge of requiring new methods of statistical analysis and modeling. New technologies are producing data at a rate that outpaces our ability to analyze its biological meaning. Researchers are addressing this challenge by adopting mathematical and statistical software, computer modeling, and other computational and engineering methods. The pyrosequencing method can sequence a microbial genome in one hour [7,8,9]. These improved technologies use random fragmentation of the nucleotide sequence of interest to increase throughput by sequencing millions of fragments simultaneously. Combining more than one platform is potentially more cost-effective and could yield higher fidelity and accuracy [16,17].
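The random fragmentation idea mentioned above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`shotgun_fragments`, `mean_coverage`, `per_base_depth`), the fixed read length, and the uniform sampling of start positions are all simplifying assumptions made for illustration, and real platforms produce reads of variable length with sequencing errors.

```python
import random


def shotgun_fragments(sequence, read_length, n_reads, seed=0):
    """Sample fixed-length fragments at uniformly random start positions.

    Hypothetical helper: a toy stand-in for random fragmentation,
    not a model of any specific sequencing platform.
    """
    if read_length > len(sequence):
        raise ValueError("read_length exceeds sequence length")
    rng = random.Random(seed)  # seeded so the example is reproducible
    max_start = len(sequence) - read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, max_start)  # randint is inclusive at both ends
        reads.append((start, sequence[start:start + read_length]))
    return reads


def mean_coverage(sequence_length, read_length, n_reads):
    """Expected average depth: total sequenced bases / sequence length."""
    return n_reads * read_length / sequence_length


def per_base_depth(sequence_length, reads):
    """Count how many sampled fragments overlap each position."""
    depth = [0] * sequence_length
    for start, fragment in reads:
        for i in range(start, start + len(fragment)):
            depth[i] += 1
    return depth


if __name__ == "__main__":
    genome = "ACGT" * 250  # toy 1,000-base sequence (assumption)
    reads = shotgun_fragments(genome, read_length=50, n_reads=200)
    depth = per_base_depth(len(genome), reads)
    print("mean coverage:", mean_coverage(len(genome), 50, 200))
    print("min depth:", min(depth), "max depth:", max(depth))
```

Because start positions are random, depth varies along the sequence even when the mean coverage is high, which is one reason many overlapping fragments are sequenced in parallel.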

Pre-Analysis and Processing of Sequencing Data
Genomic Annotation
Solving the Problem
Data Analysis Pathways and Tools
Genomics
DATABASES
Bioinformatics Tools to Retrieve Biological Data
Immunoinformatics Data
DATA INTEGRATION IN BIOINFORMATICS
Data Warehousing
The PIR-PSD
Prosite
InterPro
UniProt
ELIXIR
NOMENCLATURES AND NAMING SYSTEMS
Automated Functional Annotation
Manual Curation
Challenges Ahead
Findings
CONCLUSIONS