Application of Graph Entropy in CRISPR and Repeats Detection in DNA Sequences

Dipendra C Sengupta, Jharna D Sengupta

doi:10.4236/cmb.2016.63004

Open Access

https://doi.org/10.4236/cmb.2016.63004

Abstract

We analyzed DNA sequences using a new measure of entropy. The general aim was to find interesting sections of a genome using a new formulation of Shannon-like entropy. We developed this measure for any non-trivial graph or, more broadly, for any square matrix whose non-zero elements represent probabilistic weights assigned to connections or transitions between pairs of vertices. The new measure is called the graph entropy, and it quantifies the aggregate indeterminacy effected by the variety of unique walks that exist between each pair of vertices. The new tool is shown to be uniquely capable of revealing CRISPR regions in bacterial genomes and of identifying tandem repeats and direct repeats in a genome. We ran experiments on 26 species and found many tandem repeats and direct repeats (CRISPR for bacteria or archaea). Several separate CRISPR finders and tandem repeat finders already exist, but our entropy measure can find both features when they are present in a genome.

Highlights

Deciphering the enormously long nucleotide sequences that are being uncovered in the human genome is one of the major challenges of our time.
The new measure is called the graph entropy, and it quantifies the aggregate indeterminacy effected by the variety of unique walks that exist between each pair of vertices.
The new tool is shown to be uniquely capable of revealing Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) regions in bacterial genomes and of identifying tandem repeats and direct repeats in a genome.

Summary

Introduction

Deciphering the enormously long nucleotide sequences that are being uncovered in the human genome is one of the major challenges of our time. Along with serious ethical issues, we encounter a series of tremendously hard scientific problems. These problems mainly arise from the fact that sequencing techniques are almost completely automated, whereas the analysis of the sequenced data is not. In principle, biochemical methods are able to do this job, but since they are extremely expensive and time consuming, there is a high demand for alternative approaches to extract the information hidden in the genome [1]. In this situation, concepts and techniques from information theory have turned out to be welcome tools for extracting valuable information from biosequences such as DNA, RNA, or amino acid chains.
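As a rough illustration of how an information-theoretic measure can flag repetitive stretches of DNA, the sketch below computes a plain Shannon entropy over adjacent-nucleotide transitions in sliding windows. It is not the graph entropy introduced in this paper, whose definition is not reproduced here; the function names `transition_entropy` and `sliding_entropy` and the window parameters are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def transition_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the empirical distribution of
    adjacent-base pairs (transitions) observed in `window`.

    Illustrative stand-in only; this is NOT the paper's graph entropy.
    """
    pairs = Counter(zip(window, window[1:]))
    total = sum(pairs.values())
    if total == 0:
        # Windows shorter than two bases contain no transitions.
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in pairs.values())


def sliding_entropy(seq: str, width: int, step: int = 1):
    """Yield (start, entropy) for every window of `width` bases."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, transition_entropy(seq[start:start + width])


if __name__ == "__main__":
    # Hypothetical toy sequence: a mixed stretch followed by an AC tandem repeat.
    toy = "ACGTTGCAAG" + "AC" * 8
    for start, h in sliding_entropy(toy, width=10, step=4):
        print(start, round(h, 3))
```

Under these assumptions, windows that fall inside a tandem repeat use only a few distinct transitions and so score lower than windows over mixed sequence. Detecting spaced repeats such as CRISPR arrays would need a measure that is sensitive to longer-range structure, which this simple sketch does not attempt.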

Journal: Computational Molecular Bioscience
Publication Date: Jan 1, 2016
Citations: 16
License type: CC BY 4.0

Similar Papers

Tandem repeats in giant archaeal Borg elements undergo rapid evolution and create new intrinsically disordered regions in proteins.
Marie Charlotte Schoelmerich ... Harmit S Malik
PLOS Biology | VOL. 21
26 Jan 2023

Induction of Expression Instability of the nptII Gene in Transgenic Tobacco Plants
T V Novoselya ... E V Deineko
01 Jan 2002

Nucleotide sequence of cloned unintegrated avian sarcoma virus DNA: viral DNA contains direct and inverted repeats similar to those in transposable elements.
R Swanstrom ... W J Delorbe
Proceedings of the National Academy of Sciences | VOL. 78
01 Jan 1981

Editorial: Z-curve Applications in Genome Analysis.
Chun-Ting Zhang
Current genomics | VOL. 15
01 Apr 2014
