Abstract

Models of sequence evolution typically assume that all sequences are possible. However, restriction enzymes that cut DNA at specific recognition sites provide an example where carrying a recognition site can be lethal. Motivated by this observation, we studied the set of strings over a finite alphabet with taboos, that is, with prohibited substrings. We denote the taboo-set by $\mathbb{T}$ and call any allowed string a taboo-free string. We consider the so-called Hamming graph $\Gamma_n(\mathbb{T})$, whose vertices are taboo-free strings of length $n$ and whose edges connect two taboo-free strings if their Hamming distance equals one. Any (random) walk on this graph describes the evolution of a DNA sequence that avoids taboos. We describe the construction of the vertex set of $\Gamma_n(\mathbb{T})$. Then we state conditions under which $\Gamma_n(\mathbb{T})$ and its suffix subgraphs are connected. Moreover, we provide an algorithm that determines whether all these graphs are connected for an arbitrary $\mathbb{T}$. As an application of the algorithm, we show that about 87% of bacteria listed in REBASE have a taboo-set that induces connected taboo-free Hamming graphs, because they have fewer than four type II restriction enzymes. On the other hand, four properly chosen taboos are enough to disconnect one suffix subgraph, and consequently the connectivity of taboo-free Hamming graphs could change depending on the composition of restriction sites.

Highlights

  • In bacteria, restriction enzymes cleave foreign DNA to stop its propagation

  • At least 90% (139/153) of archaea in REBASE (2020b) induce connected taboo-free Hamming graphs, because they have fewer than four type II restriction enzymes

  • The connectivity of the taboo-free Hamming graphs induced by the restriction enzymes of the bacteria listed in REBASE could be quickly analysed with our tools


Summary

Introduction

Restriction enzymes cleave foreign DNA to stop its propagation. To do so, a double-stranded cut is induced by a so-called recognition site, a DNA sequence of length 4–8 base pairs (Alberts et al. 2004). Avoiding recognition sites is evolutionarily advantageous (Rocha et al. 2001), mainly for non-temperate bacteriophages affected by orthodox type II R–M systems (Rusinov et al. 2018a). In those instances the recognition site is, as we call it, a taboo for host and foreign DNA. As a first step towards models of sequence evolution with taboos, we studied the Hamming graph $\Gamma_n(\mathbb{T})$, whose vertices are the strings of length $n$ over a finite alphabet $\Sigma$ that do not contain any taboo of the set $\mathbb{T}$ as a substring. Given a taboo-set $\mathbb{T}$, if the Hamming graph $\Gamma_n^s(\mathbb{T})$ is connected for every taboo-free string $s$ and every integer $n$, then evolution can explore the space of taboo-free sequences by simple point mutations, no matter which DNA suffix fragments remain invariable, as long as the taboo-set $\mathbb{T}$ does not change in the course of evolution.
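
To make the definition concrete, the following minimal sketch builds the vertex set of the taboo-free Hamming graph by brute force and checks connectivity with a breadth-first search. It is an illustration only, not the algorithm of the paper: it enumerates all strings of length n, so it is only practical for small n, and it does not treat the suffix subgraphs. The function names, the default alphabet "ACGT", and the convention that an empty graph counts as connected are assumptions made for this example.

```python
from collections import deque
from itertools import product


def taboo_free_strings(n, taboos, alphabet="ACGT"):
    """Return all strings of length n that contain no taboo as a substring."""
    result = []
    for letters in product(alphabet, repeat=n):
        s = "".join(letters)
        if not any(t in s for t in taboos):
            result.append(s)
    return result


def hamming_neighbors(s, alphabet="ACGT"):
    """Yield every string at Hamming distance one from s (one point mutation)."""
    for i, c in enumerate(s):
        for a in alphabet:
            if a != c:
                yield s[:i] + a + s[i + 1:]


def is_connected(n, taboos, alphabet="ACGT"):
    """Check by breadth-first search whether the taboo-free Hamming graph is connected."""
    vertices = set(taboo_free_strings(n, taboos, alphabet))
    if not vertices:
        return True  # assumption: treat the empty graph as connected
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in hamming_neighbors(s, alphabet):
            # Only taboo-free neighbours are vertices of the graph.
            if t in vertices and t not in seen:
                seen.add(t)
                queue.append(t)
    return len(seen) == len(vertices)
```

For instance, over the binary alphabet "01" with taboos "01" and "10", the only taboo-free strings of length 2 are "00" and "11", which differ in two positions, so `is_connected(2, ["01", "10"], alphabet="01")` reports a disconnected graph. A single six-letter taboo such as "GAATTC" removes only one vertex from the strings of length 6 over "ACGT" and leaves the graph connected.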

Motivating examples and non-technical presentation of key results
Outline
Strings
Graph theory
Properties of taboo-sets
Prefixes and suffixes of a taboo-free string
Isomorphisms between taboo-free Hamming graphs
Connectivity of taboo-free Hamming graphs
Examples of plausible bacterial taboo-sets
A frequent case
Helicobacter pylori
An imaginary bacterium
Concluding remarks
Findings
Compliance with ethical standards
