Genetic Similarity Analysis Based on Positive and Negative Sequence Patterns of DNA

Yue Lu, Long Zhao, Xiangjun Dong, Zhao Li

doi:10.3390/sym12122090

Open Access

https://doi.org/10.3390/sym12122090

Abstract

Similarity analysis of DNA sequences can clarify the homology between sequences and help predict their structure and the relationships between them. Frequent patterns in biological sequences not only explain the genetic characteristics of an organism but also serve as markers for certain events in those sequences. However, most existing biological sequence similarity analysis methods target the entire sequential pattern and ignore missing gene fragments that may induce disease. Similarity analysis of sequences containing a missing gene item has not yet been addressed, so sequences with missing bases are ignored or not analyzed effectively. This paper therefore presents a new method for DNA sequence similarity analysis. Using this method, we first mined not only positive sequential patterns but also sequential patterns that were missing some of the base terms (collectively referred to as negative sequential patterns). We then used these frequent patterns for similarity analysis on a two-dimensional plane. Several experiments were conducted to verify the effectiveness of the algorithm. The experimental results demonstrated that the algorithm can obtain different results depending on which frequent sequential patterns are selected, and that accuracy and time efficiency were improved.

Highlights

In recent years, a large volume of biological sequence data has been generated
Because the DNA sequence corresponds to its time series one to one, the similarity of the DNA
We compared the frequent pattern mining results for the first exon of the β-protein gene of 10 different species, based on our proposed graphical representation

Summary

Introduction

When a new DNA sequence is obtained, similarity analysis is used to determine whether it is similar to a known sequence. If the two are homologous, this saves the time and effort of re-determining the function of the new sequence. Similarity analysis of biological sequences is by no means a straightforward mechanical comparison. Alignment-based and other classical methods are the most common. Two choices directly affect the similarity score: the substitution matrix and the gap penalty. The gap penalty compensates for the influence of insertions and deletions on sequence similarity, but no suitable theoretical model exists to describe the gap problem. Gap penalty values lack a sound theoretical basis and are subjective.
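To illustrate how strongly the choice of gap penalty can affect an alignment-based similarity score, here is a minimal sketch of a global alignment score in the style of Needleman-Wunsch with a linear gap penalty. It is not the method proposed in the paper. The function name, the match/mismatch values, and the example sequences are all assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-2, gap=-1):
    """Best global alignment score of a and b with a linear gap penalty.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            ))
        prev = curr
    return prev[-1]


query = "ACGTACGT"
one_deletion = "ACGACGT"       # hypothetical candidate missing one base
one_substitution = "ACGTTCGT"  # hypothetical candidate with one changed base

for gap in (-1, -4):
    print(
        gap,
        global_alignment_score(query, one_deletion, gap=gap),
        global_alignment_score(query, one_substitution, gap=gap),
    )
```

With a mild gap penalty the candidate with the deletion scores higher than the one with the substitution; with a harsher gap penalty the ranking reverses. The sequences are identical in both runs, so the change in ranking comes only from the subjectively chosen penalty.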

Journal: Symmetry
Publication Date: Dec 16, 2020
Citations: 1
License type: CC BY 4.0

Similar Papers

Mining Top-k Useful Negative Sequential Patterns via Learning
Xiangjun Dong ... Longbing Cao
IEEE Transactions on Neural Networks and Learning Systems | VOL. 30
10 Jan 2019

E-NSPFI: Efficient Mining Negative Sequential Pattern from Both Frequent and Infrequent Positive Sequential Patterns
Yongshun Gong ... Guohua Lv
International Journal of Pattern Recognition and Artificial Intelligence | VOL. 31
12 Jan 2017

SAPNSP: Select actionable positive and negative sequential patterns based on a contribution metric
Chuanlu Liu ... Yan Li
01 Aug 2015

Select actionable positive or negative sequential patterns
Xiangjun Dong ... Tiantian Xu
Journal of Intelligent & Fuzzy Systems | VOL. 29
21 Nov 2015
