Abstract

Statistical methods that make similarity studies more efficient have become common in current research. Wavelet-based methods in particular speed up data processing, can handle genomic sequences thousands of nucleotides long, and consequently yield better estimates. This work analyzed genomes of the Coronaviridae and Paramyxoviridae families, looking for similarities through signal decomposition with the non-decimated discrete wavelet transform. The complete genome sequences were extracted from the NCBI website, and pattern analysis was then performed with a signal processing method based on GC content. Each GC sequence was decomposed with a six-level non-decimated discrete Daubechies wavelet transform. The Hurst exponent was then calculated at each decomposition level, and the sequences were checked for similar patterns. After the cluster analysis, the pair Bet1 and MERS proved similar by almost all methods. The pair Gamma1 and Del proved similar by the absolute moments and aggregated variance methods. The group composed of Influ1, Influ4, Influ3 and Hendra was similar by the absolute moments and aggregated variance methods; in the R/S analysis method, Influ5 replaced Influ4 in this group. We conclude that the non-decimated discrete wavelet transform allowed each sequence to be decomposed, so that a more detailed study of similarity can be carried out, and that some methods identified strains of Coronaviridae that do not resemble the strains of Paramyxoviridae.
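
The pipeline described above (GC signal, non-decimated wavelet decomposition, one Hurst estimate per level) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several choices here are assumptions: the GC content is encoded as a per-nucleotide 0/1 indicator, the Haar filter (the shortest member of the Daubechies family) stands in for the unspecified Daubechies filter, boundaries are treated as periodic, and only the aggregated variance estimator is shown. The function names `gc_signal`, `haar_swt` and `hurst_aggvar` are hypothetical.

```python
import numpy as np


def gc_signal(seq):
    """Map a nucleotide string to an indicator signal: 1.0 for G or C, else 0.0.

    Assumption: the paper's exact GC encoding is not given in the abstract.
    """
    return np.array([1.0 if base in "GC" else 0.0 for base in seq.upper()])


def haar_swt(x, levels):
    """Non-decimated (a trous) Haar transform with periodic boundaries.

    Returns (details, approx): one detail array per level plus the final
    approximation, each the same length as x. With this unnormalized
    filter pair, sum(details) + approx reconstructs x exactly.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for j in range(levels):
        # At level j the filter taps are 2**j samples apart (no downsampling).
        shifted = np.roll(approx, -(2 ** j))
        details.append((approx - shifted) / 2.0)
        approx = (approx + shifted) / 2.0
    return details, approx


def hurst_aggvar(x, block_sizes):
    """Aggregated variance estimate of the Hurst exponent H.

    The variance of block means scales as m**(2H - 2) with block size m,
    so H = 1 + slope / 2, where slope is fitted on a log-log scale.
    """
    x = np.asarray(x, dtype=float)
    log_m, log_v = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue
        means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        v = means.var()
        if v > 0:
            log_m.append(np.log(m))
            log_v.append(np.log(v))
    if len(log_m) < 2:
        raise ValueError("need at least two usable block sizes")
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    # Synthetic sequence for illustration only; real input would be a genome.
    rng = np.random.default_rng(0)
    seq = "".join(rng.choice(list("ACGT"), size=4096))
    details, _ = haar_swt(gc_signal(seq), levels=6)
    for level, d in enumerate(details, start=1):
        print(level, hurst_aggvar(d, [4, 8, 16, 32, 64]))
```

Each genome would then be summarized by its vector of per-level estimates, and those vectors could be passed to a clustering routine. The abstract does not say which clustering method or distance was used.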
