Sequence Alignment Algorithm Research Articles

This paper introduces a malware detection system for smartphones based on the dynamic behavior of suspicious applications. The main goal is to prevent malicious software from being installed on victim systems. The approach targets malware aimed at the Android platform. To that end, only the system calls made during the boot process of recently installed applications are studied, which reduces the amount of information to be considered, since only activity related to initialization is taken into account. The proposal defines a pattern recognition system with three processing layers: monitoring, analysis and decision-making. First, to extract the sequences of system calls, the potentially compromised applications are executed in a safe, isolated environment. The analysis step then generates the metrics required for decision-making; it combines sequence alignment algorithms with bagging, which allows the similarity between the extracted sequences to be scored according to their regions of greatest resemblance. At the decision-making stage, the Wilcoxon signed-rank test determines whether the new software is labeled as legitimate or malicious. The proposal was tested in experiments that include an in-depth study of a particular use case and an evaluation of its effectiveness on samples from well-known public datasets. The experimental results are promising, suggesting that the approach is a good complement to existing strategies in the literature.
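As a rough illustration of scoring two system-call sequences by their region of greatest resemblance, the sketch below computes a Smith-Waterman local alignment score. The abstract does not name the alignment algorithm or its scoring scheme, so the choice of Smith-Waterman, the `match`, `mismatch` and `gap` values, and the example traces are all assumptions; bagging and the Wilcoxon signed-rank test are not shown.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences (Smith-Waterman).

    The scoring parameters are illustrative assumptions, not values
    taken from the paper.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for x in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical boot-time system call traces (names chosen for illustration).
boot_trace = ["open", "read", "mmap", "ioctl", "write", "close"]
reference = ["fork", "open", "read", "mmap", "close"]
print(local_alignment_score(boot_trace, reference))
```

A higher score indicates a longer or cleaner shared run of calls; in a pipeline like the one described, such scores would be the metrics handed to the decision stage.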


The latest sequencing technologies, such as the Pacific Biosciences (PacBio) and Oxford Nanopore machines, can generate long reads thousands of nucleotide bases in length, much longer than the reads of a few hundred bases generated by Illumina machines. However, these long reads are prone to much higher error rates, for example 15%, which makes downstream analysis and applications very difficult. Error correction is a process to improve the quality of sequencing data. Hybrid correction strategies have recently been proposed that use Illumina reads, which have low error rates, to fix sequencing errors in the noisy long reads, with good performance. In this paper, we propose a new method named Bicolor, a bi-level framework of hybrid error correction for further improving the quality of PacBio long reads. At the first level, our method uses a de Bruijn graph-based error correction idea to search paths in pairs of solid k-mers iteratively, with an increasing k-mer length. At the second level, we combine the results obtained at the first level under different parameters. In particular, a multiple sequence alignment algorithm is used to align similar long reads, followed by a voting algorithm that determines the final base at each position of the reads. We compare the performance of Bicolor with three state-of-the-art methods on three real data sets. Results demonstrate that Bicolor always achieves the highest identity ratio. Bicolor also achieves a higher alignment ratio and a higher number of aligned reads than the current methods on two data sets. On the third data set, our method is closely competitive with the current methods in terms of number of aligned reads and genome coverage. The C++ source code of our algorithm is freely available at https://github.com/yuansliu/Bicolor.
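The voting step at the second level can be pictured with a minimal sketch: given reads that have already been aligned to equal length, take the most frequent symbol in each column. This is a plain majority vote written for illustration, not Bicolor's actual implementation; the function name `consensus_by_vote`, the `-` gap symbol and the example reads are assumptions.

```python
from collections import Counter


def consensus_by_vote(aligned_reads, gap="-"):
    """Column-wise majority vote over equal-length aligned reads.

    Columns where the gap symbol wins are dropped from the consensus.
    Ties are resolved in favor of the symbol seen first in the column.
    """
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)


# Hypothetical aligned versions of one long read.
reads = ["ACGT-A", "ACCTGA", "ACGT-A"]
print(consensus_by_vote(reads))  # ACGTA
```

Here the third column resolves the `G`/`C` disagreement in favor of `G`, and the fifth column is dropped because the gap outvotes the inserted `G`.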


Articles published on Sequence Alignment Algorithm

A New Numerical Method for DNA Sequence Analysis Based on 8-Dimensional Vector Representation

Implementation of Hybrid Alignment Algorithm for Protein Database Search on the SW26010 Many-Core Processor

Opti-SW: An improved gene sequence alignment algorithm

Dynamics based clustering of globin family members.

A Way to Improve the Key Recovery Accuracy Based on Dynamic Programming

BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures

Identification of Protein Homologs and Domain Boundaries by Iterative Sequence Alignment.

SOGA: space oriented genetic algorithm for multiple sequence alignment

A Survey of the State-of-the-Art Parallel Multiple Sequence Alignment Algorithms on Multicore Systems

Freiburg RNA tools: a central online resource for RNA-focused research and teaching.

A novel pattern recognition system for detecting Android malware by analyzing suspicious boot sequences

Improving the performance of Smith–Waterman sequence algorithm on GPU using shared memory for biological protein sequences

An accurate algorithm for multiple sequence alignment in MapReduce

JABAWS 2.2 distributed web services for Bioinformatics: protein disorder, conservation and RNA secondary structure.

Aligning the large-scale ontologies on schema-level for weaving Chinese linked open data

EGSA: a new enhanced gravitational search algorithm to resolve multiple sequence alignment problem

Analyzing Glycan-Binding Profiles Using Weighted Multiple Alignment of Trees.

Pro-malign: Multiple Sequence Alignment Algorithm using Approached Profile

Bi-level error correction for PacBio long reads.
