Abstract

This paper develops PeSeeD, a new metaheuristic algorithm for finding optimal spaced seeds. Sequence matching is a hot topic in bioinformatics and is used in many applications, such as understanding the functional, structural, or evolutionary relationships between sequences. The most relevant sequence matching methods are based on seeds designed to match two biological sequences. Seeds were first introduced in the Blastn tool, which builds seeds of length 11. However, not every local alignment has to include an identical fragment of length 11. The spaced seeds approach is one of the methods that does not require consecutive matching positions. Dynamic programming is used to solve this kind of problem, and it takes quadratic time. Several approaches have since been proposed to improve search sensitivity within a reasonable runtime. To reduce the complexity of such approaches, heuristic-based approaches can also be considered. The aim is to find a subset of spaced seeds that improves sensitivity without increasing the computation time. In this paper, the optimal subset of spaced seeds is explored using a bio-inspired approach, the penguins search optimisation algorithm (PeSOA for short). The authors further propose an efficient heuristic for computing the overlap complexity between seeds. To evaluate the efficiency of the proposed approach, they compare the obtained results with those of several seed-based software tools. The obtained results are very promising in terms of sensitivity and of computation time for the overlap complexity.
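
To make the difference between a contiguous seed and a spaced seed concrete, the following is a minimal sketch, not the PeSeeD method itself. It assumes the common notation in which a seed is a string over '1' (position must match) and '0' (position is ignored); the function name seed_hits and the toy sequences are hypothetical and chosen only for illustration.

```python
def seed_hits(seed: str, a: str, b: str) -> list[int]:
    """Return the start offsets at which `seed` hits the ungapped
    alignment of two equal-length sequences `a` and `b`.

    Assumed notation: '1' = position must match, '0' = don't care.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    # Offsets inside the seed window that are required to match.
    required = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    return [
        start
        for start in range(len(a) - span + 1)
        if all(a[start + i] == b[start + i] for i in required)
    ]


# Toy sequences (hypothetical): mismatches at positions 2 and 7.
a = "ACGTACGT"
b = "ACTTACGA"

print(seed_hits("111", a, b))   # contiguous seed of weight 3
print(seed_hits("1101", a, b))  # spaced seed of weight 3, span 4
```

On this toy pair both seeds have the same weight, yet the spaced seed also hits at offset 0, where the mismatch at position 2 falls on its don't-care position and blocks every contiguous window covering it.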
