Abstract

We show that the linear encoding scheme (also called the one-hot scheme) efficiently simulates weighted finite automata (WFA). These automata carry a weight on every transition and model substitution errors in protein patterns. An automaton with t transitions can be advantageously hardwired with O(t) operators. The scheme solves pattern-matching problems by feeding a pipeline one character per clock cycle. Such automata are well suited to FPGA devices, especially within the R-disk prototype, a hardware architecture devoted to content-based searches in large non-indexed databanks: data is filtered on the fly at the output of the storage devices by distributed, reconfigurable processing elements. This speeds up the parsing of genomic databanks.
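
To make the one-hot idea concrete, the following is a minimal software sketch, not the hardware design described in the paper. It assumes a simple chain automaton for a fixed pattern, a min-plus weight model (weights add along a path, the minimum is kept per state), and a unit cost per substitution; the names `make_chain_wfa` and `scan`, the `threshold` parameter, and the cost model are all hypothetical illustrations. Each state owns one register, each transition is one add-and-compare operator (so t transitions give O(t) operators), and the loop body stands in for one clock cycle that consumes one character.

```python
import math

INF = math.inf  # weight of an inactive state register


def make_chain_wfa(pattern, mismatch_cost=1):
    """Build the transitions of a chain automaton (assumed layout).

    Transition i goes from state i to state i + 1 and costs 0 when the
    input character equals pattern[i], else mismatch_cost.
    """
    transitions = []
    for i, expected in enumerate(pattern):
        def weight(ch, expected=expected):
            return 0 if ch == expected else mismatch_cost
        transitions.append((i, i + 1, weight))
    return transitions


def scan(pattern, text, threshold):
    """Stream text through the automaton, one character per 'cycle'.

    Returns (end_position, weight) for every position where the final
    state holds a weight <= threshold.
    """
    transitions = make_chain_wfa(pattern)
    n_states = len(pattern) + 1
    state = [INF] * n_states          # one register per state
    hits = []
    for pos, ch in enumerate(text):
        state[0] = 0                  # re-arm the start state every cycle
        nxt = [INF] * n_states
        # In hardware all transitions would fire in parallel; here we
        # evaluate them one after another from the same old registers.
        for src, dst, weight in transitions:
            candidate = state[src] + weight(ch)
            if candidate < nxt[dst]:
                nxt[dst] = candidate
        state = nxt
        if state[-1] <= threshold:
            hits.append((pos, state[-1]))
    return hits
```

For example, `scan("ACG", "TACGACT", 1)` reports an exact occurrence ending at index 3 and a one-substitution occurrence ending at index 6. Because every register is updated only from the previous cycle's values, the update maps naturally onto a bank of flip-flops with one small operator per transition.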
