Abstract

SIMSApiper is a Nextflow pipeline that creates reliable, structure-informed MSAs of thousands of protein sequences faster than standard structure-based alignment methods. Structural information can be provided by the user or collected by the pipeline from online resources. Parallelization with sequence identity-based subsets can be activated to significantly speed up the alignment process. Finally, the number of gaps in the final alignment can be reduced by leveraging the position of conserved secondary structure elements. The pipeline is implemented using Nextflow, Python3, and Bash. It is publicly available on github.com/Bio2Byte/simsapiper.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call