Evaluating the Performance of Multiple Sequence Alignment Programs with Application to Genotyping SARS-CoV-2 in the Saudi Population

Aminah Alqahtani,Meznah Almutairy

doi:10.3390/computation11110212

Aminah Alqahtani, Meznah Almutairy

https://doi.org/10.3390/computation11110212

Copy DOI

Export

Save

Cite

Journal: Computation	Publication Date: Nov 1, 2023
Citations: 2	License type: CC BY 4.0

Affiliation: Imam Mohammad ibn Saud Islamic University

Abstract
Full-Text
Similar Papers

Abstract

Listen

This study explores the accuracy and efficiency of multiple sequence alignment (MSA) programs, focusing on ClustalΩ, MAFFT, and MUSCLE in the context of genotyping SARS-CoV-2 for the Saudi population. Our results indicate that MAFFT outperforms the others, making it an ideal choice for large-scale genomic analyses. The comparative performance of MSAs assembled using MergeAlign demonstrates that MAFFT and MUSCLE consistently exhibit higher accuracy than ClustalΩ in both reference-based and consensus-based approaches. The evaluation of genotyping effectiveness reveals that the addition of a reference sequence, such as the SARS-CoV-2 Wuhan-Hu-1 isolate, does not significantly affect the alignment process, suggesting that using consensus sequences derived from individual MSA alignments may yield comparable genotyping outcomes. Investigating single-nucleotide polymorphisms (SNPs) and mutations highlights distinctive features of MSA programs. ClustalΩ and MAFFT show similar counts, while MUSCLE displays the highest SNP count. High-frequency SNP analysis identifies MAFFT as the most accurate MSA program, emphasizing its reliability. Comparisons between Saudi and global SARS-CoV-2 populations underscore regional genetic variations. Saudis exhibit consistently higher frequencies of high-frequency SNPs, attributed to genetic similarity within the population. Transmission dynamics analysis reveals a higher frequency of co-mutations in the Saudi dataset, suggesting shared evolutionary patterns. These findings emphasize the importance of considering regional diversity in genetic analyses.

Full Text