Abstract

ConFind (conserved region finder) identifies regions of conservation in multiple sequence alignments that can serve as diagnostic targets. Designed to work with large numbers of closely related, highly variable sequences, ConFind robustly handles alignments containing partial sequences and ambiguous characters. A conserved region is defined by four parameters: the minimum region length, the maximum informational entropy (variability) per position, the number of exceptions allowed to the maximum entropy criterion, and the minimum number of sequences that must contain a non-ambiguous character at a position for that position to be considered for inclusion in a conserved region. In a comparison of the entropy calculated for an alignment of 95 influenza A hemagglutinin sequences with random deletions, ConFind reduced the average error by 98% relative to the 'Find Conserved Regions' option in BioEdit. ConFind requires Python 2.3, but Python 2.4 or an upgrade of the optparse module to Optik 1.5 is suggested. The program is known to run under Linux and DOS.
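
The four parameters described above can be illustrated with a minimal sketch. This is not ConFind's code or its command-line interface. It is written for modern Python 3 rather than the Python 2.3 the program targets, and every name in it (`column_entropy`, `find_conserved_regions`, `UNAMBIGUOUS`) is hypothetical. It assumes nucleotide sequences in which only A, C, G and T count as non-ambiguous, assumes that gaps and ambiguity codes are simply left out of the entropy calculation, and uses a greedy left-to-right scan, which may differ from how ConFind applies the exception count.

```python
import math
from collections import Counter

# Assumption: nucleotide alignment; anything outside ACGT (gaps, N, R, Y, ...)
# is treated as non-informative and excluded from the entropy calculation.
UNAMBIGUOUS = set("ACGT")


def column_entropy(column, min_seqs):
    """Shannon entropy (bits) of one alignment column.

    Only unambiguous characters are counted. Returns None when fewer than
    min_seqs sequences carry an unambiguous character at this position.
    """
    counts = Counter(c for c in column.upper() if c in UNAMBIGUOUS)
    n = sum(counts.values())
    if n == 0 or n < min_seqs:
        return None
    return sum((k / n) * math.log2(n / k) for k in counts.values())


def find_conserved_regions(alignment, min_length, max_entropy,
                           max_exceptions, min_seqs):
    """Return half-open (start, end) column ranges of conserved regions.

    A region starts and ends on a column whose entropy is <= max_entropy,
    contains at most max_exceptions columns above max_entropy, contains no
    column with too few unambiguous characters, and spans >= min_length.
    """
    ncols = len(alignment[0])
    entropies = [column_entropy("".join(seq[i] for seq in alignment), min_seqs)
                 for i in range(ncols)]
    regions = []
    start = 0
    while start < ncols:
        e = entropies[start]
        if e is None or e > max_entropy:
            start += 1
            continue
        end = start          # last conserved column seen so far (inclusive)
        exceptions = 0
        pos = start + 1
        while pos < ncols and entropies[pos] is not None:
            if entropies[pos] > max_entropy:
                if exceptions == max_exceptions:
                    break
                exceptions += 1
            else:
                end = pos
            pos += 1
        if end - start + 1 >= min_length:
            regions.append((start, end + 1))
        start = end + 1
    return regions


# Toy alignment (invented for illustration, not data from the paper).
alignment = [
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGNACGTAG",
    "ACG-ACGTAT",
]
print(find_conserved_regions(alignment, min_length=3, max_entropy=0.2,
                             max_exceptions=0, min_seqs=3))
```

In this toy alignment, the fourth column has only two unambiguous characters, so with `min_seqs=3` it cannot belong to any region, while lowering `min_seqs` to 2 and allowing one exception lets the scan bridge the variable fifth column.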
