Public databases contain large datasets of plant expressed sequence tags (ESTs) that can be mined for microsatellite (simple sequence repeat, SSR) markers. Identifying and annotating these markers takes considerable time. Here, we describe the standalone EST microsatellite mining and analysis tool (SEMAT), an efficient, high-throughput pipeline for microsatellite mining and analysis. The pipeline bundles sequence trimming, assembly, microsatellite identification, primer selection, and BLAST annotation, for which it consecutively uses SeqClean, CAP3, MISA, Primer3, and BLAST. SEMAT is implemented as a set of Perl scripts and runs under Ubuntu and Fedora Linux. It offers an efficient, time-saving way to perform high-throughput EST-SSR analysis. It is freely available from http://semat.cpcribioinformatics.in/.
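To make the microsatellite identification step concrete, the following is a minimal Python sketch of perfect-SSR detection with regular expressions. It is not the MISA or SEMAT implementation (SEMAT itself is written in Perl); the function names `find_ssrs` and `is_primitive` and the repeat thresholds in `MIN_REPEATS` are illustrative assumptions, not settings reported for the pipeline.

```python
import re

# Illustrative thresholds (motif length -> minimum number of repeats).
# These are assumptions for the sketch, not SEMAT's or MISA's configuration.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, (m.end() - m.start()) // k))
                pos = m.end()
            else:
                # Non-primitive motif: step one base forward and rescan so
                # that an overlapping primitive repeat is not missed.
                pos = m.start() + 1
    return sorted(hits)


if __name__ == "__main__":
    est = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TTC"
    for start, motif, count in find_ssrs(est):
        print(f"{start}\t({motif}){count}")
```

Start positions are 0-based. The sketch reports only perfect repeats; compound or interrupted microsatellites, which dedicated tools can also report, are outside its scope.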