Abstract

AppleScript is a Macintosh scripting language. This note describes AppleScripts that automatically mail sequences to a BLAST server. The existence of efficient methods for DNA sequence generation has led to the need for substantial sequence analysis, even for those labs not directly involved in genome-scale sequencing. This analysis can be effectively accomplished by delivering sequences via email to powerful server computers. Software tools for automated email delivery have existed for the most part only on platforms using varieties of the Unix operating system. Unfortunately, many laboratories use Macintosh computers exclusively. However, Macintosh computers now come packaged with AppleScript, a simple scripting language that can direct the operations of 'scriptable' programs. I have written free software for Macintosh computers in AppleScript that automates the delivery of DNA or protein sequences to the BLAST (Altschul et al., 1990) server at ncbi.nlm.nih.gov. These AppleScripts make use of the 'scriptability' of the commercial email program Eudora (Qualcomm Inc., eudora-sales@qualcomm.com). The sets of scripts are entitled 'Seq-EudoraBlast' and 'Automatic-BLAST', and are available through the Internet at ftp://fly.bio.indiana.edu/molbio/mac and at the fly.bio.indiana.edu mirror sites. Although there are three types of scripts (the CLIPBOARD scripts and the 'droplet' scripts, contained in 'SeqEudora-Blast', and the 'Automatic-BLAST' script), they all operate similarly: they receive a sequence or sequences, strip out descriptive text, launch Eudora, read essential parameters from an accompanying 'BLAST parameters' text file, then compose a properly addressed letter with the parameters and a sequence, and send the letter or letters to the BLAST server. When using the CLIPBOARD or 'droplet' scripts, the user stipulates which of the five BLAST programs (blastn, blastp, blastx, tblastn, tblastx) should be used by selecting a particular script.
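The shared workflow described above (strip descriptive text, read parameters, compose an addressed message) can be illustrated with a minimal sketch. It is written in Python rather than AppleScript and is not the author's code. The mailbox name in `SERVER_ADDRESS`, the `KEY value` layout of the parameters file, the `BEGIN` marker, the `>query` header, and all function names are assumptions made for illustration; the source states only that the server is at ncbi.nlm.nih.gov and that the letter carries the parameters and a sequence. The sketch builds the message and does not send it.

```python
from email.message import EmailMessage

# Assumed mailbox name; the source gives only the host ncbi.nlm.nih.gov.
SERVER_ADDRESS = "blast@ncbi.nlm.nih.gov"
# The five BLAST programs named in the text.
PROGRAMS = {"blastn", "blastp", "blastx", "tblastn", "tblastx"}


def strip_description(text):
    """Drop '>' header lines and keep only the sequence letters."""
    lines = [ln.strip() for ln in text.splitlines()]
    body = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    # Discard digits, spaces, and punctuation left over from file formatting.
    return "".join(ch for ch in body if ch.isalpha())


def parse_parameters(text):
    """Read 'KEY value' lines from a parameters file (assumed layout)."""
    params = {}
    for ln in text.splitlines():
        ln = ln.strip()
        if not ln or ln.startswith("#"):
            continue
        key, _, value = ln.partition(" ")
        params[key.upper()] = value.strip()
    return params


def compose_request(program, params_text, sequence_text, sender):
    """Build (but do not send) one request message for one sequence."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown BLAST program: {program}")
    params = parse_parameters(params_text)
    params["PROGRAM"] = program
    seq = strip_description(sequence_text)
    lines = [f"{key} {value}" for key, value in params.items()]
    lines.append("BEGIN")   # assumed marker separating parameters from sequence
    lines.append(">query")  # hypothetical placeholder header
    lines.extend(seq[i:i + 60] for i in range(0, len(seq), 60))
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg["Subject"] = f"{program} request"
    msg.set_content("\n".join(lines) + "\n")
    return msg
```

Calling `compose_request("blastn", "DATALIB nr\n", ">my clone\nACGT ACGT\n", "me@example.org")` returns a message object that a mail client or `smtplib` could then deliver; in the scripts described here, that delivery step is handled by Eudora.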
One set of scripts (CLIPBOARD) collects the contents of the Clipboard, and can deliver only one sequence at a time. One simply double-clicks on the desired script to activate it. The 'droplet' scripts take drag-and-dropped text files, and thus can deliver a large number of sequences at one time. The Automatic-BLAST script is fully automatic. At a daily or weekly time set by the researcher, the script searches the contents of specified folders (e.g. 'daily blastn', 'weekly tblastx'). If the script finds text files, it reads their contents and sends those sequences to the server. Note that the Automatic-BLAST script must therefore be kept running so that it can periodically check for the specified time or day. The Automatic-BLAST and 'droplet' scripts also record their submissions cumulatively in a log file, which could be useful given the large number of sequences that can be delivered by these scripts. The 'droplet' and Automatic-BLAST scripts can parse DNA sequences in plain, DNA Strider ASCII, Pearson/Fasta, Genbank/GB, EMBL or Zuker format. The result from a BLAST analysis can be sizeable, and large results are split into smaller messages for delivery to the researcher. Both sets of scripts contain 'BLASTextract', a script to extract BLAST messages from a Eudora 'mailbox' to text files, concatenating smaller messages to form a single file when necessary. There are some known deficiencies in these scripts. First, they cannot accommodate every known sequence file format. Second, the scripts cannot, as yet, accommodate multiple sequences in one file. Third, the scripts appear to run out of memory if a given sequence exceeds 4000 characters in length. This may be remedied by increasing the script's 'Preferred size' using the script's 'Get Info' box. Finally, these scripts require recent versions of AppleScript's Scripting Additions; please read the accompanying README files for the specific details.
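The concatenation step attributed to 'BLASTextract' above can be sketched as follows, again in Python and not as the author's implementation. The source does not say how split result messages are labelled, so the `(part N of M)` subject suffix matched by `PART_RE` is a hypothetical convention, as are the function name and the input shape (a list of subject and body pairs already extracted from the mailbox).

```python
import re
from collections import defaultdict

# Hypothetical subject convention for split results: "<title> (part N of M)".
PART_RE = re.compile(r"^(?P<title>.*?)\s*\(part (?P<n>\d+) of (?P<total>\d+)\)\s*$")


def reassemble(messages):
    """Join split result messages into one text per result.

    `messages` is an iterable of (subject, body) pairs in any order.
    Results with missing parts are left out rather than written incomplete.
    """
    parts_by_title = defaultdict(dict)
    totals = {}
    for subject, body in messages:
        match = PART_RE.match(subject)
        if match:
            title = match["title"]
            parts_by_title[title][int(match["n"])] = body
            totals[title] = int(match["total"])
        else:
            # An unsplit result is treated as a single-part result.
            parts_by_title[subject][1] = body
            totals[subject] = 1

    results = {}
    for title, parts in parts_by_title.items():
        expected = set(range(1, totals[title] + 1))
        if set(parts) != expected:
            continue  # at least one part has not arrived yet
        results[title] = "".join(parts[n] for n in sorted(parts))
    return results
```

Each value in the returned dictionary could then be written to its own text file, which corresponds to the single concatenated file the text describes.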
I note that a multitude of analytical email servers, performing diverse functions, are now in service for the community of molecular biologists, apart from the sequence comparison servers at ncbi.nlm.nih.gov (for a listing, see http://expasy.hcuge.ch/info/serv_ema.txt). A researcher with a beginner's knowledge of AppleScript could easily modify any of these scripts and their accompanying parameters files to automate the delivery of sequences to any of these other servers.
