Abstract

Genomes such as that of Glycine max (soybean), which have remained highly conserved following increases in ploidy (by duplication or hybridization), present challenges for bioinformatics and genome analysis. Since 2002, the Soybean Genome Database (SoyGD) genome browser at http://soybeangenome.siu.edu has integrated and served the publicly available soybean physical map, BAC fingerprint database, and genomic data associated with the genetic map (1). Duplicated regions have been identified and catalogued with an a-d suffix on marker anchor names and contig names that communicate ploidy (ctg>8000 are tetraploid, ctg>9000 are octoploid). DNA sequence data have been used to separate DNA marker anchors from homologs of DNA marker anchors in BAC pools. About 200 gene families were mapped by EST hybridization. About 23,000 minimum tiling path (MTP) BIBAC clones provided BAC end sequences (BES) to decorate the physical map; these were added to the database as separate tracks. Predicted gene models were developed for about 15% of the BES. From these models, candidate genes underlying disease resistance, seed yield, and seed protein, oil or isoflavone content were detected and fine-mapped. Among recent additions, 1 Gbp of genome sequence was made available by DOE in about 1500 scaffolds. Methods for display were improved by cross-referencing the BES and WGS with Arabidopsis (3). In genome evolution analyses, more than a thousand additional microsatellite marker anchors were developed for contigs, 353 on the map and about 700 still in new microsatellite markers on the genetic map with contigs and associated features. About half of the markers mapped to regions of the genome that formed gaps in earlier maps, suggesting marker clustering biases. SoyGD represents the new build 5 of the physical map, with 800 contigs from the 76,749 publicly available fingerprinted clones. New QTL data have been incorporated from the newly released 'Essex' by 'Forrest' and 'Flyer' by 'Hartwig' RIL populations. Gene expression data have been added to the gene models represented in SoyGD. This work was supported by NSF project #9878635 and USB 2218-6218.
