A normalized cDNA library was constructed from the adductor muscle of M. yessoensis, and 4595 high-quality expressed sequence tags (ESTs) were obtained from it. Clustering and assembly of the ESTs yielded 3061 unigenes, comprising 654 contigs and 2407 singletons. Contig length ranged from 266 bp to 2364 bp, with an average of 544 bp. Blastx searches against the NCBI nonredundant protein database showed that 1522 unigenes had significant homology to known genes (E-value ≤ 10⁻⁵). Comparison with the Clusters of Orthologous Groups (COG) categories annotated 460 unigenes (E-value ≤ 10⁻¹⁰). Using the Kyoto Encyclopedia of Genes and Genomes (KEGG), 345 of the 3061 unigenes were assigned to 103 pathways (E-value ≤ 10⁻⁵). InterProScan searches annotated 1237 unigenes, which contained 727 different types of protein domains. Of these 1237 unigenes, 941 were assigned a Gene Ontology (GO) classification in at least one category (biological process, cellular component, or molecular function) using Uniprot2GO associations. Based on sequence comparison against the NCBI nonredundant protein database (Blastx) and KEGG, 66 unigenes were identified that may be involved in genetic information processing. The study provides a material basis and useful information for the genomic analysis of shellfish.
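
The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original analysis; the dictionary keys and the `percent` helper are hypothetical names chosen for illustration, and the only data used are the figures stated in the abstract.

```python
# Counts as reported in the abstract (no additional data assumed).
counts = {
    "ests": 4595,
    "unigenes": 3061,
    "contigs": 654,
    "singletons": 2407,
    "blastx_nr": 1522,
    "cog": 460,
    "kegg": 345,
    "interproscan": 1237,
    "go": 941,
    "genetic_info_processing": 66,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Contigs and singletons should add up to the unigene total.
unigene_total_consistent = (
    counts["contigs"] + counts["singletons"] == counts["unigenes"]
)

# Share of all unigenes annotated by each method (hypothetical summary).
annotation_share = {
    name: percent(counts[name], counts["unigenes"])
    for name in ("blastx_nr", "cog", "kegg", "interproscan", "go",
                 "genetic_info_processing")
}

# Share of InterProScan-annotated unigenes that also received GO terms.
go_share_of_interproscan = percent(counts["go"], counts["interproscan"])

if __name__ == "__main__":
    print("contigs + singletons == unigenes:", unigene_total_consistent)
    for name, share in annotation_share.items():
        print(f"{name}: {share}% of unigenes")
    print(f"GO among InterProScan-annotated: {go_share_of_interproscan}%")
```

Under these figures, roughly half of the unigenes have a significant Blastx hit, and about three quarters of the InterProScan-annotated unigenes also carry a GO classification.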